SiliconANGLE theCUBE
What data practitioners want for data sharing across clouds
Clip Duration 00:30 / January 13, 2023
Veronika Durgin, Saks | The Future of Cloud & Data
Video Duration: 18:23

(upbeat music) Welcome back to Supercloud 2, an open collaborative where we explore the future of cloud and data. Now, you might recall last August at the inaugural Supercloud event we validated the technical feasibility and tried to further define the essential technical characteristics, and of course the deployment models of so-called supercloud. That is, sets of services that leverage the underlying primitives of hyperscale clouds, but are creating new value on top of those clouds for organizations at scale. So we're talking about capabilities that fundamentally weren't practical or even possible prior to the ascendancy of the public clouds. And so today at Supercloud 2, we're digging further into the topic with input from real-world practitioners. And we're exploring the intersection of data and cloud, and importantly, the realities and challenges of deploying technology for a new business capability.

I'm pleased to have with me in our studios, west of Boston, Veronika Durgin, who's the head of data at Saks. Veronika, welcome. Great to see you. Thanks for coming on. Thank you so much. Thank you for having me. So excited to be here. >> And so we have to say upfront, you're here, these are your opinions. You're not representing Saks in any way. So we appreciate you sharing your depth of knowledge with us. Thank you, Dave. Yeah, I've been doing data for a while. I try not to say how long anymore. It's been a while. But yeah, thank you for having me. Yeah, you're welcome. I mean, one of the highlights of this past year for me was hanging out at the airport with you after the Snowflake Summit. And we were just chatting about sort of data mesh, and you were saying, "Yeah, but." There was a yeah, but. You were saying there's some practical realities of actually implementing these things. So I want to get into some of that. And I guess starting from a perspective of how data has changed, you've seen a lot of the waves. I mean, even if we go back to pre-Hadoop, you know, we would shove everything into an Oracle database, or, you know, Hadoop was going to save our data lives. And the cloud came along and, you know, that was kind of a disruptive force. And, you know, now we see things like, whether it's Snowflake or Databricks or these other platforms on top of the clouds. How have you observed the change in data and the evolution over time? Yeah, so I started as a DBA in the data center, kind of like, you know, growing up trying to manage whatever, you know, physical limitations a server could give us. So we had to be very careful of what we put in our database because we were limited. We, you know, purchased that piece of hardware, and we had to use it for the next, I don't know, three to five years. So it was only, you know, we focused on only the most important critical things. We couldn't keep too much data. We had to be super efficient. 
We couldn't add additional functionality. And then Hadoop came along, which is like, great, we can dump all the data there, but then we couldn't get data out of it. So it was like, okay, great. Doesn't help either. And then the cloud came along, which was incredible. I was probably the most excited person. I'm lying, but I was super excited because I no longer had to worry about what I can actually put in my database. Now I have that, you know, scalability and flexibility with the cloud. So okay, great, that data's there, and I can also easily get it out of it, which is really incredible. Well, but so, I'm inferring from what you're saying with Hadoop, it was like, okay, no schema on write. And then you got to try to make sense out of it. But so what changed with the cloud? What was different? So I'll tell a funny story. I actually successfully avoided Hadoop. The only time- Congratulations. (laughs) I know, I'm like super proud of it. I don't know how that happened, but the only time I worked for a company that had Hadoop, all I remember is that they were running jobs that were taking over 24 hours to get data out of it. And they were realizing that, you know, dumping data without any structure into this massive thing that required, you know, really skilled engineers wasn't really helpful. So what changed, and I'm kind of thinking of like, kind of like how Snowflake started, right? They were marketing themselves as a data warehouse. For me, moving from SQL Server to Snowflake was a non-event. It was comfortable, I knew what it was, I knew how to get data out of it. And I think that's the important part, right? Cloud, this like, kind of like, vague, high-level thing, magical, but the reality is cloud is the same as what we had on prem. So it's comfortable there. It's not scary. You don't need super new additional skills to use it. But you're saying what's different is the scale. So you can throw resources at it. 
You don't have to worry about depreciating your hardware over three to five years. Hey, I have an asset that I have to take advantage of. Is that the big difference? Absolutely. Actually, from kind of like an operational perspective, which is funny. Like, I don't have to worry about it. I use what I need when I need it. And not to take this completely in the opposite direction, people stop thinking about using things in a very smart way, right? You like, scale and you walk away. And then, you know, the cool thing about cloud is it's scalable, but you also should not use it when you don't need it. So what about this idea of multicloud? You know, supercloud sort of tries to go beyond multicloud. It's like multicloud by accident. And now, you know, whether it's M&A or, you know, some skunkworks team says, hey, I like Google's tools, so I'm going to use Google. And then people like you are called on to, hey, how do we clean up this mess? And you know, you and I, at the airport, we were talking about data mesh. And I love the concept. Like, doesn't matter if it's a data lake or a data warehouse or a data hub or an S3 bucket. It's just a node on the mesh. But then, of course, you've got to govern it. You've got to give people self-serve. But this multicloud is a reality. So from your perspective, from a practitioner's perspective, what are the advantages of multicloud? We talk about the disadvantages all the time. Kind of get that, but what are the advantages? So I think the first thing when I think multicloud, I actually think high-availability disaster recovery. And maybe it's just how I grew up in the data center, right? We were always worried that if something happened in one area, we want to make sure that we can bring business up very quickly. 
So to me that's kind of like where multicloud comes to mind because, you know, you put your data, your applications, let's pick on AWS for a second and, you know, US East in AWS, which is the busiest kind of like area that they have. If it goes down, for my business to continue, I would probably want to move it to, say, Azure, hypothetically speaking, again, or Google, whatever that is. So to me, and probably again based on my background, disaster recovery high availability comes to mind as multicloud first, but now the other part of it is that there are, you know, companies and tools and applications that are being built in, you know, pick your cloud.

How do we talk to each other? And more importantly, how do we data share? You know, I work with data. You know, this is what I do. So if, you know, I want to get data from a company that's using, say, Google, how do we share it in a smooth way where it doesn't have to be this crazy, I don't know, SFTP file moving. So that's where I think supercloud comes to me in my mind, is like practical applications. How do we create that mesh, that network that we can easily share data with each other? So you kind of answered my next question, is do you see use cases going beyond HA? I mean, the HA/DR was, remember, that was the original cloud use case. That and bursting, you know, for, you know, Thanksgiving or, you know, for Black Friday. So you see an opportunity to go beyond that with practical use cases. Absolutely. I think, you know, we're getting to a world where every company is a data company. We all collect a lot of data. We want to use it for whatever that is. It doesn't necessarily mean sell it, but use it to our competitive advantage. So how do we do it in a very smooth, easy way, which opens additional opportunities for companies? You mentioned data sharing. And that's obviously, you know, I met you at Snowflake Summit. That's a big thing of Snowflake's. And of course, you've got Databricks trying to do similar things with open technology. What do you see as the trade-offs there? Because Snowflake, you got to come into their party, you're in their world, and you're kind of locked into that world. Now they're trying to open up. You know, and of course, Databricks, they go, no, our world is wide open. Well, we know what that means, you know. The governance. And so now you're seeing, you saw Amazon come out with data clean rooms, which was, you know, that was a good idea that Snowflake had several years before. It's good. It's good validation. So how do you think about the trade-offs between kind of openness and freedom versus control? 
Is the latter just far more important? I'll tell you it depends, right? It's kind of like- >> Could be insulting to that. Yeah, I know. It depends because I don't know the answer. It depends, I think, on the use case and application. Ultimately every company wants to make money. That's the beauty of our like, capitalistic economy, right? We're driven 'cause we want to make money. But from the use, you know, how do I sell a product to somebody who's in Google if I am in AWS, right? It's like, we're limiting ourselves if we just do one cloud. But again, it's difficult because at the same time, every cloud provider wants for you to be locked in their cloud, which is probably why, you know, everybody now has data sharing, because they want you to stay within their ecosystem. But then again, like, companies are limited. You know, there are applications that are starting to be built on top of clouds. How do we ensure that, you know, I can use that application regardless of what cloud, you know, my company is using or I just happen to like. You know, and it's true they want you to stay in their ecosystem 'cause they'll make more money. But as well, you think about Apple, right? Does Apple do it 'cause they can make more money? Yes, but it's also they have more control, right? Am I correct that technically it's going to be easier to govern that data if it's all the sort of same standard, right? Absolutely. 100%. I didn't answer that question. You have to govern and you have to control. And honestly, it's like it's not like a nice-to-have anymore. There are compliances. There are legal compliances around data. Everybody at some point wants to ensure that, you know, and as a person, quite honestly, you know, not to be, you know, I don't like when my data's used when I don't know how. Like, it's a little creepy, right? So we have to come up with standards around that. But then I also go back in the day. EDI, right? Electronic data interchange. That was figured out. 
There were standards. Companies were sending data to each other. It was pretty standard. So I don't know. Like, we'll get there. Yeah, so I was going to ask you, do you see a day where open standards actually emerge to enable that? And then isn't that the great disruptor to sort of kind of the proprietary stack? I think so. I think for us to smoothly exchange data across, you know, various systems, various applications, we'll have to agree to have standards. From a developer perspective, you know, back to the sort of supercloud concept, one of the components of the essential characteristics is you've got this PaaS layer that provides consistency across clouds, and it has unique attributes specific to the purpose of that supercloud. So in the instance of Snowflake, it's data sharing. In the case of, you know, VMware, it might be, you know, infrastructure or self-serve infrastructure that's consistent. From a developer perspective, what do you hear from developers in terms of what they want? Are we close to getting that across clouds? I think developers always want freedom and ability to engineer. And oftentimes it's not, (laughs) you know, just as an engineer, I always want to build something, and it's not always for the, to use a specific, you know, it's something I want to do versus what is actually applicable. I think we'll land there, but not because we are, you know, out of the kindness of our own hearts. I think as a necessity we will have to agree to standards, and that'll, like, move the needle. Yeah. What are the limitations that you see of cloud and this notion of, you know, even cross cloud, right? I mean, this one cloud can't do it all. You know, but what do you see as the limitations of clouds? I mean, it's funny, I always think, you know, again, kind of probably my background, I grew up in the data center. We were physically limited by space, right? 
That there's like, you can only put, you know, so many servers in the rack and, you know, so many racks in the data center, and then you run out of space. Earth has a limited space, right? And we have so many data centers, and everybody's collecting a lot of data that we actually want to use. We're not just collecting for the sake of collecting it anymore. We truly can take advantage of it because servers have enough power, right, to crank through it. We will run out of space. So how do we balance that? How do we balance that data across all the various data centers? And I know I'm like, kind of maybe talking crazy, but until we figure out how to build a data center on the Moon, right, like, we will have to figure out how to take advantage of all the compute capacity that we have across the world. And where does latency fit in? I mean, is it as much of a problem as people sort of think it is? Maybe it depends too. It depends on the use case. But do multiple clouds help solve that problem? Because, you know, even AWS, $80 billion company, they're huge, but they're not everywhere. You know, they're doing Local Zones, they're doing Outposts, which is, you know, less functional than their full cloud. So maybe I would choose to go to another cloud. And if I could have that common experience, that's an advantage, isn't it? 100%, absolutely. And potentially there's some maybe pricing tiers, right? So we're talking about latency. And again, it depends on your situation. You know, if you have some sort of medical equipment that is very latency sensitive, you want to make sure that data lives there. But versus, you know, I browse on a website. If the website takes a second versus two seconds to load, do I care? Not exactly. Like, I don't notice that. So we can reshuffle that in a smart way. And I keep thinking of Waze. If we have Waze for data where it kind of like, oh, you are stuck in traffic, go this way. You know, reshuffle you through that data center. 
You know, maybe your data will live there. So I think it's totally possible. I know, it's a little crazy. >> No, I like it, though. But remember when you first found Waze, you're like, "Oh, this is awesome." And then now it's like- And it's like crowdsourcing, right? Like, it's smart. Like, okay, maybe, you know, going to pick on US East for Amazon for a little bit, their oldest, but also busiest data center that, you know, periodically goes down. But then you lose your competitive advantage 'cause now it's like traffic socialism. Yeah, I know. >> Right? It happened the other day where everybody's going this way up. There's all the Wazers taking. And also again, compliance, right? Every country is going down the path of where, you know, data needs to reside within that country. So it's not as like, socialist or democratic as we wish for it to be. Well, that's a great point. I mean, when you just think about the clouds, the limitation, now you go out to the edge. I mean, everybody talks about the edge in IoT. Do you actually think that there's like a whole new stovepipe that's going to get created? And does that concern you, or do you think it actually is going to be, you know, connective tissue with all these clouds? I honestly don't know. I live in a practical world of like, how does it help me right now? How does it, you know, help me in the next five years? And mind you, in five years, things can change a lot. Because if you think back five years ago, things weren't as they are right now. I mean, I really hope that somebody out there challenges things 'cause, you know, the whole cloud promise was crazy. It was insane. Like, who came up with it? Why would I do that, right? And now I can't imagine the world without it. Yeah, I mean a lot of it is same wine, new bottle. You know, but a lot of it is different, right? I mean, technology keeps moving us forward, doesn't it? Absolutely. Veronika, it was great to have you. Thank you so much for your perspectives. 
If there was one thing that the industry could do for your data life that would make your world better, what would it be? I think standards for like data sharing, data marketplace. I would love, love, love nothing more than to have some agreed-upon standards. I had one other question for you, actually. I forgot to ask you this. 'Cause you were saying every company's a data company. Every company's a software company. We're already seeing it, but how prevalent do you think it will be that companies, you've seen some of it in financial services, but companies begin to now take their own data, their own tooling, their own software, which they've developed internally, and point that to the outside world? Kind of do what AWS did. You know, working backwards from the customer and saying, "Hey, we did this for ourselves. We can now do this for the rest of the world." Do you see that as a real trend, or is that Dave's pie in the sky? I think it's a real trend. Every company's trying to reinvent themselves and come up with new products. And every company is a data company. Every company collects data, and they're trying to figure out what to do with it. And again, it's not necessarily to sell it. Like, you don't have to sell data to monetize it. You can use it with your partners. You can exchange data. You know, you can create products. Capital One I think created a product for Snowflake pricing. I don't recall, but it just, you know, they built it for themselves, and they decided to kind of like, monetize on it. And I'm absolutely 100% on board with that. I think it's an amazing idea. Yeah, Goldman is another example. Nasdaq is basically taking their exchange stack and selling it around the world. And the cloud is available to do that. You don't have to build your own data center. Absolutely. Or for good, right? Like, we're talking about, again, we live in a capitalist country, but use data for good. We're collecting data. We're, you know, analyzing it, we're aggregating it. 
How can we use it for greater good for the planet? Veronika, thanks so much for coming to our Marlborough studios. Always a pleasure talking to you. Thank you so much for having me. You're really welcome. All right, stay tuned for more great content. From Supercloud 2, this is Dave Vellante. We'll be right back. (upbeat music)