Clip #3: The shifting sands of big data and where Snowflake & Databricks fit
Clip Duration 11:00 / May 27, 2023
Breaking Analysis: The future of AI is real time data…Meantime GPUs are making all the headlines
Video Duration: 47:02

From the Cube Studios in Palo Alto and Boston, bringing you data-driven insights from The Cube and ETR. This is "Breaking Analysis" with Dave Vellante. The era of AI everything continues to excite, but unlike the internet era, where any company announcing a dot-com anything immediately rose in value, the AI gods appear to be more selective. Nvidia beat its top-line whisper number by more than $300 million, and the company's value is rapidly approaching $1 trillion. Marvell narrowly beat expectations today, but cited future bandwidth demand driven by AI and the need to connect multiple accelerators, like GPUs, in data center-sized clusters. And that stock is up more than 20% today as well. Broadcom is up nearly 10% on sympathy, with the realization that the connectivity-centric trend beyond the CPU is what Broadcom does really well. Meanwhile, other players, like Snowflake, which also narrowly beat earnings Wednesday and touted AI as a tailwind, got hammered as customers dialed down consumption, similar to the trends that we've seen in cloud momentum, but again, Snowflake's up today with the tech momentum.

Hello and welcome to this week's Wikibon Cube Insights powered by ETR. In this "Breaking Analysis," we look at the infrastructure of AI, examining the action at the silicon layer, specifically around Nvidia's momentum. And since much of AI is about data, we'll look at the spending data on two top data platforms, Snowflake and Databricks, to see what the survey data says. And to do so, we have a special "Breaking Analysis" panel. We're changing up our normal format. We're here with John Furrier, who's in studio, and our good friend David Floyer. John, awesome to see you. Thanks for coming. You're in town this week for Red Hat Summit. And David, good to see you. We were talking earlier about your life. We asked, "How are you enjoying your retirement?" And what'd you say? Busier than ever. >> Yeah, well, I'm glad you're still on top of the trends here. Let's get started. Two years ago, Floyer and I published this. Basically it was a roadmap of Nvidia's plan to take a massive chunk out of Intel's general-purpose data center dominance. And our positive outlook on the company's prospects was specifically related to its software expertise and its end-to-end capabilities. Not just the GPUs, but the tens of thousands of other components and the networking and the intelligent NICs and that whole stack. And guys, if you look at Nvidia's results, that vision appears to be coming to fruition. Nvidia's valuation is now around eight and a half or nine times that of Intel. And John, you called ChatGPT the web browser moment and Jensen calls it the iPhone moment.

Either way, Nvidia blew away its numbers with a $670 million revenue beat. It said its second-half supply is going to be significantly better. It's really hard to get product today. But David, let me start with you. We were pretty much right on two years ago. Nvidia's value, as I said, has way, way surpassed Intel's and has been a massive catalyst for not only Nvidia, but the entire industry. >> I agree, and what's driving it is parallel computing. It's built on the necessity to bring many, many, many more CPU cycles, simpler CPU cycles, into the game. And Nvidia have led with their GPUs. And when you look across all the other companies, you see people like Tesla have invested very aggressively. They've invested far more in neural networks. You see Apple have invested heavily, again, in neural networks. All of it parallelizing the hardware to bring much, much more CPU, simple CPU technology to bear. And people like Intel in particular, they have just fallen far, far behind. And I personally don't believe that Intel is in a good place at all, or that they'll survive long term, but- >> But you felt that way for a while, David. I mean, we're going to come back and talk about the competition, but John, you've seen many movies. Yeah, what do you think, how is this one? Is it similar? Is it different? How is it different? Is this an incumbent trend? Is this a disruptive trend? Is it both? >> No, I think it's just a continuation of what David was saying. Nvidia was well positioned from day one with the GPUs. They invested in the software stack and hardware, and they captured the early imagination and growth of the crazy hyped-up markets. First crypto and now AI. If you look at the crypto craze, they really pumped out a lot of stuff on the GPUs. So good for them. Now that's kind of in a nuclear winter. Now the shift goes to AI, and it's all about cloud-optimized GPUs. So we're starting to see that hyperscale next gen happening. And I think the issue with Nvidia is, do they maintain the margin as more competition comes in? Because, you know, what's the old Jeff Bezos expression? Your margin is my opportunity. I think there's going to be a competitive battle. And I think Intel will still have nice server dominance, you're seeing Xeon with classic customers and their OEMs.

But in terms of these emerging markets, you've got to have the connectivity at the chip level. Cloud-optimized silicon, I think, is a real big game. And all the action will be at the physics. And AI is not just ChatGPT, this physical layer is huge. And I think this is going to be a massive wave. It reminds me of the OSI model. The physical layers got taken care of first and then it kind of stopped at TCP/IP. But I think everyone gets all hot and bothered by ChatGPT 'cause it shows the ubiquity of the magic. But really, the action will be in the cloud-optimized silicon. Jensen is saying the data center is going to be called the AI factory. And basically he's saying there's going to be a massive shift in spending toward AI-powered or accelerated computing, and obviously that's self-serving, but they're also very well positioned there. >> All right, The Cube, as you know, John and David, we have an awesome community. And I recently had a conversation with a really deep AI expert who told me this. He said, "The people doing AI love Jensen because he's a baller," in reference to like a basketball player, a person with game. "But if he really wants to democratize AI, he needs to lower prices. We need more competition. We'll see new GPUs from AMD. We'll use Intel." He said, "We use Intel GPUs even though they're suboptimal because we need other sources." So with that as a backdrop, I want to look at some of the silicon competition to Nvidia.

We can riff on this and some of the other firms possibly getting an AI boost. I mean, it's interesting when you think about, you know, that comment, Nvidia, you mentioned the margins. Their margins are in the high 60s, they have Intel-like margins. Yeah. >> You remember Intel's monopoly, it's sort of following that same track, and supply is low, so they can charge for it. David, we talked about Nvidia, AMD's going to have a solution for sure, Intel we mentioned. What about Arm? You mentioned earlier, Tesla and Apple, they do their own GPU designs with Arm. And you and I talked about this before, when we thought Nvidia was going to be able to acquire Arm, you felt that eventually Nvidia would, or maybe should, adopt a more Arm-like architecture. They have an Arm-like architecture, but really drive that. But we talked about maybe the DNA of Nvidia wouldn't allow that. What's your take on that now? >> Well, they have gone very hard on GPUs and they fitted the problems to GPUs extremely well. The CUDA software is brilliant. They have both the hardware and the software. When you look across the market, I'm always interested in what personal computing is doing, what consumer computing is doing. And Apple have led there for a whole number of years with AI. They put in neural networks. They're very, very expert in doing that. So I think neural networks will take a significant role and I think Nvidia will go in that direction as well.

They'll use both the GPUs and they'll put in more neural networks. If you're looking at people like Tesla, they've invested very heavily, again, in neural networks. They do not want to use GPUs, particularly for inference work. And both their central work and their distributed work in the cars are very, very strong on neural networks. But the main key of this is bringing the cost per cycle of a CPU down, bringing much, much more compute power to problems, to AI problems specifically. And it's not just ChatGPT, it's all the other parts of AI that will continue to drive the whole market, in my opinion, towards a dramatic amount of automation. Automation, for me, is the end product of what all of this is about. The ability to automate things, which we just couldn't do before. >> Yeah, hold that thought, because we're going to come back to that topic of automation. Alex, bring up that last slide again, will you? 'Cause I want to get John's input here, because the other competition here, John, is you've got the hyperscalers. AWS is, we know, designing its own chips. Google and Microsoft have announced AI products, GPU-class products. Alibaba went with RISC-V recently. And we know China looms large, there's a lot of activity in China. What are your thoughts on the hyperscalers participating in this market? >> Well, I think they're going to be a factor. In fact, David Floyer wrote some research a couple years ago. I remember that hyperscaler research talking about the physics aspects and Amazon's investment in the physics. James Hamilton, who used to run all that, now Peter DeSantis does, they've been doing this for a long, long time and they're not new to AI. And Microsoft came out with that big announcement, which was very scripted. It was very much to capture the hearts and minds of the press and the world. They did that with ChatGPT. They paid $10 billion for it. But they don't really have a lot of silicon experience. They have data center experience running large-scale infrastructure from their MSN background going forward, and we kind of know that history. But Amazon Web Services has more silicon experience and has been working to optimize that, squeezing every ounce of productivity out of the equipment.

I mean, every small improvement on the physics will yield results. So I think that's a big factor. We'll see if they can turn that into a service. I know for a fact AWS is now aggressively putting generative AI out there in their messaging. They're trying to reeducate the world that they have been doing AI for a long time. So I think there's a consumer vibe here, but I think Amazon's well positioned. And again, I think another level of competition and new companies are forming from the crypto bubble bursting. There are literally crypto mining companies out there that have essentially data center-quality, hyperscale-quality GPU farms. So you're going to see that as a potential new entrant that might come out of left field. But again, it's going to come back down to who wins in these hype cycles. The hosting providers? In the web era it was, you know, buying boxes, hosting them, paying for bandwidth, Exodus Communications and others like that.

So here you'll see hyperscalers play well and then upstarts fill the gaps where there are needs. But I think there's going to be a surge back to data centers. If you can get some GPUs, you can rack and stack some GPUs and make that an on-premises value. You don't really have to overprovision the GPUs. If you know what you need, you can just do it. So I think there'll be a pop in on-premises and a mix of hyperscalers for the elasticity. >> Thanks for that. David, do you think the example of Annapurna with AWS, and John, I'd be interested in your opinion here too, because AWS started with Intel, wasn't happy with the performance and I'm certain it wasn't happy with the price, 'cause it had to pay Intel-class margins. But then they still use Intel extensively, don't get me wrong, they're a big partner of AWS, but they went from Intel and then they brought in AMD. That didn't do what they wanted. That didn't satisfy them. So they started partnering with Annapurna, then they bought the company, then they did their Arm-based designs. They brought that in house. Do you think they can take a similar approach with respect to competing with Nvidia to lower their cost, lower the cost per query, lower the cost of energy, which is another big thing?

Or is it, as Andy Jassy says, there's no compression algorithm for experience? Nvidia's been at this for 15 years, since they sat around the Denny's trying to figure out the future. What do you think, David? >> Well, the Annapurna acquisition, was it $300 million, was the best money they ever spent. That was just amazing value. And yes, they have improved, they've separated out the control and put all of that on Arm and offloaded that from the processors, the Intel processors. So they have done that part of it extremely well. They also do have some capabilities in AI. I think they are going to buy a lot of Nvidia stuff, that's clearly the market leader. They will try and improve their own, but they will buy Nvidia as well.

But so will all the other people in the marketplace, in the hyperscale marketplace. I don't think Microsoft is yet up to speed to be able to produce such chips. I mean, the quality of the Nvidia chips, the size of them, the degree of integration, the Arm technology that they're using and the TSMC process that they're using, are state of the art. And I don't think, in my opinion, that AWS will be able to avoid buying from Nvidia as well. >> Well, and Jensen stressed in the call, as he does many times, that they've got like 35,000 components on their system beyond just the GPU. But John, there's also a lot of startups in this game. And so could AWS potentially, like it did with Annapurna, pick up one of those startups and then change the game, like it did by using Arm? Maybe instead of going with a classical GPU, brute-force approach, they can maybe use some kind of neural network? >> I'm not sure, Dave. I think David pretty much nailed it. It's hard to replicate, the barriers to entry are high. I mean, the old expression, better to be lucky than good sometimes. And the Annapurna acquisition, I think, was so much value that that doesn't happen very often. I think of deals like YouTube, you know, "Oh my God, they paid a half a billion dollars for YouTube," and they did $41 billion, as an example of one of the best, or VMware getting bought by EMC. Those only happen once in a generation. Is there potentially a startup out there? Maybe. Can they get the magic formula? It's hard. So I think Amazon will continue to buy Nvidia, like David said, but also try to build it. They have to, I mean, if you're Amazon, you're not going to sit around and wait, or have a dependency, because they're going to be at the beck and call of Nvidia in terms of pricing. >> One other thing, Alex, if you can bring up that logo slide again. I was catching up on my Warren Buffett last weekend. He sits there for six hours with Charlie Munger, drinking Coke and eating peanut brittle. And one of the things he said, they asked him, "Why did you sell TSM?" And he said, "It was a great stock. Great company, I love the company." But he's just concerned about the geopolitics. I mean, and we got China on this slide. We know China's really trying to get self-sufficiency. I mean, to the extent that China were to take over Taiwan, that is a huge risk. I mean, Nvidia people seem to be ignoring that now with the stock headed toward a trillion-dollar valuation. But that could be a wild card to watch out for. All right, let's shift gears and look at Snowflake's quarter and talk about where it fits in AI.

The reason we say Snowflake catches a cold is because they narrowly beat but were very cautious about the outlook, citing more tepid consumption patterns relative to the past, like the cloud guys see. But they're still really strong, by the way, in customers, but they're optimizing, they're doing things like reducing retention policies, which lowers storage costs, and that means lower revenue. It makes queries run faster, so that means less compute and that means less revenue. But the other thing is CFO Mike Scarpelli, the poor guy was hacking all through the earnings call. He's really sick. But the thing I want to talk about is Snowflake's play to be the iPhone of data apps, or maybe it's better to say the App Store, if you will. They want to be the best platform to build data apps. Better than hyperscalers, better than Databricks, better than anyone.

And they've made some acquisitions, like Applica, which is a large language model. And now Neeva, which to me kind of supports the vision of Snowpark, bringing together search and generative AI and NLP. John, I know you're high on Databricks, who's by the way doing very, very well. We're going to show that in a moment. And David, you're not as optimistic on Snowflake and it kind of brings us to realtime data. David, let's start with you. What's your position on this? Well, I've said for many years that the key for companies is to do what Uber did when they introduced their software, which is to reduce the number of people in 10 years by 10 times. 10 in 10. That's my belief of what companies need to be able to do to fend off newcomers coming in with new architectures of designing companies around AI and around the ability to have very, very few people because of automation. And that automation will be distributed all around to the edges where necessary. In other places, it'll be centralized.

But that automation is what is key. And to provide automation, you have to have the transactional data coming straight to you and you have to have the analytic data there as well, as much as you can. And they have to share the same databases. They have to minimize the amount of time, the elapsed time, between everything to be able to drive the automation in real time. So it's realtime automation, and the systems that provide that are the ones that I think, in the long term, will win. Now in the short term, in my opinion, Snowflake will do very well. But in the long term, I don't think their architecture will support what I've just been saying.

You have to have a much more direct path from where the data sources are to the applications running very, very close to those data sources. And I don't see their cloud approach doing that. It'll provide some goodness in that setup, but it won't be the center of it. >> Now John, on Databricks, I want to sort of follow up on that, because basically I'm inferring from David that you're saying the cloud, the remote cloud, is not the place to do realtime inferencing at the edge, for example, but we may come back to that. But John, the Hadoop crowd largely was subsumed by Databricks, right? Spark and the cloud killed Hadoop, and Databricks is doing very, very well. We'll show some data in a moment. And you're really high on Databricks. I know you've interviewed Ali Ghodsi many times. And you're sort of close to that Silicon Valley vibe. What's your take on all this? >> Well, David's right, I mean I think there are two things going on here in the data infrastructure world. One is, we went from 2010 Hadoop to Spark and then the stuff that we've been covering and reporting on, where Databricks and Snowflake have taken advantage of the big data wave that actually played out. I think AI is going to absolutely change that. I have a radical view on this. My thesis is that the infrastructure opportunities to change the platform will shift. And I think that's going to put Databricks and Snowflake both on their heels relative to whether they can use their market power to either acquire, take new territory, or take the right high ground in what AI will force with automation. David mentioned a few things, transactional data, I think the notion of databases will go away completely.

I think it'll be invisible. I think with automation and AI, developers will program data and they need access to data because the speed of insights will come down to not physics. Yeah, latency will be there, it'll be table stakes, moving packets from point A to point B. Time to insights will be what data's available to prompt and tune with automation. So this whole prompt engineering ChatGPT concept that people are now learning in the mainstream is essentially a call to its dataset. So if you don't have data available or data as code and manage all that compliance, you might be on the wrong side of the insights, which is going to be where the value is. So I think there's going to be an absolute platform re-shift on what will be available for data. And I think this idea of databases will just be automated. Data will be stored, it'll be managed for AI, not the other way around.

And where data is stored will be dictated by the developers and the applications, per David's point about transactional data. So I think there's going to be a complete script flip happening. And to me, I like Databricks because they're more open source. So the question is, does proprietary win or open source? So Snowflake's like the iPhone and Databricks is like Android, I guess, would be my weird example. But I think today those guys are, Databricks is going to do well 'cause they're open source and they do well with the cloud. Snowflake does great, 'cause they've got a great data cloud positioning, but it's basically a data warehouse. So I think there'll be a shift where they could be toppled over faster than what they did to data warehousing and Cloudera. So to me, it took, what, six, seven years? I think it could happen in two years. And see, I'm more- >> Less than two years. And I'm higher on Snowflake 'cause I don't see it as a data warehouse. I do see it as a data platform. They call it a data cloud. And I think for example, to your point about open source, they support Iceberg table formats. And you might say, "Well, that's a checkbox item," but if you actually look at that, here's my thinking: if they can make building applications easier than anybody else, better and easier than the cloud, et cetera, and they can make queries run from Iceberg tables and they can do them faster and more cheaply than anybody else, then they can increasingly do that with other open source platforms. So that's going to be interesting to watch. >> If I can interject one thing, I was just watching CNBC, they're talking about the VCs and all these hot takes around AI. Most of 'em are just anecdotal, recycled pattern matching from other waves. But I think, when you talk to the smart VCs, the trend that's going on right now that a lot of the smartest people that I know are looking at is developer traction. If developer communities pick up a tool, and it could be just a simple tool, it doesn't need to be a platform, and there's consensus around making that faster and increasing productivity, those technologies get adopted very fast. It's almost like crowdsourcing of developers, and some call it B2D, business to developers. If developers pick up a tool and it works well and it solves a problem, there's mass adoption that shifts the entire market. These are the kind of things that you're going to see. And I think with data, no one's actually built tools for programmers and app developers to manage data.
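To put the Iceberg and developer-tooling points above in concrete terms: the appeal of an open table format is that a developer can read the same table directly, with whatever engine or client they like, independent of the platform that wrote it. Here is a minimal, hypothetical sketch using the open-source PyIceberg client; the catalog name and table identifier are made up for illustration, and this is not Snowflake's or Databricks' API.

# Hypothetical sketch: reading an Apache Iceberg table with the open-source
# PyIceberg client, independent of whichever engine wrote the table.
# The catalog name "demo" and the table "analytics.events" are made up.
from pyiceberg.catalog import load_catalog

catalog = load_catalog("demo")                  # catalog config comes from PyIceberg settings (e.g. ~/.pyiceberg.yaml)
table = catalog.load_table("analytics.events")  # hypothetical namespace.table

# Push down a filter and a column projection, then materialize locally.
df = table.scan(
    row_filter="event_date >= '2023-05-01'",
    selected_fields=("user_id", "event_type", "event_date"),
).to_pandas()

print(df.head())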

It's always managed by somebody else, a database administrator, someone in IT. So the data placement and the data management is handled by non-developers. That potentially is a wild card, Dave, and I think that to me is where, if you see something happen in open source where developers get traction, the app market could change significantly. That's where the refactoring could be. That's what David's point was saying is that that value shift could happen very quickly. And that's, David, that's Snowflake's big bet with Snowpark, that that is going to be the developer platform. So they've got, obviously they have a challenge to attract developers. I think they're going to try to redefine the concept of developers. But you had another point that you wanted to make, but please go ahead. >> Yeah, yes, if I may. I think in the short term that you are right, but if you are going to automate, the impact of data to inform people goes down. So automation is getting rid of all of the people and automating every aspect of that. Now, you need data in order to run the transactional system. And it needs a lot of data from all sources in order to optimize.

But it is an optimization as opposed to an essential requirement. It'll get as much data as it can to be able to run in real time. And real time is different in different industries. In Uber it's not subsecond. You've got seconds, but it's still pretty fast that it has to be able to do things. Other things will be actually real time to preserve the consistency. And if you look at the best ways of providing consistency, databases have been and always will be the best way of doing that. The difference is that they will be distributed, they'll be distributed round, they'll be in the center and they'll be all around the places where the data comes from and you'll want that database because that's the most efficient way to program.

The idea of the NoSQL database, where the programmer looked after it, that died years ago. And I don't believe that's going to be resurrected. You need databases. Are they the same databases? No. And there'll be lots of opportunity for new databases. But the fundamentals of a database, a single database which does your transactional work and your end-user work, I still think will be here for the next 20, 30 years. I mean, I don't see any way that you can avoid having that technology, which gives so much, takes so much off the plate of the developers. And going to the developers, I mean, the developers will use lots of new tools, there's so much opportunity to provide developers with tools.

I totally agree. And there'll be two types of development. There'll be the central development that runs the automation, and then there'll be the massive amount of data coming from everything that will be there and will provide data for improving the AI and improving things over time. But I personally believe that the database will stay, it'll be a distributed database, and that will be how automation is put into place in reality. And if they don't do it in the next 10 years, they will be out of business, in my opinion. >> I still think the BI market's huge and I think that's really where- Yeah, no, it's- >> It's going to be a lot of money, and I think that a lot of that data's still going to get into the data platform and it's going to be analyzed, but you're right. I mean, it's not going to be, Snowflake wants everything to be in Snowflake. That's a prerequisite, so it can govern it, and that's its value proposition. And they're not going to be doing distributed queries unless something new comes along. And to both your points, I mean that's what Bob Muglia's working on, this new type of database, so. All right, let's bring up the next slide, which looks at some of the ETR data on Snowflake, and we'll get into that a little bit. I'll set up the slide, and if you guys have comments, that's cool. If not, we've got a couple of data slides here. What this shows is a progression over time of the net score granularity for Snowflake, and net score is a measure of spending momentum, spending velocity, that's that blue line. And you can see how it's come down. Back in January '22, Snowflake peaked in the 80-plus percent range.

They were at like 83, 84, 85% net score, which is incredible. But you can see it's come down substantially. And that yellow line is called pervasiveness. That's the presence in the survey. So it's like the survey mind share or market share inside the survey, not in the specific market. The bright green, that's new adds. The forest green is spending 6% or more. The gray is flat spending. And you can see, that's the big trend, right? It's gone up, people said, "All right, the budget is flat." Slootman said in the earnings call, "The CFO is in the business," (laughs) which I got to laugh at, because I guarantee his CFO is in the business. Mike Scarpelli is all over their business. You know they got that covered. If you go back to that slide, Alex, please, the pinkish area, that's spending less and the red is churn.
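As an aside, to make the net score mechanics just described concrete, here is a minimal sketch of how those five buckets roll up into a single momentum number, assuming net score is simply the adopting-plus-increasing share minus the decreasing-plus-churning share. The bucket percentages below are made up for illustration, not actual survey results.

# Illustrative only: how a net score rolls up from the five spending buckets
# described above. The percentages are placeholders, not ETR data.
def net_score(new_adds, spending_more, flat, spending_less, churn):
    # Flat spenders don't move the score in either direction.
    return (new_adds + spending_more) - (spending_less + churn)

# A peak-like mix vs. a mix shifted toward the fat middle of flat spend.
peak = net_score(new_adds=20, spending_more=65, flat=12, spending_less=2, churn=1)
today = net_score(new_adds=10, spending_more=45, flat=38, spending_less=5, churn=2)
print(peak, today)  # 82 vs. 48: momentum compresses as flat spend grows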

And I really want to make this point. It's not like they're losing customers. This is very similar to the cloud guys, this is very sticky. So the churn is virtually non-existent, but that shift toward the fat middle of flat spend and some of the big customers spending less because of optimization, has brought this down a little bit. I don't know if you guys have any comments on this. I got some, actually, let's bring up the next slide too, and then you guys can comment. The next slide really talks to, it compares Snowflake and Databricks over time, and we added Streamlit as well, which is like a Python, you know, tool chain for data scientists, which is an acquisition that Snowflake made. But you can see what the progression has been for the last several quarters. Snowflake, as we said, is coming way down. Databricks and Snowflake, in terms of momentum, are starting to converge.

But look at the move that Databricks has made to the right. That's the presence, and the overlap of customers is also significant. I forget the exact numbers, but there's a very large number or percentage of shared Snowflake and Databricks customers. And the rap is always, you know, a lot of people say, it's going to be easier for Databricks to get into Snowflake's territory. That's what they're trying to do with their Lakehouse. And others say, "Well, I think it might be easier for Snowflake to get into the data science world." And you know, time will tell. But John, do you have any thoughts on all this data? >> Well, I think the churn data, the slide with the churn data, is accurate. There's not a lot of churn. It's a market slowdown on spending, given the economic headwinds. But also when you have these big market movements, like AI right now that's hyped up, it tends to freeze the buyer market. People have to take a pause. Most of the cycles that I've lived through have had that kind of, let's wait and see, plus the headwinds. We've seen the refactoring, not refactoring, cost optimization. I don't think Snowflake's going away. They've got a great business model and they were way ahead of Databricks. Databricks, however, has gotten traction in the open source community and their products have been getting more robust. They've been adding more announcements. They also rode the AI wave by introducing Dolly, kind of a joke on DALL-E. So it's an AI product. And they're smart, and so I think they're just taking territory naturally because they have a fit with the market.

And with Amazon, they've got good traction with AWS and other clouds, they're obviously going multi-cloud with their approach. So again, Databricks and Snowflake, similar products, but just different approaches. And again, I think open's going to win, but that's my opinion. But I think the slide's natural. I think Databricks has got traction. >> David, I want to ask you, so on the earnings call, Raimo Lenschow, who I really like, he's a sharp analyst, asked the exact question that I would've asked had they let me ask a question, about that shift in data retention policies. Snowflake cited one large customer that shifted from retaining five years of data down to three years. Raimo asked, "Is that potentially a long-term trend?" I've seen this, I saw on Twitter the other day, I don't know, it was some developer, or maybe it was some Databricks snark. There was this big ocean liner and it said, "This is all your Snowflake data." And then it had a little tug, you know, a little speedboat, "And this is how much data you actually use." The implication being we're spending all this money on Snowflake, but we're actually not using it that much. So I thought Raimo's question was a good one.

And I think Slootman's answer was "No, this is a short term trend," or, "It's a trend that will end." He didn't say it's short term. He said, "This is a trend that will end. We've seen it before." Now you and I have talked about this for a long, long time. It's easier just to leave all the data in there. If you have a budget constraint, now you got to go recapture wasted space. What are your thoughts on that, on two things: this data overall, and then specifically that question that Raimo asked? >> Well, I think he was asking much more on the short term and I think exactly as you said. There will be a reduction of data if you run out of money to hold it, et cetera. The most- >> Yeah, but what about long term, what about long term? Do you agree with Slootman? We'll come back. >> Well, that's what I'm just coming to. >> Yeah, sorry, go ahead. >> So the most valuable data is the data that's freshest. The value of data goes down dramatically as it ages. So just keeping data for data's sake is not particularly valuable. And the best way of doing it is to extract the value of the data as quickly as you can. So when you look, for example, at Tesla, in the car, they extract all of the value of the data, and then they hold it for 10 minutes, just in case there's an accident and they need to keep that data. But after that, it's thrown away. And if you think about it, the thought of having to capture all of that data, it would be mind-blowing.

I mean, it would just be so astronomical. So they throw it away. But they've captured the raw part of that data. So for me, when you're doing automation of any sort, most of that data is going to be thrown away and you've got to extract the data very close to the source of the data and then that will reduce the amount of data. So I am not a fan of the thought that the data is growing so rapidly and there's going to be large, large sources of it. It is at the moment because that's the way that data has been stored, it's had to be stored, but I think they're going to be far more efficient ways when you come to automation of just keeping the data for a certain length of time and then you've extracted the data and the amount of data that will be stored will go down over time. That's, sorry, won't go down. Yeah. Oh yeah. >> It won't go up the exponential curve that it's doing. Well, we'll see. We'll see. 'Cause I think, my bet would be that you're right, that there'll be maybe a larger proportion of the data that we would've normally kept, is going to be ephemeral, but I think it's going to be so much data created that actually, the amount of data is actually going to go up exponentially in that (indistinct). >> It's not going to go up at the same rate it's going up at the moment. I think we'll see, I think the curve is going to bend even steeper. We disagree on that, but you're smarter than I am, so we'll see. >> Well, no, no, the need for the data, I think what I see his point is, the need to store all the data in some use cases is not needed. The Tesla's a great example. >> Yeah, but it's easy. Yeah, but we're in a whole new industry now, and this is going to, again, this is the radical view. If you have a data language like OpenAI, which is massive amounts of large language model, and with the foundational model stacks, I believe there's going to be hyper data scalers, for lack of a better description. There will be some companies who will emerge who will be the cloud-like scale for data, and those might be brokers, that might be a new service, but at some point, someone has to have that for foundational models. Because what we're seeing with the foundational models, the big get bigger and big get richer, right? So you're going to have, like Amazon, as Amazon becomes so big, there's only three hyperscalers. So I think there's a model where an application, I don't want to have to create a data infrastructure. I'm going to be using my data for real time or whatever the use case is. I'll throw it away and then have my domain data, which is proprietary and more value, maybe the fresh data, as David says.

And I'll program that into a large language model, which I could either own or have someone else do for me. So I think that might be an interesting scenario. Now, I'm not saying that exists today. I'm just saying that the debate going on right now is, there is value in large data sets for AI. Historical data, patterns, all kinds of stuff, training. And if you look at the foundational model stacks right now, you have financial model ops, FinOps, FMOps, underneath the foundational models and the tooling sits on top of it. So the middle layer is the training and ingestion, then the foundation models develop, then you have the tooling that drives the apps. So that's a completely different animal than we're seeing in anything out there today. So, you know, you got to, who knows how it's going to play out, but there is an argument that says, "Hey, why not, why can't there be a data cloud?" Like a legit data cloud, not like a Snowflake or Databricks.

I have no idea if that's going to happen, but I could see that being valuable. >> Wait, why is that not a legit data cloud? You're saying because it doesn't incorporate the real time- >> No, David's teasing out a great point. I don't want to pay for storage if I don't need it, but if someone else does it for me, they can build the CapEx and I'll use it as I go and almost donate into like an open source model. Oh, I see. Okay. I want to end, speaking of layers, I want to talk about the future of data apps and automation. And this is a slide that George Gilbert created and he and I are going to take another stab at this actually prior to the Snowflake Summit and Databricks Summit, we're going to be unpacking those two companies again. But basically the premise here is that applications are moving from a process-centric world to data-centric, and the point being that increasingly, instead of data residing in silos within, sort of, the data really being attached to the business process, flip that, the business and the process is going to be embedded in the data. So just sort of different look in the model. And this model here is, it's got the apps underneath, the enterprise apps underneath.

It's got this data layer, and we use Uber as an example. You've got riders and drivers and destinations and ETAs, you've got maybe products that you're delivering, et cetera. And those are all data elements or data products that have coherence through a semantic layer and then move up the stack, and that's the apps. And this world is a different world. It probably requires new databases. If you look at what Uber's doing, they're using Google Spanner and they've built a layer on top of that to basically not have to make the trade offs that they had to make prior to 2016. And David, you have been making a point all through this session about automation, that that is the key driver of value creation. So pick it up from there and explain what you mean. So yes, I mean Uber is a really classic example. You look at the number of people in Uber compared with their revenue, it is, the revenue per person is out of this world. It's unlike any other company in existence. So that's amazing that they have done that and they have automated, completely. They've taken all of the data from the cars, from the people, about the streets, what the road conditions are, and they're able to use that to optimize and run the whole company. David, David, let me interrupt you. Uber's, I just Googled it. Uber generated $971,000 per employee in 2022 compared to about 600,000 in 2021. That is astounding, I mean, a typical software company, you're lucky if you're at 225, $250,000 per head. >> Absolutely. I mean it's millions of people. So, that's the way they've done automation and that's the model that everybody will have to follow if they're going to keep their own current company solvent, because in my view, the usage of these AI tools, the usage of the realtime data, will bring in competitors and startups, which will have just a much, much easier way of getting to that scenario, rather than have to migrate from their current systems to that scenario.

So that's, to me, the real, real threat long term. If they don't do that 10 in 10, a tenth of the people in 10 years, they will be really exposed to a lot of startups coming in. And if you think about Elon Musk, I mean he has come into the car industry and he has revolutionized the cost of doing things there, and he's produced software cars as opposed to hardware cars. And he's done the same with SpaceX. So again, you look at his ability to drive productivity in those two industries, that will have to be done by everybody, in my opinion, for them to survive. And that's not to say there isn't a lot of other data beyond the realtime data, but that's the one that's valuable. That's the one you have to capture. That's the one you make your decisions on in real time, which is what Uber has done. >> Well, and the last question I asked on that previous slide was, could this be a disruptor or can incumbents capitalize? I mean we're at Dell Tech World this week, you were at Red Hat Summit, which is obviously owned by IBM. Both of those companies are going to be able to take advantage of AI, Snowflake's going to be able to take advantage of AI. So is ServiceNow, so is Oracle, Salesforce, et cetera. But it feels like, John, that there's going to be some new model, like keyword search, which you were very heavily involved in in the early days, which people maybe poo-poo or overlook, and it then becomes a dominant play in the industry. I'll give you the last word. >> I mean, I think what David's saying is a whole new paradigm shift is happening. Replatforming, refactoring, startups are going to come out, new unicorns will be born in this wave. And it comes back down to simplifying things, making it easier and reducing the steps it takes to do stuff. And the web, I think, showed that the best. It came out, it was laughed at early on. Things were missing on it. But those incremental improvements came from what was a nascent market that then went mainstream. And when it went mainstream, those startups that worked on it captured the value and the rents. So I think AI's going to have a lot of that, but the big players are also going to be involved and they'll be the beneficiaries of that value. So I think it's going to be more like the web and the internet than the iPhone. >> Hey John and David, thanks a lot for coming on "Breaking Analysis" today. This was a great session. David Floyer, it was so good to see you. You've been such a good friend over the years and wishing you all the best as always. Thank you. All right, thanks also to Alex Myerson, who's on production and manages the podcast, Ken Shifman as well. Kristen Martin and Cheryl Knight helped get the word out on social media and in our newsletters. And Rob Hof, who's our EiC over at siliconangle.com. And don't forget, check out all the videos at thecube.net. These episodes are all available as podcasts. All you got to do is search "Breaking Analysis" podcast. Appreciate you subscribing. I publish each week on wikibon.com and siliconangle.com. If you want to get in touch, email me directly, david.vellante@siliconangle.com. You can pitch me, you can DM me @DVellante, you can hit me up on LinkedIn, on our posts. If you got a good pitch, I'll definitely respond. If not, don't take offense. We got a lot of them. Please do check out etr.ai. They got some great survey data in the enterprise tech business. This is Dave Vellante, for our guests, for the Cube Insights powered by ETR. Thanks for watching and we'll see you next time on "Breaking Analysis."
(soft upbeat music)