
The text below is a transcript of audio from Episode 28 of Onward, "Investing in AI, with Tomasz Tunguz."
Disclaimer: This transcript has been automatically generated and may not be 100% accurate. While we have worked to ensure the accuracy of the transcript, it is possible that errors or omissions may occur. This transcript is provided for informational purposes only and should not be relied upon as a substitute for the original audio content. Any discrepancies or errors in the transcript should be brought to our attention so that we can make corrections as necessary.
---
Ben Miller:
Hello and welcome to Onward, the Fundrise Podcast, where you'll hear in-depth conversations about the big trends affecting the US and global economies. My name is Ben Miller. I am CEO and co-founder of Fundrise. My guest today is Tomasz Tunguz, one of the greatest VC investors of the next generation of venture capital. After getting a degree in engineering and working at Google, he joined Redpoint Ventures, a storied venture investor, where Tomasz then spent the next 15 years backing some of the world's top startups.
Tomasz recently founded his own venture fund called Theory Ventures, raising $230 million for his first fund. We believe he's one of the greatest rising stars in venture capital today. The Fundrise Innovation Fund invested $5 million in his new fund. Before we get started, I want to remind you that this podcast is not investment advice and is intended for informational and entertainment purposes only. Tomasz, welcome to Onward.
Tomasz Tunguz:
Thrilled to be here. Thanks for inviting me, Ben.
Ben Miller:
I have a lot of topics I want to cover ranging from AI to venture capital, so let's jump right in here.
Tomasz Tunguz:
Let's do it.
Ben Miller:
Most people outside of tech aren't going to be that familiar with you, even though inside the tech industry you are very well-known. You've been in tech for almost 20 years, and you were in venture capital for about 16 of those years. You've identified and backed seven startups that went on to become billion-dollar-plus unicorns. That's pretty good.
Tomasz Tunguz:
I've been very lucky to partner with some great founders.
Ben Miller:
I'm interested in some of the stories. Let's talk about venture capital for a bit and let's talk about AI because you know a lot about both.
Tomasz Tunguz:
Okay.
Ben Miller:
I'm interested in looking back at what's changed over time and how venture capital is really very different now than it was when you started in the industry. How do you think about the biggest changes to venture capital and why did those changes happen?
Tomasz Tunguz:
Great question. The venture capital asset class has grown from about 8 billion to about 250 to 300 billion in 12 years. That's a pretty huge change, and about 80% of the increase is from what you would call non-traditional venture capital. What we mean by that is venture capital grew up as an industry where it was three or four people around the table saying, you put in a little bit of money, I'll put in a bit of money, and that led to a partnership. Those partnerships have been formalized and now we have the equivalent of investing corporations.
That's been a really big change. Just the scale of the industry has grown enormously. I would say there's two major drivers of that. The first is the Fed and quantitative easing, just many, many more dollars. As investors chase yield in longer-duration assets, private equity, which is what venture capital is, has really good returns over the long term. And in a zero interest rate environment, it's a pretty good place to put money. The second dynamic that's happened is the cost to go public for a startup has increased meaningfully.
Amazon went public after three years. Microsoft went public very, very early on. I think four years after its founding. Today, the average startup goes public after 12 years. And when it goes public, it spends about 10% of its revenue on the costs to go public. A software company typically goes public at about 150 million in revenue, and it'll spend something like 10 or 15 million on lawyer fees and audit fees and accounting fees. As the costs have gone up, the total number of publicly traded software companies has actually gone down with time.
What's happened is instead of going public at year four, we're now going public at year 12. Well, the money to finance that business to help it grow for those eight years has to come from someplace. It comes from the private markets now where it didn't 20 years ago.
Ben Miller:
Wow. Well, the second part speaks to the reason why Fundrise exists, because then all these people who are not investing through institutional channels or who aren't super high net worth investors are not accessing those tech companies. Maybe I'll be interested in some thoughts around that down the road. But the way I think about venture returns at the really high level is you have these explosive waves that have hit the technology industry every 10 or 15 years.
You go way back: the mainframe, the PC, the server. Then you go from the PC to internet, from internet to mobile, mobile to cloud, and now cloud to AI. You've lived through some of those, more of them as a consumer, and then some of those as an investor. How do you think about those waves and playing within those waves, or do you think about it a different way?
Tomasz Tunguz:
I think you're absolutely right. You do have these large secular waves, and those waves are the oxygen for venture capital because everything changes. When an ecosystem changes, when software buyer or infrastructure buyer behaviors change, startups can thrive. Because an ecosystem that's long-dated, where the existing winners are known (no one gets fired for buying IBM, that old expression), that's the domain of the incumbent. That's the domain of the big public companies who have more capital, better relationships, great distribution.
When there's something new, it's the young startups who don't have anything to defend, who aren't subject to that innovator's dilemma, they're the ones who figure it out first. It was the startups who built the first native mobile apps. It was Uber and Foursquare who created these applications that actually used location on the mobile phone, and it was OpenAI, a startup, that basically commercialized a lot of the technology that was coming out of some of these big companies.
What ends up happening? There's this wonderful woman named Carlota Perez who has this great book on innovation cycles, and she cuts them into two, and I'm altering this a little bit. She would argue that these cycles happen over the course of 50 to 150 years, and venture capitalists have taken her theory and said, "No, we're going to apply this to 15 years." With that caveat, I'll say what ends up happening is there's two phases. There's an infrastructure phase where a lot of money is spent on connecting everybody to the internet, and you have all these big companies.
Qwest was a big one. And then after that infrastructure phase is complete and everybody's connected or everybody has new AI chips, then there's this commercialization phase effectively. The commercialization phase is when software is built and consumers start using it at scale. It's important at a high level to think about where we are. Crypto, I would say, or Web3, that's gone through an infrastructure phase. We have new kinds of databases, new ways of moving money. In crypto, we're basically still in the infrastructure phase.
What that means is we've connected computers in a novel way and we're still in a place where we haven't figured out exactly what to do with it yet. In AI, Google and other companies have been developing AI for an awfully long time, and we're at a place where a lot of machine learning's been commercialized. Look at call transcription if you're a salesperson. Zoom will listen to you and record it, or we've all been on call trees where you dial into AT&T and they'll help you navigate. There's some AI there.
We're in this next phase of commercialization now where these new technologies are being rapidly adopted and changing the way that we work. You're either in the infrastructure phase or in a commercialization phase. Understanding which phase you're in should inform where you invest. What I mean by that is when you're in infrastructure phase, you really want to solve an end-to-end problem for a user because most buyers don't understand how to leverage the technology or what to do with it.
You need to show them, okay, what is the ultimate value of this technology? And then when you're in a commercialization phase, what ends up happening is buyers have used the product or used the new technology a lot and they say, "I want to modify this particular layer of the stack or this particular technology in order to suit my business better." Bundling and unbundling is another way of thinking about it. You want to bundle in the infrastructure phase and then unbundle in the commercialization phase.
Ben Miller:
The cloud went through an infrastructure phase. The cloud now, you're saying, is in the commercialization phase? I'm trying to give some examples to help people imagine what you're saying.
Tomasz Tunguz:
Yeah, so let's make an example of it. The cloud infrastructure phase, look at Salesforce. When Salesforce started, no one was in the cloud. The clouds were very small. Most enterprise buyers were worried about their security, and they wanted to manage their own software on their own infrastructure. It took about 15 years, and now we're at a place where about 35% of all spend is on the cloud. That's a lot of money. About $500 billion is spent on cloud today, up from zero 15 years ago.
But in order for that to happen, Google and Amazon and Microsoft needed to spend a lot of money to build out that infrastructure, and they did that over the course of the last 15 years. Then we saw this wave of all these different software companies being built on the cloud. The Zendesk IPO or Elastic's IPO or Confluent's IPO, those are all cloud companies. Snowflake, fastest growing enterprise software company in history, built on the cloud, built on AWS and others.
We've transitioned from the clouds building up the infrastructure to companies building on top of it. Even Netflix, you could argue, was built because Amazon existed. Now we're in a phase where people are just making money on top of those clouds. The cloud is broadly accepted as being the standard, which wasn't the case 15 years ago.
Ben Miller:
I think I read something you wrote that basically you said the cloud providers were about a $3 trillion market cap, and then the applications on top of the cloud are also about $3 trillion.
Tomasz Tunguz:
That's exactly right. But there are only three clouds, and then there are the top 100 B2C and B2B apps. But if you were to rewind, take those numbers and rewind say 10 years, maybe those cloud infrastructure providers were worth one and a half trillion or 1 trillion. The cloud application companies would've been worth a lot less, but today they're the same.
Ben Miller:
That analogy is a good one as you think about AI. Could you walk us through AI as infrastructure and then AI as application?
Tomasz Tunguz:
Anytime you're building software, there's three layers, the three layer cake. There's what we call infrastructure, and that's literally the chips. And then there's memory and the hard drives and the networks and how we connect the computers together. The next layer is the platform. The platform is where developers spend all of their time. That's where they write code to build software that we use. And then the software itself is the application layer. That's true in classic software as much as AI.
In the last say two years, most of the innovation in AI has happened at the infrastructure layer. Nvidia has had one of the most incredible runs in the public markets in history because they have chips that everybody needs to use, and you need them for two different reasons. One is to train a model, so make a machine learn how to do something, and then the second is when you ask the machine to do something, it actually does it. One is training, the other one's called inference. A lot of the innovation has been there.
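To make the training-versus-inference split concrete, here is a minimal, illustrative sketch in Python: a toy one-parameter model is "trained" by gradient descent (the expensive, do-it-once phase) and then used for "inference" on a new input (the cheap, per-request phase). The data and model are made up for illustration; real model training differs enormously in scale.

```python
# Toy illustration of the training-vs-inference split described above.
# Training: adjust a model's parameters against known examples (expensive, done once).
# Inference: use the frozen parameters to answer new requests (done on every query).
# This fits y = w * x with gradient descent; real LLM training is vastly larger.

def train(examples, steps=1000, lr=0.01):
    """Learn a single weight w so that w * x approximates y."""
    w = 0.0
    for _ in range(steps):
        for x, y in examples:
            error = w * x - y
            w -= lr * error * x  # gradient of squared error with respect to w
    return w

def infer(w, x):
    """Inference: apply the already-trained weight to a new input."""
    return w * x

if __name__ == "__main__":
    data = [(1, 2.1), (2, 3.9), (3, 6.2)]   # roughly y = 2x
    weight = train(data)                     # the costly phase
    print(f"learned w = {weight:.2f}")
    print(f"prediction for x=10: {infer(weight, 10):.1f}")  # the per-request phase
```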
Ben Miller:
You're really good at explaining this. You said one is training, the other one's inference.
Tomasz Tunguz:
That's right. Inference, yeah. Historically, most of the money has been spent in training. You look at the first OpenAI models or the models coming out of Google or Facebook, those models might cost 50 to $500 million just to train, which is a huge capital expense. And that's a combination of buying the chips, which are called GPUs, graphics processing units, and the power, the electricity and the cooling for these things, because they consume huge amounts of electricity.
And then the second part is every time you ask ChatGPT a question, that uses 10 times as much energy as a regular Google search, according to Google.
Ben Miller:
10 times as much compute.
Tomasz Tunguz:
10 times as much compute to do an LLM-based search as a standard Google search. Now, Google has spent 20 years optimizing the hardware and the networking and all this stuff to drive that cost down, but still, Google and Amazon and Microsoft can't just switch all those searches over from classic search to LLM search because there aren't enough computers, there isn't enough electricity in the data centers.
This is why Nvidia stock is going up so much because all of these cloud companies are saying, "We need more and more GPUs. Because as the usage goes up, we need to be able to serve them."
Ben Miller:
We're moving from the infrastructure phase to the application phase. We've definitely been in an infrastructure phase of AI. Take us forward. What does the transition look like? What's happening?
Tomasz Tunguz:
Okay, so now we have this new layer of infrastructure and most software engineers have not written software on this infrastructure before. There's a very small group of machine learning engineers who are specialists in these fields who've played around with these models for two decades, but a regular software engineer, of which there are about 25 million who build websites, would really love to use this technology, but didn't go to grad school to study machine learning.
Now there's a wave of companies that are simplifying or building abstractions or layers on top, where if you are a great front end software engineer, an engineer who builds websites or builds web applications, you can start to use these products. You can start to have predictive text in the text box in the CRM, or you can start to listen to phone calls, summarize the output of a phone call, and create action items out of a meeting. That's what's happening at the platform layer.
At the same time, we're starting to see the very first applications at the application layer. The big innovation here, the big change, is if you think about an enterprise, like a big company, that company has a lot of data. The data's growing at about 30% per year. Maybe 15 to 20% of that data is structured. The way you can think about structured data is you could open up an Excel spreadsheet and you could see it. But most of the data is unstructured: it's in text files and images and videos, all over the place.
The big wave in the last 20 years has been making money by selling tools for the structured data. Snowflake and 10 others like Looker and the BI products. The next wave is about that unstructured information. How do you make value from that? These machine learning models are supreme at that task. The way we describe it is you're fracking your data.
Historically, it's been very expensive to get value from that unstructured information, and now we've figured out a way of mining that data to get value from it, where you can take 100 different meeting notes and then summarize them and put them into a memo. That would've been impossible without a lot of humans even 24 months ago or 12 months ago.
Ben Miller:
Oh man. We're up to the heart of it here. I'm going to try to go over that again and parse it some more. The entire data industry talks about structured versus unstructured data, and the bottom line is structured data means it's in a table, a spreadsheet, essentially rows and columns, that's structured data. And then unstructured data are like the conversation you and I are having.
It's words. It's oral. It's written, images, pictures. Basically anything that's in your computer or hard drive that's not in a spreadsheet. Even a spreadsheet is often not very well-structured data. It's just an Excel file, difficult data to actually use.
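A small illustration of the distinction: the same facts expressed as structured rows (easy to filter and sum with ordinary code) versus as free text, which a classic program can't easily query. The restaurant and order details below are invented for the example.

```python
# The same information, structured vs unstructured (illustrative example only).

# Structured: rows and columns, trivially filterable and aggregatable.
orders = [
    {"restaurant": "Ben's Place", "item": "tomatoes", "quantity_kg": 40, "week": "2023-W31"},
    {"restaurant": "Ben's Place", "item": "onions",   "quantity_kg": 12, "week": "2023-W31"},
]
tomato_kg = sum(o["quantity_kg"] for o in orders if o["item"] == "tomatoes")
print(f"tomatoes ordered: {tomato_kg} kg")

# Unstructured: the same facts buried in free text; a classic program can't easily
# answer "how many kilos of tomatoes?" without a model that understands language.
note = (
    "Called Ben's Place this week. They want their usual forty kilos of tomatoes "
    "plus a dozen kilos of onions before Friday's dinner rush."
)
print(len(note.split()), "words of text, zero ready-made columns")
```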
Tomasz Tunguz:
That's right. Yeah, they can be a big mess, just a whole bunch of text. Now we're in a place where these machine learning models are a big advance. You can take a legal document and a computer can understand it. You can take a series of videos, the self-driving car wave, and you can identify when there's a pedestrian. That's unstructured data. You can take music, you can take art, and create new art from it. This is the really big wave where computers can take a large amount of videos or art and then learn different parts about it and try to replicate it.
We just met a weather forecasting company. We didn't know much about this industry, but today you can think about it, okay, cut the atmosphere in half. The data for the top half, when we gather weather data, comes from satellites, and the data for the bottom half comes from weather balloons we send up and sensors on airplanes that take off. We take all of that data and put it together. It takes about six hours to get that data. And then we send it to two supercomputers that the government operates and they use math in order to predict the weather.
Well, all that entire process takes about 12 hours today. The computing takes a really long time on some fancy computers because they're literally calculating, okay, this pocket of air is moving this way and then this way and this way. It's a branch of math called partial differential equations, which doesn't matter, but it's really complicated. The new way of doing it with these large language models is you don't tell the computer anything about the physics of temperature or pressure or wind or rain or snow.
You just give it a bunch of data and then it figures it out. The net result is there are startups now that are building weather prediction systems that are 10,000 to 40,000 times more accurate. And instead of taking 12 hours to produce those forecasts, it's six hours plus a couple of minutes. They've basically halved the time of those forecasts. And instead of predicting 10 days, they can get to 15 days with a greater range of accuracy. Think about it, it's a huge amount of unstructured information.
You can take weather reports off Facebook. If it's in Florida, [inaudible 00:17:56] because of the hurricane that's there. You can take structured information, which is just the temperature within a particular region, or wind information off a bunch of sensors on planes. Put it all together, mix it up into a big soup, and then it turns out to be something incredible at the end because of these machine learning models.
Ben Miller:
There's so many directions I want to take this conversation at the moment. How do you think about the patterns you're looking for and the opportunities that are interesting? Because you just said weather, that's probably a multi-billion dollar opportunity. AI, it touches so many things. What's your approach to trying to figure out where to focus on the greatest opportunity?
Tomasz Tunguz:
Let's put a little history here. We talked about the cloud. In the last 15 years, if you were a cloud company, a startup, what you would do is you would find a piece of a bigger suite, a piece of a bigger product and make a really incredible single workflow. You would take a look at SAP. I'll take out the accounts payable module, so bill payments module, and I'll make a great billing product, or I'll look at Siebel and I'll make... This is the way that Salesforce started. They didn't build a CRM to start.
They said we're building sales force automation, which means there's one very particular narrow workflow, and from there they grew. That's the way that they took on the incumbents. That's the way they took on the Oracle and Siebel of the day. That is not the way that this wave will play out. What I mean by that is I don't think you can go after Microsoft today and compete with them on Microsoft Word using a large language model, because Microsoft's already there. In a lot of the waves that we talked about in the past, startups were the ones to figure it out first.
In this wave, startups are actually on their back foot. They're behind. Microsoft is releasing updated versions of PowerPoint. GitHub, which they own, which is where developers code or at least save their code, they released a product that can code on behalf of developers. Now, half of code that developers commit at GitHub is machine generated. You can't run that same playbook, you can't run that same strategy that we've had where you go after a particular niche of an incumbent and compete because it's very likely the incumbent's already there.
The main reason the incumbents are already there is after they've developed this technology, it's actually relatively easy to get to a pretty good solution. We've built machine learning apps at Theory, and a lot of the time it's 10 or 15 lines of code because these models are so powerful that they figure out a lot of things on their own. If that's not the right playbook, then the question is, well, what is the playbook? We were just talking about how there's a shift from structured to unstructured data.
We think that's where the opportunity is. A couple of the big waves for the first startups, one is legal. All of the lawyers' information is unstructured. It's in Word docs. We've seen five or 10 different companies that are helping lawyers draft different kinds of agreements based on previous agreements, which makes sense, and that will change that industry. Weather would be another example, where the incumbents aren't there. There's a lot of unstructured information where you can pull data that you couldn't have before and analyze it in a useful way.
We met one company that... I'll make one last point and then I'll pause, but the real power I think comes from chaining these models together. We met one company that sells to restaurant distributors. If you go to a restaurant, they've got potatoes and green beans and tomatoes, and they buy that from another company. A distributor aggregates all of them and sells it to the restaurant. This company, they sell to the restaurant distributor. They crawl the internet for the menus of restaurants.
They use AI to say, okay, what's in this tomato soup? Well, there's a potato and there's a tomato and there's an onion, and they aggregate that across the menu. They figure out, okay, this particular restaurant uses a lot of tomatoes. And then they look at it across an entire neighborhood and it's like, well, turns out this particular part of Washington, DC, they consume a huge amount of tomatoes. They take that information. It goes to the procurement team. The procurement team at the distributor then buys tomatoes in bulk and can offer them at a 30% discount.
And then that goes back to the sales team and the sales team calls Ben's restaurant and says, "Ben, we can offer you tomatoes at a 30% discount if you switch to us." They record that phone call, and then all that data just goes back and spins and spins and spins, and it's meant to make the sales team more efficient. What you've done there is you have a crawling model that crawls the internet and figures out the recipes, that's one model. And then there's a call listening model, and then maybe eventually there's a procurement model.
Now you've got three different types of machine learning in one single product, and the combination of those three things really could unlock a ton of efficiency.
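A hypothetical sketch of that "chained models" shape. Every function below is a stand-in: in a real product each step would be its own crawler, machine learning model, or CRM integration, and the restaurants and menus are invented for illustration.

```python
# A hypothetical sketch of chaining models, in the spirit of the restaurant-distributor
# example above. Each function is a placeholder for a crawler or ML model.

from collections import Counter

def crawl_menus(neighborhood):
    """Stand-in for a web crawler; returns raw menu text per restaurant."""
    return {
        "Ben's Bistro": "tomato soup, caprese salad, margherita pizza",
        "Corner Cafe": "tomato bisque, BLT, grilled cheese",
    }

def extract_ingredients(menu_text):
    """Stand-in for an ML model that infers ingredients from dish names."""
    ingredients = []
    for dish in menu_text.split(","):
        if any(word in dish for word in ("tomato", "caprese", "margherita", "BLT")):
            ingredients.append("tomatoes")
    return ingredients

def aggregate_demand(neighborhood):
    """Chain model 1 (crawl) into model 2 (extract) and tally the output."""
    demand = Counter()
    for restaurant, menu in crawl_menus(neighborhood).items():
        demand.update(extract_ingredients(menu))
    return demand

if __name__ == "__main__":
    demand = aggregate_demand("Washington, DC")
    # The chain's output would feed the next steps: procurement buys in bulk,
    # sales calls restaurants with the discount, and call notes feed back in.
    print(demand.most_common())
```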
Ben Miller:
What you're focused on is the substance of AI. You really know the difference between reality and vaporware, but there's so much hype right now in AI. The amount of hype is just extraordinary. It seems like there's two ways to play it: figuring out where the value will accrue, or selling picks and shovels.
Tomasz Tunguz:
Whenever you have a big wave, there's lots of excess capital that comes. I think if you're a business owner today, you need to ask the same question as you've always asked, which is how much time will this save me, or how much more money can I make as a result of this software?
I think it's easy to look at AI-enabled software and forget some of those principles because the technology seems really cool. But in the end, there are only two reasons people buy software. Either you make more money from it or you save costs. That's the way that we think about it. At the highest level, that's the question that we're trying to answer when we look at these startups.
Ben Miller:
Your new venture company is called Theory, and it seems like you actually have those two approaches in your theory. The data infrastructure is the picks and shovels, because you basically have to have good data infrastructure in order to do AI, and then you have AI. Your whole theory of Theory is that these matter. Maybe we could talk about those two theses because I think they really help to understand what matters.
Tomasz Tunguz:
We talked about the three layers in the layer cake. Let's simplify and just talk about two. We'll talk about infrastructure and then we'll talk about the applications. We said in the last wave of cloud, there's three products that are worth about a couple trillion, and then there are 100 products that are worth a couple trillion. We think the same thing happens in AI. Because if you're building chips or you're offering infrastructure, you need hundreds of millions, if not billions of dollars to do that effectively.
Very few companies can do that. A lot of the big publicly traded clouds, they're already there. Google has been making dedicated chips called TPUs for a while. Amazon has their own. Facebook has their own. That's a tough place to play. The other reason it's a tough place to invest is the number of acquirers is actually surprisingly limited. Let's say you're building an infrastructure company. Well, Google has invested 500 million in Anthropic, so they've picked a partner. Microsoft has invested 10 billion in OpenAI.
They've picked a partner. Oracle has invested a couple hundred million dollars into Cohere. You can keep going. Databricks bought Mosaic. You can keep going down the list. Apple's unlikely to buy. Netflix is unlikely to buy. Facebook has an incredible team, unlikely to buy. The number of very, very large acquirers is actually smaller than you might think or might anticipate at the beginning. The other dynamic here is that the federal government's been much more aggressive on anti-monopolistic acquisitions.
Google and Amazon and Facebook are under much more scrutiny when it comes to these acquisitions. That may also limit the total value of some of these sales, or it may force longer term partnerships as opposed to total acquisitions. Those are some of the reasons why I think investing at the infrastructure layer is challenged. It's not to say that there won't be spectacular outcomes there, it's just I am not yet convinced that it's broad-based.
At the application layer, on the other hand, we talked about some of these vertical or niche or industry specific applications, and there the surface area to innovate is enormous. It's hard to imagine. There's sewer inspection companies using machine learning in order to understand the health of sewers. It boggles the mind what you can do with some of this data.
We're spending a lot more time at the application layer and much more on the use cases of machine learning that are not immediately apparent, because the big horizontal companies are the ones that will tackle those first order opportunities. They have the Microsoft Words and the Google Docs of the world, and they'll inject those with LLMs. We met one company, for example, that was doing optimization.
If you're building a commercial building, you need to route electrical cables through the building, and typically that costs about one to $2 million to do with a human. A large language model or a new machine learning model can do that for half or a third of the price. Who would've thought to build a company starting in that space? I would think of that as architecture because I know about this a little bit.
Ben Miller:
You have basically your subs: civil, electrical, MEP. MEP is the most complicated part of architecture. What you're saying is basically the MEP of the architecture of a building, which could be a significant cost, could be done with software. I can imagine that totally. There are rules: if it's high voltage cable versus low voltage cable, is it managed by a union electrician? How do you have to route it? How do you have to send it? This is your domain more than it is mine.
Tomasz Tunguz:
The whole idea of how permitting works is you hand your paper in to some government official and they look at it and they show up and they check to see if it meets the codes. That's exactly what software can do. You have rules. Does the thing match the rules? The built environment is unstructured data.
If you can make the built environment structured data with these new machine learning applications, you can basically have instantaneous permitting in real time as you build. It'd be amazing. It's not to say there won't still be a role for a human. It's just that a lot of the rote and repetitive work is better suited to a computer and the way they think.
Ben Miller:
When they invented the computer and they entered the office space, it changed how people worked. The pools of typists who sat in the middle of the office typing went away, but we ended up with way more workers and way more productivity.
Tomasz Tunguz:
Exactly.
Ben Miller:
I'm interested in two horizons. Everybody talks about the super long horizon. I'm interested in your thoughts around the 10 to 15 year horizon, but I'm also interested in the short-term horizon, five years, because at the moment the economy's on pins and needles. AI is clearly the most positive driver, but I have trouble seeing it impact the economy in the immediate term. How do you think about the impact people will see over a five year or less timeframe?
Tomasz Tunguz:
Great question. One of the questions we asked ourselves was, which will have more impact on US productivity, personal computer and the internet or machine learning? We pulled together a bunch of different research reports and it turns out that the personal computer did not grow productivity that much. Very, very little impact to GDP. Still significant, but not huge. The initial estimates, which are probably wildly over optimistic, say that the machine learning companies will produce a thousand times more lift to GDP than the personal computer. Whatever.
There's a lot of excitement over LLMs, and those reports base their stats on this idea that for white collar workers, about 25 to 40% of their work will be automated in some form. The first example would be the software engineer, and we talked about this a bit before, half of the code that software engineers write that's committed to GitHub is generated by a machine. And that code is standard code, things that you might see at the top of every file. You probably have a template for a memo.
Well, engineers have a similar notion of templates. A lot of that rote behavior is automated, and whether it actually saves them 50% of their time we can debate, but maybe it saves them 10% of their time. That doesn't mean engineering workforces will decrease by 10%. If anything, as we just talked about, I think they'll continue to grow in size because you can just achieve so much more. I think in the next five years we will see a modest improvement in productivity.
I think if you're a lawyer, you're probably already seeing it if you're on these modern platforms. If you are an author or a script writer in the movie business, you're definitely feeling this right now, which is why you're on strike. If you're a voice actor, with 60 seconds... We met a startup where you record your voice for 60 seconds, and then they can take my blog posts for the book, narrate them in my voice, and the computer will take 30 minutes in order to produce that audio track. There's some industries that will be hit very, very quickly, impacted in a way, and that's a big productivity boost.
On the other hand, let's talk about self-driving cars for a second. Self-driving cars and the promise of that, that's been around for a long time, and that serves as a really great example of the S-curve that you see with machine learning models. What I mean by that is for a long time, progress is flat, and then we go through this period where it just seems to be improving so, so, so much faster and then we hit a plateau again. We've been on this curve for generative AI or machine learning in the last couple of years.
At some point it will plateau. What that means is you'll get to a 75 to 80% solution, but that margin of 10 to 15% to get to a 95% solution, that might take 20 years, which is what we're seeing with self-driving cars and self-driving trucks. It's hard to extrapolate at the beginning how fast things will change when we're in it. The cognitive bias is to continue to extrapolate that things will change at the rate that they are, but inevitably the slope will flatten. That's why I think there's a handful of industries here where we'll see some pretty significant change upfront.
And then the accuracy is going to matter. If you're an attorney, well, the non-disclosure clause has to be exactly the right one. If you're a general contractor and you're placing the electrical cable, the difference between 90 and 95% accuracy, maybe it does matter, maybe it doesn't matter. The categories where a 90% solution or an 80% solution is good enough, that's where you'll see the most impact in the short term.
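The S-curve described a moment earlier is commonly modeled with a logistic function: flat at first, then rapid improvement, then a plateau. A minimal sketch with made-up parameters (the ceiling, midpoint, and steepness below are illustrative, not a forecast of any real technology):

```python
# Illustrative S-curve: capability is flat, then climbs quickly, then plateaus.
import math

def logistic(t, ceiling=0.95, midpoint=5.0, steepness=1.2):
    """Capability as a share of the ceiling at time t (years); toy parameters."""
    return ceiling / (1.0 + math.exp(-steepness * (t - midpoint)))

for year in range(0, 13, 2):
    capability = logistic(year)
    bar = "#" * int(capability * 40)
    print(f"year {year:2d}: {capability:5.1%} {bar}")
```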
Ben Miller:
Let me just pause for a second because we keep saying this and I just want to then actually have an explanation. You said LLM, large language model. Can you explain, if you can, essentially how people should understand what an LLM is?
Tomasz Tunguz:
AI, at the highest level, there are four different kinds. The first are called classifiers. Is this a dog or is this a cat? Google made a lot of money with classifiers. The next one is time series prediction. That's like, will this stock go up or will this stock go down? Renaissance Technologies, all the high frequency traders, made a lot of money with time series prediction. The third one is what's called NLP, natural language processing. When everyone had Siri on their phones, that was an example of NLP.
And then now we're in a place with these things called generative AI, which is when I learned in grad school about these things, they were called neural networks. It's a different way of training a computer to do a task. At the very basic level, at the very, very most basic level, if you use these models on your computer, you can actually see them doing this. What it does is you type in a word, let's ask a question, who is the President of the United States? It will go and try to figure that out.
It's called generative because what it does is it starts with one word and then it decides what is the highest probability word that should follow that first word. It's like a crystal. It starts with a little kernel, and then it adds to it. And then after it's added the second word, what is the highest probability third word? What is the highest probability fourth word? It's the same with an image generation model.
If I ask Midjourney, I can type in and say, "Produce for me an image of a firetruck in the style of Salvador Dali," it will take those words and it will start with a kernel, and then it will add a little bit, and add a little bit, and add a little bit, and add a little bit. At the end, I'll have my image of a firetruck. That's how these models work at their core. That's the big innovation. Historically, they haven't worked because we haven't had enough data and we haven't had computers that are powerful enough to do this in a way where the outputs are good.
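A toy version of that loop: start with a seed word, repeatedly pick the highest-probability next word, append it, and ask again. The probability table below is invented, and real LLMs use a neural network over tokens and usually sample with some randomness (the "stochastic" part), but the generate-one-piece-at-a-time loop is the same idea.

```python
# Toy greedy next-word generation: start with a seed, repeatedly append the most
# probable continuation. The probabilities here are made up for illustration.

next_word_probs = {
    "the":       {"president": 0.5, "united": 0.3, "answer": 0.2},
    "president": {"of": 0.9, "said": 0.1},
    "of":        {"united": 0.6, "the": 0.4},
    "united":    {"states": 0.95, "kingdom": 0.05},
    "states":    {"is": 0.6, "was": 0.4},
}

def generate(seed, max_words=8):
    words = [seed]
    while len(words) < max_words:
        choices = next_word_probs.get(words[-1])
        if not choices:
            break
        # Greedy decoding: take the single highest-probability next word.
        words.append(max(choices, key=choices.get))
    return " ".join(words)

print(generate("the"))  # -> the president of the united states is
```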
Ben Miller:
I think of it as like a giant pattern matching machine. You can think of lots and lots of pictures as basically pattern matching, pictures of people's faces. You can say, okay, that's pattern matching. The language is also pattern matching.
Tomasz Tunguz:
Yeah. There's this very famous critique of these models where they call them stochastic parrots. What does that mean? A parrot is just something that repeats what it's heard. Stochastic means you introduce a little bit of chaos into it. It just means you change it and mix it up. That's true. I mean, you can think about these models as computers that just repeat what they've seen elsewhere and add a little bit of chaos into it, let's say. But I think what people forget is in the 1950s when we were first starting to play with computers, we created these very simple programs.
What we noticed is if you have very simple programs, you can have very complex behavior that emerges from a five line program. Those were called automata. There was a guy named von Neumann who created those. I think that's what we're seeing, where we start with a very simple premise of let's predict the next word, let's predict the next word. And then as we do it at bigger and bigger scale, you get these beautifully complicated outputs that you couldn't have anticipated before.
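A classic few-lines-of-code example of that phenomenon is an elementary cellular automaton such as Rule 30. This is not von Neumann's exact construction (his automata were self-replicating and more elaborate), but it is the standard small demonstration of a trivial update rule producing output that looks anything but simple.

```python
# Rule 30 elementary cellular automaton: each cell's next state depends only on
# itself and its two neighbors, yet the pattern that emerges is famously complex.

RULE = 30  # the rule number encodes the update table in its binary digits

def step(cells):
    n = len(cells)
    return [
        (RULE >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

row = [0] * 31 + [1] + [0] * 31  # start from a single "on" cell in the middle
for _ in range(24):
    print("".join("#" if c else " " for c in row))
    row = step(row)
```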
Ben Miller:
Yeah, go down to the level of the microchip. All a program is underneath is a zero, a one, a gate, and you multiply that by billions, trillions, I don't know at this point, at the scale of distributed computing.
Tomasz Tunguz:
It's either on or it's off.
Ben Miller:
There's a really good piece by Herbert Simon. I don't know if you've read it. It's a theory of complexity and it's about hierarchy. There's hierarchy in the universe: atoms stack into molecules, and molecules stack into amino acids, and there's hierarchy and structure. What we do is we use a hierarchy to create simplicities and abstractions. Those simplicities and abstractions allow us to basically understand a complex world, and that's what the machines are doing as well with language or imagery.
Just because the underlying mechanics are simple doesn't mean that the outcomes are not going to be beautifully complicated.
Tomasz Tunguz:
Because that's the Fibonacci sequence: one plus one is two, and you create these swirls. We see that symmetry in nature everywhere, whether it's sunflower seeds or the shape of a snail. You can start with a very simple rule, which is just add the two numbers that came before and keep going, and up comes this beautifully complex behavior. Very well said.
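The rule is tiny in code as well; as an illustration, the ratio of successive Fibonacci numbers converges to the golden ratio, roughly 1.618, the constant behind those spirals.

```python
# Fibonacci: each term is the sum of the two before it. The ratio of successive
# terms converges to the golden ratio (~1.618).

a, b = 1, 1
for _ in range(12):
    print(f"{b:4d}  ratio to previous: {b / a:.5f}")
    a, b = b, a + b
```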
Ben Miller:
I want to just talk briefly about your decade of data theory, because we are really focused on investing into that, because I think it doesn't have as much of the speculation in a way. You need data observability and data transformations and things like that; they are basically the underlying picks and shovels, as I said, of the AI revolution. It's not exactly how I think you've described it, but it's I think a really powerful thesis and I'd love to hear a little more about it.
Tomasz Tunguz:
I've been writing and thinking about this decade of data for a long time, and I first saw the power of it at Google. Google was a company that built systems that capitalized on getting as much data as they possibly could. What they discovered is for these machine learning algorithms, the data is actually more important than the innovations. Since then, a lot of the technologies that were inside of Google have come out and been commercialized.
It's not just Google that's now making money and improving their business on data; almost every business in the world now relies on data in some form. I remember we traveled to India in 2008 with Google and we went to a fishing village. The mobile phones were all text; they were all feature phones, not iPhones. The fishermen were optimizing which ports they would go to to sell their fish because they were getting market prices by the hour. If I have this particular kind of fish, I'll get more here versus there.
That's just an example of how even in very, very remote areas, I think data is really impacting the world and how we conduct business. This decade of data is, okay, how do we take some of these innovations from these big companies and package them up in ways where fishermen in India can use them just as much as a manufacturing company in Cleveland? There are parts to it. Just the way we talked about the three layer cake for AI, there's a stack, there's a series of different products that you would use.
In this world of systems that produce data, there are systems that take that data and put it into a big database where you can ask questions of it. And then there are systems that allow you to visualize and analyze that data at the end. At a high level, we've been spending a lot of time there. In addition, there's security systems and making sure data is flowing and understanding how performant those systems are. In terms of market cap, it's probably four or $500 billion worth of value today.
As we talked about before, only a third of infrastructure or of IT spend globally has moved to the cloud. That includes Snowflake and Databricks and all those companies. We think the terminal penetration is about two-thirds, so ultimately about two-thirds of all IT spend moves to the cloud, which means over the next 10 years you're going to see another set of Snowflakes and Databricks and all the big companies that have been created so far.
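A miniature, illustrative version of that stack: a system produces data, the data lands in a database you can ask questions of, and then you analyze it. Python's built-in sqlite3 stands in here for a cloud warehouse like Snowflake, and the fish-price feed is invented, echoing the fishing village example above.

```python
# Miniature data stack: produce data, load it into a queryable database, analyze it.
import sqlite3

# 1. A "system that produces data" (here, just a hard-coded price feed).
events = [
    ("2024-01-02", "port_a", "mackerel", 140.0),
    ("2024-01-02", "port_b", "mackerel", 155.0),
    ("2024-01-03", "port_a", "tuna",     410.0),
    ("2024-01-03", "port_b", "tuna",     395.0),
]

# 2. Load it into a database you can ask questions of.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE prices (day TEXT, port TEXT, fish TEXT, price REAL)")
con.executemany("INSERT INTO prices VALUES (?, ?, ?, ?)", events)

# 3. Analyze: which port pays best for each fish (the fishermen's question)?
query = """
    SELECT fish, port, MAX(price) AS best_price
    FROM prices
    GROUP BY fish
"""
for fish, port, best_price in con.execute(query):
    print(f"sell {fish} at {port}: {best_price}")
```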
Ben Miller:
You touched on it, but I've tried to go deeper on this in terms of how we've communicated about it. My analogy, because data is an abstract idea, what is data? My parallel is the Industrial Revolution created manufacturing. Every single thing that you look at around you right now is manufactured: the table, the desk, the clothes, the computer, cars. Everything is manufactured because the Industrial Revolution created essentially this new low-cost, scaled manufacturing.
Then you have now the virtual world rather than the physical world, and you have to basically create a system of manufacturing virtual goods. If you're making steel, you have to mine the rock. You have to then clean it, separate it. Then you've got to melt it and transform it into something that's useful. Then you add value to it, and that might be turning it into steel girders or a kitchen hood. And then you need to go out and distribute it. And that kitchen hood needs to fit inside of a restaurant or a building.
That all exists in the data business. There's a lot of parallels and obviously the Industrial Revolution, manufacturing revolution created enormous value for society. Essentially now we're going to do that with knowledge and other kinds of virtual goods, and that's what the data infrastructure or decade of data is about.
Tomasz Tunguz:
You nailed it. It's a supply chain. You've got raw goods, raw data here, and you've got to send it to a factory where you can pre-process it and package it up and then get it to the right people in the forms that they care about.
Ben Miller:
It's hard for people, I think, including me, to think about knowledge as a virtual good. We're working on this with real estate. You said in AI, there's all these sectors that are possible. Well, real estate is a sector. Another analogy for me is people used to write letters. The Gutenberg printing press turned letters into structured data: now all of a sudden there's 26 separate letters.
They're now in a structured format. You could then unleash a new kind of production of knowledge. What did the printing press create? The Enlightenment and revolution. That's how exciting what's happening now is.
Tomasz Tunguz:
I totally agree. Another example I think about is 10 years ago, if I wanted to submit a form to somebody, I would fill out a piece of paper and I would mail it. And then what they would do is they would put it into a folder in a file cabinet, where it sat as a piece of paper, not doing anything. Maybe at some point it's useful. And now we use DocuSign, where all that information is captured: okay, how many tomatoes do I want to buy and how much celery do I want, or whatever it is.
And then that goes and feeds somebody else's process. It feeds the procurement team or the sales team. It's that transition. It's taking what used to be in the filing cabinet and actually allowing people across a business to understand what's going on in other parts of the company.
Ben Miller:
Taking it from a physical thing to a virtual thing, and then from a virtual thing into knowledge that's actionable. Because even if it's in my files on my computer, my thousands of files and folders, that information is not actionable today, but it will be because of data infrastructure and ML models. Let me just pan back out now as we get to the end here, because I've heard you talk about this and I think about this all the time, which is we went through a zero interest rate environment.
We went through quantitative easing. They printed $7 trillion. I don't know if you would agree with this, but one of the downstream consequences of that is that the sales and marketing budgets of software companies just absolutely exploded, and they actually have trouble making money. They have trouble actually having profits because their sales and marketing is such a huge expense line item. It's a third or half of their entire expenses. The reason they had to do that is there was so much money, there was just ruthless competition, and they were forced to in a way.
If you have a world where money's now expensive and scarcer, how do you think that's going to affect companies and exits and pricing and the kind of companies you want to be backing?
Tomasz Tunguz:
The next 10 years will not be like the last 10 years. You're right, the cost of customer acquisition for a software company has increased every year. That's a challenge, particularly when valuations have fallen, and that means that startups raise less money. I think what we're looking for are companies who have an efficient go-to-market model. That's been a big change in the venture industry as a whole, where the last 10 years it's been growth at any cost, and now the efficiency metrics matter probably just as much as the growth numbers do.
Let me make this a bit more concrete. Two or three years ago, you could be at an early stage company and you could grow from one to five or one to six. You would go from 1 million in revenue one year to 6 million the next, and that would put you in the top decile of software companies. Absolutely exceptional. It wouldn't matter that you would be burning $10 million that year or $12 million that year because the growth was really fantastic. Those were the most highly sought after companies.
And now what's happened is a lot of businesses are saying, "I'm not going to grow one to five anymore. I'll grow one to three this year, or one to three and a half, but I'll only burn a certain amount. I'll only burn two or 3 million this year instead, or 4 million instead of 10 million." That's a trade. What that's going to do is the total amount of sales and marketing dollars, the total number of salespeople and banner campaigns and conferences and the size of those conferences, they will go down.
It'll be a self-regulating dynamic where, because there's less competition in the market, maybe we'll see some deflation there. The other thing that we're looking for is companies who have found very efficient go-to-market strategies, so product-led growth. How can you use software to educate a buyer, as opposed to paying a salesperson to do it, for a big chunk of your customers? A couple of businesses have done this well. Atlassian has been phenomenal at it. There are many others.
And then the last is to try to compete in vertical categories where there isn't that amount of competition. You're the only one advertising or only one of two selling, as opposed to competing with a Salesforce who can really spend quite a lot of money to win.
Ben Miller:
Here's maybe my last question for you. You had some really interesting successes. I made a list here, Looker, Expensify, Hex, Dremio, Monte Carlo. What's maybe the hardest lesson you've learned? If there's a story or just a meta lesson after 15 years of doing venture?
Tomasz Tunguz:
There are lots of lessons. One is the market is created by startups. It's one thing to look at a market and say, "I think this market is small, or I think this market is really big, or it's more attractive or less attractive for a startup." I think what I've often missed, and what I try to keep in mind, is it's actually the startup that shapes that market and they can change the outcome. They can take a small market and make it into a really big market.
Twilio would be an example of that: who knew that you could have a multi-billion dollar company selling infrastructure for sending text messages? That's a lesson I try to keep very, very close. And that's just because when I make the mistake of not investing in an incredible company, every time I drive down the highway or into San Francisco, I see that business's logo on a billboard or on the buildings. It's a constant reminder of the judgment calls that you make. It keeps you humble.
Ben Miller:
In venture, the deals you missed are the ones that you regret. In real estate, it's deals you did that are the ones you regret.
Tomasz Tunguz:
Yeah, what did Jordan say? You miss all the shots you don't take. It's the same idea.
Ben Miller:
There's a saying in venture, you make 99 investments and the 100th one makes it all. In real estate, you make 99 investments and the 100th one you lose it all.
Tomasz Tunguz:
Wow. What you're telling me is I should not take my venture investing mindset and start a commercial real estate plan.
Ben Miller:
It's different because there's so much leverage, so much debt in the system. But anyways, this was fabulous. Tomasz, I really appreciate it. Awesome. I'm so grateful to be working with you.
Tomasz Tunguz:
The pleasure is all mine. Really thrilled to be here and thanks for the conversation.
Ben Miller:
All right. Onward. You've been listening to Onward, the Fundrise Podcast, featuring Tomasz Tunguz, founder and general partner of Theory Ventures. My name is Ben Miller, CEO of Fundrise. We invite you again to please send your comments and questions to onward@fundrise.com. If you like what you've heard, rate and review us on Apple Podcasts. Be sure to follow us wherever you listen to podcasts. For more information on Fundrise sponsored investment products, including relevant legal disclaimers, check out our show notes. Thanks so much for joining me. We'll see you next episode.