
Show notes
Google faces the greatest innovator's dilemma in history. They invented the Transformer, the breakthrough technology powering every modern AI system from ChatGPT to Claude (and, of course, Gemini). They employed nearly all the top AI talent: Ilya Sutskever, Geoff Hinton, Demis Hassabis, Dario Amodei. More or less everyone who leads modern AI worked at Google circa 2014. They built the best dedicated AI infrastructure (TPUs!) and deployed AI at massive scale years before anyone else. And yet... the launch of ChatGPT in November 2022 caught them completely flat-footed. How on earth did the greatest business in history wind up playing catch-up to a nonprofit-turned-startup?
Today we tell the complete story of Google's 20+ year AI journey: from their first tiny language model in 2001 through the creation of Google Brain, the birth of the transformer, the talent exodus to OpenAI (sparked by Elon Musk's fury over Google’s DeepMind acquisition), and their current all-hands-on-deck response with Gemini. And oh yeah: a little business called Waymo that went from crazy moonshot idea to doing more rides than Lyft in San Francisco, potentially building another Google-sized business within Google. This is the story of how the world's greatest business faces its greatest test: can they disrupt themselves without losing their $140B annual profit-generating machine in Search?
Sponsors:
Sentry: https://bit.ly/acquiredsentry
WorkOS: https://bit.ly/workos25
Anthropic: https://bit.ly/acquiredclaude25
Statsig: https://bit.ly/acquiredstatsig26
Links:
Sign up for email updates and vote on future episodes!
Geoff Hinton’s 2007 Tech Talk at Google
Our recent ACQ2 episode with Tobi Lutke
Worldly Partners’ Multi-Decade Alphabet Study
In the Plex
Supremacy
Genius Makers
All episode sources
Carve Outs:
We’re hosting the Super Bowl Innovation Summit!
F1: The Movie
Travelpro suitcases
Glue Guys Podcast
Sea of Stars
Stepchange Podcast
© Copyright 2015-2026 ACQ, LLC
Note: Acquired hosts and guests may hold assets discussed in this episode. This podcast is not investment advice, and is intended for informational and entertainment purposes only. You should do your own research and make your own independent decisions when considering any financial transactions.
Highlighted moments
“The entire AI revolution that we are in right now is predicated on the invention of the transformer out of the Google Brain team in 2017. So think OpenAI and ChatGPT, Anthropic, NVIDIA, hitting all-time highs. All the craziness right now depends on that one research paper published by Google in 2017.”
“they stop all conversations after five turns. So, you can only have five turns of conversation with the chatbot, and then it's just, and we're done for today.”
“A 50% gross margin from your supplier or an 80% gross margin from your supplier is the difference between a 2x markup and a 5x markup.”
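The arithmetic behind that last quote can be checked with a short sketch. It assumes the usual definition of gross margin as (price - cost) / price, so the price-to-cost multiple is 1 / (1 - margin). The function name `markup_multiple` is made up for this illustration and does not come from the episode.

```python
from fractions import Fraction


def markup_multiple(gross_margin: Fraction) -> Fraction:
    """Price-to-cost multiple implied by a gross margin.

    Assumes gross margin = (price - cost) / price, which rearranges to
    price / cost = 1 / (1 - gross margin).
    """
    if not 0 <= gross_margin < 1:
        raise ValueError("gross margin must be in [0, 1)")
    return 1 / (1 - gross_margin)


# Exact fractions avoid floating-point noise in the comparison.
for percent in (50, 80):
    margin = Fraction(percent, 100)
    print(f"{percent}% gross margin -> {markup_multiple(margin)}x cost")
```

Under that definition, a 50% margin means the supplier charges twice its cost, and an 80% margin means five times its cost.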
Transcript
Introduction
0:00I went and looked at a studio, well, a little office that was going to turn into a studio nearby, but it was not good at all. It had dropped ceilings, so I could hear the guy in the office next to me. You would be able to hear him talking on episodes.
0:12Third co-host. Third co-host. Is it Howard? No, it was like a lawyer. He seemed to be like talking through some horrible problem that I didn't want to listen to, but I could hear every word. Does he want millions of people listening to his conversations? Right.
0:27All right. All right. Let's do a podcast. Let's do a podcast. Who got the truth?
0:35Is it you? Is it you? Is it you? Who got the truth now? Is it you? Is it you? Is it you? Sit me down, say it straight. Another story on the way. Who got the truth?
Welcome to Acquired
0:49Welcome to the fall 2025 season of Acquired, the podcast about great companies and the stories and playbooks behind them. I'm Ben Gilbert. I'm David Rosenthal. And we are your hosts. Here's a dilemma. Imagine you have a profitable business. You make giant margins on every single unit you sell. And the market you compete in is also giant. One of the largest in the world, you might say. But then on top of that, lucky for you, you also are a monopoly in that giant market with 90% share and a lot of lock-in.
1:23And when you say monopoly, monopoly as defined by the U.S. government. That is correct. But then imagine this. In your research lab, your brilliant scientists come up with an invention. This particular invention, when combined with a whole bunch of your old inventions by all your other brilliant scientists, turns out to create the product that is much better for most purposes than your current product. So you launched the new product based on this new invention, right? Right. I mean, especially because out of pure benevolence, your scientists had published research papers
1:57about how awesome the new invention is and lots of the inventions before also. So now there's new startup competitors quickly commercializing that invention. So of course, David, you change your whole product to be based on a new thing, right? Uh, this sounds like a movie. Yes. But here is the problem. You haven't figured out how to make this new incredible product anywhere near as profitable as your old giant cash printing business. So maybe you shouldn't launch that new
Google's Dilemma
2:28product. David, this sounds like quite the dilemma to me. Of course, listeners, this is Google today and in perhaps the most classic textbook case of the innovator's dilemma ever. The entire AI revolution that we are in right now is predicated on the invention of the transformer out of the Google Brain team in 2017. So think OpenAI and ChatGPT, Anthropic, NVIDIA, hitting all-time highs. All the craziness right now depends on that one research paper published by Google in 2017. And consider this. Not only did Google have
3:03the densest concentration of AI talent in the world 10 years ago that led to this breakthrough, but today they have just about the best collection of assets that you could possibly ask for. They've got a top-tier AI model with Gemini. They don't rely on some public cloud to host their model. They have their own in Google Cloud that now does $50 billion in revenue. That is real scale. They're a chip company with their tensor processing units or TPUs, which is the only real scale deployment of AI chips in the world besides NVIDIA GPUs. Maybe AMD, maybe, but these are definitely
3:40the top two. Somebody put it to me in research that if you don't have a foundational frontier model or you don't have an AI chip, you might just be a commodity in the AI market. And Google is the only company that has both. Google still has a crazy bench of talent. And despite ChatGPT becoming kind of the Kleenex of the era, Google does still own the text box, the single one that is the front door to the internet for the vast majority of people anytime anyone has intent to do anything online.
4:14But the question remains, what should Google do strategically? Should they risk it all and lean into their birthright to win in artificial intelligence? Or will protecting their gobs of profits from search hamstring them as the AI wave passes them by? But perhaps first,
Google's History
4:32we must answer the question, how did Google get here, David Rosenthal? So listeners, today, we tell the story of Google, the AI company. Woo. You like that, David? I love it. I love it. Did you hire like a Hollywood script writing consultant without telling me? I wrote that 100% myself with no AI. Thank you very much. No AI. Well, listeners, if you want to know every time an episode drops, vote on future episode topics or get access to corrections from past episodes, check out our email list. That's
5:03acquired.fm slash email. Come talk about this episode with the entire Acquired community in Slack after you listen. That's acquired.fm slash Slack. Speaking of the Acquired community, we have an anniversary celebration coming up. We do. 10 years of the show. We're going to do an open Zoom call with everyone to celebrate, kind of like how we used to do our LP calls back in the day with LPs. And we are going to do that on October 20th, 2025 at 4 p.m. Pacific time. Check out the show notes for more details.
5:36If you want more Acquired, check out our interview show, ACQ2. Our last interview was super fun. We sat down with Tobi Lütke, the founder and CEO of Shopify, about how AI has changed his life and where he thinks it will go from here. So search ACQ2 in any podcast player. So with that, this show is not investment advice. David and I may have investments in the companies we discuss. And this show is for informational and entertainment purposes only. David, Google, the AI company.
Google's AI Beginnings
6:03So Ben, as you were alluding to in that fantastic intro, really, you're really upping the game here. If we rewind 10 years from today, before the Transformer paper comes out, all of the following people, as we've talked about before, were Google employees. Ilya Sutskever, founding chief scientist of OpenAI, who along with Geoff Hinton and Alex Krizhevsky had done the seminal AI work on AlexNet and just published that a few years before. All three of them were Google
6:38employees, as was Dario Amodei, the founder of Anthropic, Andrej Karpathy, chief scientist at Tesla until recently, Andrew Ng, Sebastian Thrun, Noam Shazeer, all the DeepMind folks, Demis Hassabis, Shane Legg, Mustafa Suleyman. Mustafa, now, in addition to in the past, having been a founder of DeepMind, runs AI at Microsoft. Basically, every single person of note in AI worked at Google, with the one exception of Yann LeCun, who worked at Facebook.
Neural Networks
7:12Yeah. It's pretty difficult to trace a big AI lab now back and not find Google in its origin story. Yeah. I mean, the analogy here is it's almost as if at the dawn of the computer era itself, a single company like, say, IBM had hired every single person who knows how to code. So it'd be like, you know, if anybody else wants to write a computer program, oh, sorry, you can't do that. Anybody who knows how to program works at IBM. This is how it was with AI and Google in the mid-2010s. But learning how to program a computer wasn't so hard that people out there couldn't learn how to do
7:46it. Learning how to be an AI researcher, significantly more difficult. Right. It was the stuff of very specific PhD programs with a very limited set of advisors and a lot of infighting in the field of where the direction of the field was going, what was legitimate versus what was crazy, heretical, religious stuff. Yep. So then, yes, the question is, how do we get to this point? Well, it goes back to
Google's AI Research
8:09the start of the company. I mean, Larry Page always thought of Google as an artificial intelligence company. And in fact, Larry Page's dad was a computer science professor and had done his PhD at the University of Michigan in machine learning and artificial intelligence, which was not a popular field in computer science back then. Yeah. In fact, a lot of people thought specializing in AI was a waste of time because so many of the big theories from 30 years prior to that had been kind of disproven at that point, or at least
8:40people thought they were disproven. And so it was frankly contrarian for Larry's dad to spend his life and career and research work in AI. And that rubbed off on Larry. I mean, if you squint, PageRank, the PageRank algorithm that Google was founded upon, is a statistical method. You could classify it as part of AI within computer science. And Larry, of course, was always dreaming much, much bigger. I mean, there's the quote that we've said before on this show. In the year 2000, two years after Google's founding, when Larry says
9:13artificial intelligence would be the ultimate version of Google, if we had the ultimate search engine, it would understand everything on the web, it would understand exactly what you wanted, and it would give you the right thing. That's obviously artificial intelligence. We're nowhere near doing that now. However, we can get incrementally closer. And that is basically what we work on here. It's always been an AI company. Yep. And that was in 2000. Well, one day, in either late 2000 or early 2001, the timelines are a bit hazy here, a Google engineer
9:44named Georges Harik is talking over lunch with Ben Gomes, famous Google engineer who I think would go on to lead search, and a relatively new engineering hire named Noam Shazeer. Now, Georges was one of Google's first 10 employees, incredible engineer. And just like Larry Page's dad, he had a PhD in machine learning from the University of Michigan. And even when Georges went there, it was still a relatively rare contrarian subfield within computer science. So the three of them are having lunch,
10:16and Georges says offhandedly to the group that he has a theory from his time as a PhD student that
Sebastian Thrun
10:23compressing data is actually technically equivalent to understanding it. And the thought process is, if you can take a given piece of information and make it smaller, store it away, and then later re-instantiate it in its original form, the only way that you could possibly do that is if whatever force is acting on the data actually understands what it means. Because you're losing information, going down to something smaller, and then recreating the original thing. It's like you're a kid in
10:55school. You learn something in school, you read a long textbook, you store the information in your memory, then you take a test to see if you really understood the material. And if you can recreate the concepts, then you really understand it. Which kind of foreshadows big LLMs today are like compressing the entire world's knowledge into some number of terabytes that's just like the smashed down little vector set. Little, at least compared to all the information in the world. But it's kind of that idea, right? You can store all the world's information in an AI model in something that is
11:27like kind of incomprehensible and hard to understand. But then if you uncompress it, you can kind of bring knowledge back to its original form. Yep. And these models demonstrate understanding, right? Eh, do they? That's the question. That's the question. They certainly mimic understanding. So this conversation is happening. You know, this is 25 years ago. And Noam, the new hire, the, you know, young buck, he sort of stops in his tracks and he's like, wow. If that's true, that's really profound. Is this in one of Google's micro kitchens?
11:58This is in one of Google's micro kitchens. They're having lunch. Where did you find this, by the way, a 25-year-old? This is in In the Plex. This is like a small little passage in Stephen Levy's great book that's been a source for all of our Google episodes, In the Plex. There's a small little throwaway passage in here about this because this book came out before ChatGPT and AI and all that. So Noam kind of latches on to Georges and keeps vibing over this idea. And over the next couple months, the two of them decide in the most googly fashion possible that they are just going to stop
12:29working on everything else. And they're going to go work on this idea of language models and compressing data. And can they generate machine understanding with data? And if they can do that, that that would be good for Google. I think this coincides with that period in 2001 when Larry Page fired all the managers in the engineering organization. And so everybody was just doing whatever they wanted to do. Funny. So there's this great quote from Georges in the book.
Google Brain
12:57A large number of people thought it was a really bad thing for Noam and I to spend our talents on. But Sanjay Ghemawat, Sanjay of course being Jeff Dean's famous prolific coding partner, thought it was cool. So Georges would posit the following argument to any doubters that they came across. Sanjay thinks it's a good idea. And no one in the world is as smart as Sanjay. So why should Noam and I accept your view that it's a bad idea? It's like if you beat the best team in football, are you the new best team in football no matter what?
13:31Yeah. So all of this ends up taking Noam and Georges deep down the rabbit hole of probabilistic models for natural language. Meaning, for any given sequence of words that appears on the internet, what is the probability for another specific sequence of words to follow? This should sound pretty familiar for anybody who knows about how LLMs work today. Oh, kind of like a next word predictor. Yeah, or next token predictor if you generalized it. Yep. So the first thing that they do with this work is they create the "Did you mean" spelling correction
14:06in Google search. Oh, that came out of this? That came out of this. Noam created this. So this is huge for Google because obviously it's a bad user experience when you mistype a query and then need to type another one. But it's a tax to Google's infrastructure because every time these mistyped queries are going, well, Google's infrastructure goes and serves the results to that query that are useless and immediately overwritten with the new one. Right. And it's a really tightly scoped problem where you can see like, oh, wow, 80% of the time
14:37that someone types in god groomer, oh, they actually mean dog groomer and they retype it. And if it's really high confidence, then you actually just correct it without even asking them and then ask them if they want to opt out instead of opting in. It's a great feature and it's sort of a great first use case for this in a very narrowly scoped domain. Totally. So they get this win and they keep working on it, Noam and Georges, and they end up creating a fairly large, I'm using large in quotes here, you know, for the time, language
15:08model that they call affectionately Phil, the Probabilistic Hierarchical Inferential Learner. These AI researchers love creating their acronyms. They love their word puns. Yeah. Yep. So fast forward to 2003 and Susan Wojcicki and Jeff Dean are getting ready to launch AdSense. They need a way to understand the content of these third party web pages, the publishers, in order to run the Google ad corpus against them. Well, Phil is the tool that they use to do it.
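To make the "next word predictor" idea from a moment ago concrete, here is a minimal sketch of a bigram model: it counts which word follows which, then turns the counts into probabilities. It is a toy under stated assumptions, not a reconstruction of Phil or of anything Google built. The function names and the four-line corpus are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for each word, how often each next word follows it."""
    follow_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            follow_counts[current][following] += 1
    return follow_counts


def next_word_probabilities(model, word):
    """Turn raw follow counts into a probability distribution over next words."""
    counts = model.get(word.lower(), Counter())
    total = sum(counts.values())
    if total == 0:
        return {}  # the word was never seen with a successor
    return {following: n / total for following, n in counts.items()}


# Toy corpus, invented for illustration only.
corpus = [
    "dog groomer near me",
    "dog groomer prices",
    "dog walker near me",
    "dog groomer near downtown",
]
model = train_bigram_model(corpus)
probs = next_word_probabilities(model, "dog")
print(max(probs, key=probs.get), probs)
```

In this toy corpus "groomer" follows "dog" three times out of four, so it gets probability 0.75. A production system would condition on far more context than a single previous word.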
15:45Huh. I had no idea that language models were involved in this. Yeah. So Jeff Dean borrows Phil and famously uses it to code up his implementation of AdSense in a week because he's Jeff Dean. And boom, AdSense. I mean, this is billions of dollars of new revenue to Google overnight because it's the same corpus of ads that are AdWords that are search ads that they're now serving on third party pages. They just massively expanded the inventory for the ads that they already have in the system. Thanks to Phil.
16:18Thanks to Phil. All right. This is a moment where we got to stop and just give some Jeff Dean facts. Jeff Dean is going to be the through line of this episode of, wait, how did Google pull that off? How did Jeff Dean just go home and over the weekend rewrite some entire giant distributed system and figure out all of Google's problems? Back when Chuck Norris facts were big, Jeff Dean facts became a thing internally at Google. I just want to give you some of my favorites. The speed of light in a vacuum used to be about 35 miles per hour. Then Jeff Dean spent a weekend optimizing physics. So good.
16:52Jeff Dean's PIN is the last four digits of pi.
16:58Only Googlers would come up with these. Yes. To Jeff Dean, NP means no problemo. Oh, yeah. I've seen that one before. I think that one's my favorite. Yes.
Jeff Dean Facts
17:08Oh, man. So, so good. Also, a wonderful human being who we spoke to in research and was very, very helpful. Thank you, Jeff. Yes. So, language models. Definitely work. Definitely going to drive a lot of value for Google. And they also fit pretty beautifully into Google's mission to organize the world's information and make it universally accessible and useful. If you can understand the world's information and compress it and then recreate it. Yeah, that fits the mission, I think. I think that checks the box.
17:38Absolutely. So, Phil gets so big that apparently by the mid-2000s, Phil is using 15% of Google's entire data center infrastructure. And I assume a lot of that is AdSense ad serving, but also DidYouMean and all the other stuff that they start using it for within Google. So, early natural language systems, computationally expensive. Yes. So, okay. Now, mid-2000s, fast forward to 2007, which is a very, very big year for the purposes of our story.
18:11Google had just recently launched the Google Translate product. This is the era of all the great, great products coming out of Google that we've talked about. Maps and Gmail and Docs and all the wonderful things. Chrome and Android are going to come later. They had like a 10-year run where they basically launched everything you know of at Google except for search. Truly in a 10-year run. And then there were about 10 years after that from 2013 on where they basically didn't launch any new products that you've heard about until we get to Gemini, which
18:42is this fascinating thing. But this '03 to 2013 era was just so rich with hit after hit after hit. Magical. And so one of those products was Google Translate, you know, not the same level of user base or perhaps impact on the world as Gmail or Maps or whatnot. But still a magical, magical product. And the chief architect for Google Translate was another incredible machine learning PhD named Franz Och. So Franz had a background in natural language processing and
19:14machine learning. And that was his PhD. He was German. He got his PhD in Germany. At the time, DARPA, the Defense Advanced Research Projects Agency, division of the government, had one of their famous challenges going for machine translation. So Google and Franz, of course, enter this. And Franz builds an even larger language model that blows away the competition in this year's version of the DARPA challenge. This is either 2006 or 2007. Gets an astronomically high BLEU score for
19:50the time. BLEU, the Bilingual Evaluation Understudy, is the sort of algorithmic benchmark for judging the quality of translations. At the time, higher than anything else possible. So Jeff Dean hears about this and the work that Franz and the Translate team have done. And he's like, this is great. This is amazing. When are you guys going to ship this in production? Oh, I heard this story. So Jeff and Noam talk about this on the Dwarkesh podcast. Yes. That episode is so, so good. And Franz is like, no, no, no, no, no, Jeff, you don't understand.
20:21This is research. This isn't for the product. We can't ship this model that we built. This is an n-gram language model. N-grams are like a number of words in a cluster. And we've trained it on a corpus of 2 trillion words from the Google search index. This thing is so large, it takes it 12 hours to translate a sentence. So the way the DARPA challenge worked in this case was you got a set of sentences on Monday, and then you had to submit your machine translation of those set of sentences by Friday.
20:56Plenty of time for the servers to run. Yeah. They were like, okay, so we have whatever number of hours it is from Monday to Friday. Let's use as much compute as we can to translate these couple sentences.
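For a feel of what an n-gram benchmark like BLEU measures, here is a simplified sketch of its core ingredient, clipped ("modified") n-gram precision: the share of n-grams in a candidate translation that also appear in a reference, with repeated n-grams credited no more often than the reference contains them. This is only one piece of the metric. Full BLEU combines several n-gram sizes and applies a brevity penalty, which this sketch leaves out. The example sentences are invented.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous runs of n tokens, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against a single reference."""
    candidate_counts = Counter(ngrams(candidate.split(), n))
    reference_counts = Counter(ngrams(reference.split(), n))
    if not candidate_counts:
        return 0.0
    # Clip each n-gram's credit at the number of times the reference has it.
    clipped = sum(
        min(count, reference_counts[gram])
        for gram, count in candidate_counts.items()
    )
    return clipped / sum(candidate_counts.values())


reference = "the cat is on the mat"
candidate = "the the the cat mat"
unigram_precision = modified_precision(candidate, reference, 1)
bigram_precision = modified_precision(candidate, reference, 2)
print(unigram_precision, bigram_precision)
```

The candidate gets credit for only two of its three "the" tokens, which is the clipping at work, and its word order is poor enough that only one of its four bigrams matches.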
21:08Hey, learn the rules of the game and use them to your advantage. Exactly. So Jeff Dean being the engineering equivalent of Chuck Norris, he's like, hmm, let me see your code. So Jeff goes and parachutes in and works with the Translate team for a few months. And he re-architects the algorithm to run on the words in the sentences in parallel instead of sequentially. Because when you're translating a set of sentences or a set of words in a sentence, you don't necessarily need to do it in order. You can break up the problem into different pieces,
21:41work on it independently. You can parallelize it. And you won't get a perfect translation, but, you know, imagine you just translate every single word. You can at least go translate those all at the same time in parallel, reassemble the sentence, and like mostly understand what the initial meaning was. Yeah. And as Jeff knows very well, because he and Sanjay basically built it with Urs Hölzle, Google's infrastructure is extremely parallelizable and distributed. You can break up workloads into little chunks, send them all over the various data centers that Google has,
22:12reassemble the projects, return that to the user. They are the single best company in the world at parallelizing workloads across CPUs, across multiple data centers. CPUs. We're still talking CPUs here. Yep. And Jeff's work with the team gets that average sentence translation time down from 12 hours to 100 milliseconds. And so then they ship it in Google Translate. And it's amazing. This sounds like a Jeff Dean fact. Well, you know, it used to take 12 hours,
22:42and then Jeff Dean took a few months with it. Now it's 100 milliseconds. Right, right, right, right, right. So this is the first large, I'm using large in quotes here, language model used in production in a product at Google. They see how well this works. Like, hmm, maybe we could use this for other things, like predicting search queries as you type. That might be interesting. And of course, the crown jewel of Google's business that also might be
23:13an interesting application for this, the ad quality score for AdWords is literally the predicted click-through rate on a given set of ad copy. You can see how an LLM that is really good at ingesting information, understanding it, and predicting things based on that might be really useful for calculating ad quality for Google. Yep, which is the direct translation to Google's bottom line. Indeed. Okay. So obviously, all of that is great on the language model front. I said 2007 was a big year.
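The parallelization idea from the Translate story, breaking a sentence into pieces, working on them independently, and reassembling the result, can be sketched in a few lines. This is a deliberately naive word-by-word illustration of the pattern and says nothing about how Google's system actually worked. The dictionary, the function names, and the use of a thread pool are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy dictionary; a real system would score candidates
# with a language model instead of doing a lookup.
TOY_DICTIONARY = {"der": "the", "hund": "dog", "bellt": "barks"}


def translate_word(word):
    """Translate one word independently of its neighbors."""
    return TOY_DICTIONARY.get(word.lower(), word)


def translate_sentence_parallel(sentence, max_workers=4):
    """Fan the words out to a pool of workers, then reassemble in order."""
    words = sentence.split()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in input order, so the sentence
        # comes back together correctly even if workers finish out of order.
        translated = list(pool.map(translate_word, words))
    return " ".join(translated)


result = translate_sentence_parallel("Der Hund bellt")
print(result)
```

The trade-off is the one described in the conversation: each piece is handled without its neighbors, so the result is not a perfect translation, but the work no longer has to happen one word after another.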
23:49Also, in 2007, begins the sort of momentous intersection of several computer science professors on the Google campus. So in April of 2007, Larry Page hires Sebastian Thrun from Stanford to come to Google and work first part-time and then full-time on machine learning applications. Sebastian was the head of SAIL at Stanford, the Stanford Artificial Intelligence Laboratory,
24:24legendary AI laboratory that was big in the sort of first wave of AI back in the 60s, 70s, when Larry's dad was active in the field, then actually shut down for a while and then had been restarted and re-energized here in the early 2000s. And Sebastian was the leader, the head of SAIL. Well, funny story about Sebastian, the way that he actually comes to Google. Sebastian was kind enough to speak with us to prep for this episode. I didn't realize it was basically an acqui-hire. He and some, I think it was grad students, were in the process of starting a company, had term sheets
24:58from Benchmark and Sequoia. Yes. And Larry came over and said, what if we just acquire your company before it's even started in the form of signing bonuses? Yes. Probably a very good decision on their part. So SAIL, this group within the CS department at Stanford, not only had some of the most incredible, most accomplished professors and PhD AI researchers in the world, they also had this stream of Stanford undergrads that would come through and work there as researchers while they were working
25:31on their CS degrees or symbolic system degrees or, you know, whatever it was that they were doing as Stanford undergrads. One of those people was Chris Cox, who's the chief product officer at Meta. Yeah. That was kind of how he got his start in all of this and AI and obviously Facebook and Meta are going to come back into the story here in a little bit. Wow. You really can't make this up. Another undergrad who passed through SAIL while Sebastian was there was a young freshman and sophomore who would later drop out of Stanford to start a company
26:05that went through Y Combinator's very first batch in summer 2005. I'm on the edge of my seat. Who is this? Any guesses?
26:16Dropbox, Reddit, trying to think who else was in the first batch. Oh, no, no. But way more on the nose for this episode. The company was a failed local mobile social network. Oh, Sam Altman, Loopt. Sam Altman.
26:34That's amazing. He was at SAIL at the same time? He was at SAIL. Yep. As an undergrad researcher. Wow. Wild, right? We told you that it's a very small set of people that are all doing all of this. Man, I miss those days. Sam presenting at the WWDC with Steve Jobs on stage with the Double Popped Collar. Right. Different time in tech. The Double Popped Collar. That was amazing. That was a vibe. That was a moment. Oh, man. All right. So, April 2007, Sebastian comes over from SAIL into Google, Sebastian Thrun.
27:06One of the first things he does over the next set of months is a project called Ground Truth for Google Maps. Which is essentially Google Maps. It is essentially Google Maps. So, before Ground Truth, Google Maps existed as a product, but they had to get all the mapping data from a company called Tele Atlas. I think there were two. They were sort of a duopoly. Navteq was the other one. Yeah. Navteq and Tele Atlas. But it was this, like, kind of crappy source of truth map data that everyone used and you really couldn't do any better than anyone else because you all just used the same data.
27:38Yep. It was not that good and it cost a lot of money. Tele Atlas and Navteq were multi-billion dollar companies. I think maybe one or both of them were public at some point and got acquired, but a lot of money, a lot of revenue. Yep. And Sebastian's first thing was Street View, right? So, he already had the experience of orchestrating this fleet of all these cars to drive around and take pictures. Yes. So then, coming into Google, Ground Truth is this sort of moonshot type project to recreate all the Tele Atlas data. Mostly from their own photographs of streets from Street View. And they
28:14incorporated some other data. There was, like, census data they used. I think it was 40-something data sources to bring it all together. But Ground Truth was this very ambitious effort to create new maps from whole cloth. Yep. And just like all of the AI and AI-enabled projects within Google that we're talking about here works very, very well. Very quickly. Huge win. Well, especially when you hire 1,000 people in India to help you sift through all the discrepancies in the data and actually hand-draw all the maps. Yes. We are not yet in an era of a whole lot of AI automation. So, on the back of this win with
28:48Ground Truth, Sebastian starts lobbying to Larry and Sergey, hey, we should do this a lot. We should bring in AI professors, academics, I know all these people, into Google part-time. They don't have to be full-time employees. Let them keep their posts in academia, but come here and work with us on projects for our products. They'll love it. They get to see their work used by millions and millions of people. We'll pay them. They'll make a lot of money. They'll get Google stock. And they get to stay professors at their academic institutions.
29:18Win, win, win. Win, win, win. So, as you would expect, Larry and Sergey are like, yeah, yeah, yeah. That's a good idea. Let's do that. More of that. So, in December of 2007, Sebastian brings in a relatively little-known machine learning professor from the University of Toronto named Jeff Hinton to the Google campus to come and give a tech talk. Not yet hiring him, but come give a tech talk to, you know, all the folks at Google and talk about some of the new work, Jeff, that you and your PhD and postdoc students there at the
29:54doing on blazing new paths with neural networks. And Jeff Hinton, for anybody who doesn't know the name, now very much known as the godfather of neural networks and really the godfather of kind of the whole direction that AI went in. Modern AI. He was kind of a fringe academic. Yeah. At this point in history. I mean, neural networks were not a respected subtree of AI. No, totally not. And part of the reason is there had been a lot of hype 30, 40 years before around neural
30:28networks that just didn't pan out. So, it was effectively, everyone thought, disproven and certainly a backwater. Yep. Ben, do you remember from our NVIDIA episodes my favorite piece of trivia about Geoff Hinton? Oh, yes. That his grandfather, great-grandfather, was George Boole? Yep. He is the great-great-grandson of George and Mary Boole, who invented Boolean algebra and Boolean logic. Which is hilarious now that I know more about this because that's the basic building block
30:59of symbolic logic, of defined deterministic computer science logic. And the hilarious thing about neural nets is it's not. It's not symbolic AI. It's not I feed you the specific instructions and you follow a big if-then tree. It is non-deterministic. It is the opposite of that field. Which actually just underscores, again, how sort of heretical this branch of machine learning and computer science was. Right. So, Ben, as you were saying earlier, neural networks, not a new idea and had all of this great
31:32promise in theory, but in practice just took too much computation to do multiple layers. You could really only have a single or maybe small single-digit number of layers in a computer neural network up until this time. But Geoff and his former postdoc, a guy named Yann LeCun, started evangelizing within the community, hey, if we can find a way to have multi-layered, deep-layered neural networks, something we call deep learning, we could actually realize the
32:06promise here. It's not that the idea is bad. It's that the implementation, which would take a ton of compute to actually do all the math, to do all the multiplication required to propagate through layer after layer after layer of neural networks to sort of detect and understand and store patterns. If we could actually do that, a big multi-layered neural network would be very valuable and possibly could work. Yes. Here we are now in 2007, mid-2000s. Moore's law has increased
32:38enough that you could actually start to try to test some of these theories. Yep. So, Geoff comes and he gives this talk at Google. It's on YouTube. You can go watch it. We'll link to it in the show notes. This is incredible. This is an artifact of history sitting there on YouTube. And people at Google, Sebastian, Jeff Dean, and all the other folks who we're talking about, they get very, very, very excited because they've already been doing stuff like this with Translate and the language models that they're working with. That's not using the deep neural networks that Geoff's working on. So here's this
33:12whole new architectural approach that if they could get it to work, would enable these models that they're building to work way better, recognize more sophisticated patterns, understand the data better. Very, very promising. Again, kind of all in theory at this point. Yeah. So, Sebastian Thrun brings Geoff Hinton into the Google fold after this tech talk. I think first as a consultant over the next couple of years. And then this is amazing. Later, Geoff Hinton technically becomes an intern at
33:43Google. Like that's how they get around the part-time, full-time policies here. Yep. He was a summer intern in somewhere around 2011, 2012. And mind you, at this point, he's like 60 years old. Yes. So in the next couple of years after 2007 here, Sebastian's concept of bringing these computer science, machine learning academics into Google as contractors or part-time interns, basically letting them keep their academic posts and work on big projects for Google's
34:13products internally, goes so well that by late 2009, Sebastian and Larry and Sergey decided, hey, we should just start a whole new division within Google. And it becomes Google X, the moonshot factory. The first project within Google X, Sebastian leads himself. Ooh, David, don't say it. Don't say it. I won't say the name of it. We will come back to it later. But for our purposes for now, the second project would be critically important, not only for our story, but...
34:45To the whole world. Everything in AI changing the entire world. And that second project is called Google Brain. All right, listeners. This is a great time to thank one of our favorite companies here at Acquired. Sentry. That's S-E-N-T-R-Y, like someone standing guard. Yes. Sentry helps developers debug errors and latency issues pretty much any software problem and fix them before users get mad. As their homepage puts it, they are considered not bad
35:16by over 4 million software developers. Today, we are talking about the way Sentry works with another company in the acquired universe, Anthropic. Anthropic used to have some older infrastructure monitoring in place, but at their massive scale and complexity, they instead adopted Sentry to help them fix issues faster. Yep. Crashes can be a massive problem in AI. If you're running a huge compute job like training a model and one node fails, it can affect hundreds or thousands of servers. Sentry helped them detect bad hardware so they could quickly reject it before causing a cascading problem.
35:50Sentry also enabled them to debug massive issues in hours instead of days so they could get back to their training runs. And today, Anthropic relies on Sentry to track exceptions, assign errors, and analyze failures in real time across all of the primary languages used by Anthropic's research teams, including Python, Rust, and C++. According to the Anthropic team, Sentry gives our developers one place that will have all the information they need to debug an issue. And speaking of AI, Sentry now has an AI debugger called SEER. SEER is an AI agent that taps into
36:22all the issue context from Sentry and your code base to not just guess, but root cause gnarly issues and propose merge-ready fixes specific to your application. We're pumped to be working with Sentry. They have an incredible customer list, including not only Anthropic, but Cursor, Vercel, Linear, and more. If you want to fix your broken code fast, like over 150,000 other organizations that use Sentry, from indie hobbyists to some of the biggest companies in the world, you can check out sentry.io slash acquired. That's S-E-N-T-R-Y
36:53dot I-O slash acquired. And just tell them that Ben and David sent you. Yes, and they're offering two months free to all acquired listeners. Yes. Thank you, Sentry.
DeepMind Acquisition
37:03All right, David. So, Google Brain. So, when Sebastian left Stanford full-time and joined Google full-time, of course, somebody else had to take over SAIL. And the person who did is another computer science professor, a brilliant guy named Andrew Ng. This is like all the hits. All the hits. This is all the AI hits on this episode. So, what does Sebastian do? He recruits Andrew to come part-time, start spending a day a week on the Google campus. And this coincides right with the start of X and Sebastian formalizing
37:37this division. So, one day in 2010-2011 timeframe, Andrew's spending his day a week on the Google campus and he bumps into, who else? Jeff Dean. And Jeff Dean is telling Andrew about what he and Franz have done with language models and what Geoff Hinton is doing in deep learning. And of course, Andrew knows all this. And Andrew's talking about what he and SAIL are doing at Stanford. And they decide, you know, the time might finally be right to try and take a real big swing on this
38:08within Google and build a massive, really large, deep learning model in the vein of what Geoff Hinton has been talking about on highly parallelizable Google infrastructure. And when you say the time might be right, Google had tried twice before and neither project really worked. They tried this thing called Brains on Borg. Borg is sort of an internal system that they use to run all of their infrastructure. They tried the Cortex project and neither of these really worked. So, there's a little bit of scar tissue in the sort of research group at Google of
38:41are large-scale neural networks actually going to work for us on Google infrastructure? So, the two of them, Andrew Ng and Jeff Dean, pull in Greg Corrado, who is a neuroscience PhD and amazing researcher who was already working at Google. And in 2011, the three of them launch the second official project within X, appropriately enough, called Google Brain. And the three of them get to work building a really, really big, deep neural network model.
39:14And if they're going to do this, they need a system to run it on. You know, Google is all about taking this sort of frontier research and then doing the architectural and engineering system to make it actually run. Yes. So, Jeff Dean is working on this system, on the infrastructure, and he decides to name the infrastructure DistBelief, which of course is a pun, both on the distributed nature of the system and also on, of course, the word disbelief because...
39:45No one thought it was going to work. Most people in the field thought this was not going to work. And most people in Google thought this was not going to work. And here's a little bit on why. And it's a little technical, but follow me for a second. All the research from that period of time pointed to the idea that you needed to be synchronous. So all the compute needed to be sort of really dense, happening on a single machine with really high parallelism, kind of like what GPUs do, that you really would want it all sort of happening in one place. So it's really easy to kind of go look up and see, hey, what are the computed values for
40:16everything else in the system before I take my next move? What Jeff Dean wrote with DistBelief was the opposite. It was distributed across a whole bunch of CPU cores and potentially all over a data center or maybe even in different data centers. So in theory, this is really bad because it means you would need to be constantly waiting around on any given machine for the other machines to sync their updated parameters before you could proceed. But instead, the system actually worked asynchronously without bothering to go and get the latest parameters from other cores.
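To make the asynchronous idea concrete, here is a minimal sketch in Python. It is not DistBelief's code; it is a toy simulation under stated assumptions: a single shared parameter vector stands in for the parameter store, a round-robin loop stands in for real concurrency, and the function name `async_sgd`, the synthetic least-squares data, and every hyperparameter are hypothetical choices made for illustration.

```python
import numpy as np

def async_sgd(num_workers=4, steps=600, lr=0.02, refresh_every=3, seed=0):
    """Toy simulation of asynchronous SGD on a noise-free least-squares problem.

    Hypothetical setup: one shared parameter vector (the "server") and several
    workers that compute gradients from a local copy that may be stale.
    """
    rng = np.random.default_rng(seed)
    true_w = np.array([2.0, -3.0])

    # Each worker gets its own shard of synthetic data.
    shards = []
    for _ in range(num_workers):
        X = rng.normal(size=(64, 2))
        shards.append((X, X @ true_w))

    server_w = np.zeros(2)
    local_w = [server_w.copy() for _ in range(num_workers)]

    for step in range(steps):
        k = step % num_workers  # round-robin stands in for real concurrency
        X, y = shards[k]
        # The gradient uses the worker's local copy, not the server's latest value.
        grad = X.T @ (X @ local_w[k] - y) / len(y)
        # The update is applied immediately, without waiting for other workers.
        server_w -= lr * grad
        # Workers pull fresh parameters only once every `refresh_every` rounds.
        if (step // num_workers) % refresh_every == 0:
            local_w[k] = server_w.copy()
    return server_w, true_w

w, true_w = async_sgd()
print("learned:", w, "target:", true_w)
```

In this toy setting the shared parameters still end up close to the target even though most gradients were computed from out-of-date copies, which is the surprising property described above. Whether staleness is tolerable in general depends on the learning rate and how stale the copies get; the sketch only shows one well-behaved case.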
40:49So you were sort of updating parameters on stale data. You would think that wouldn't work. The crazy thing is it did. Yes. Okay. So you've got DistBelief. What do they do with it now? They want to do some research. So they try out, can we do cool neural network stuff? And what they do in a paper that they submitted in 2011, right at the end of the year, is, and I'll give you the name of the paper first, Building High-Level Features Using Large-Scale Unsupervised Learning. But everyone just calls it
41:23the cat paper. You talk to anyone at Google, you talk to anyone in AI, they're like, oh yeah, the cat paper. What they did was they trained a large nine-layer neural network to recognize cats from unlabeled frames of YouTube videos using 16,000 CPU cores on 1,000 different machines. And listeners, just to underscore how seminal this is, we actually talked with Sundar in prep for the episode, and he cited seeing the cat paper come across his desk as one of the key moments that sticks in his
41:58brain in Google's story. Yeah. A little later on, they would do a TGIF where they would present the results of the cat paper. You talk to people at Google, they're like, that TGIF, oh my God, that's when it all changed. Yeah. It proved that large neural networks could actually learn meaningful patterns without supervision and without labeled data. And not only that, it could run on a distributed system that Google built to actually make it work on their infrastructure. And that is a huge unlock of the whole thing. Google's got this big infrastructure asset. Can we take this theoretical
42:33computer science idea that the researchers have come up with and use DistBelief to actually run it on our system? Yep. That is the amazing technical achievement here. That is almost secondary to the business impact of the cat paper. I think it's not that much of a leap to say that the cat paper led to probably hundreds of billions of dollars of revenue generated by Google and Facebook and
43:03ByteDance over the next decade. Definitely. Pattern recognizers in data. So YouTube had a big problem at this time, which was that people would upload these videos and there's tons of videos being uploaded to YouTube, but people are really bad at describing what is in the videos that they've uploaded. And YouTube is trying to become more of a destination site, trying to get people to watch more videos, trying to build a feed, increase dwell time, etc., etc. And the problem is
43:35the recommender is trying to figure out what to feed and it's only just working off titles and descriptions that people were writing about their own videos. Right. And whether you're searching for a video or they're trying to figure out what video to recommend next, they need to know what the video is about. Yep. So the cat paper proves that you can use this technology, a deep neural network running on DistBelief, to go inside of the videos in the YouTube library and understand what they were about and use that
44:06data to then figure out what videos to serve to people. If you can answer the question cat or not a cat, you can answer a whole lot more questions, too. Here's a quote from Jeff Dean about this. We built a system that enabled us to train pretty large neural nets through both model and data parallelism. We had a system for unsupervised learning on 10 million randomly selected YouTube frames, as you were saying, Ben. It would build up unsupervised representations based on trying to reconstruct the frame from the high-level representations. We got that working and training on 2,000 computers using 16,000 cores. After a little while, that model was actually
44:42able to build a representation at the highest neural net level where one neuron would get excited by images of cats. It had never been told what a cat was, but it had seen enough examples of them in the training data of head-on facial views of cats that that neuron would then turn on for cats and not much else. It's so crazy. I mean, this is the craziest thing about unlabeled data, unsupervised learning, that a system can learn what a cat is without ever being explicitly told what a cat is. And that there's a cat neuron. Yeah. And so then there's an iPhone neuron
45:16and a San Francisco Giants neuron and all the things that YouTube recommends. Not to mention porn filtering, explicit content filtering. Not to mention copyright identification and enabling revenue share with copyright holders. Yeah, this leads to everything in YouTube. Basically puts YouTube on the path to today becoming the single biggest property on the internet and the single biggest media company on the planet. This kicks off a 10-year period from 2012, when this happens, until ChatGPT on November 30th,
45:492022, when AI is already shaping the human existence for all of us and driving hundreds of billions of dollars of revenue, it's just in the YouTube feed. And then Facebook borrows it and they hire Yann LeCun and they start Facebook AI Research. And then they bring it into Instagram and then TikTok and ByteDance take it. And then it goes back to Facebook and YouTube with Reels and Shorts. This is the primary way that humans on the planet spend their leisure time for the next 10 years. This is my favorite David Rosenthal-ism. Everyone talks about 2022 onward as the AI era. And I love this point from you that
46:24actually, for anyone that could make good use of a recommender system and a classifier system, basically in a company with a social feed, the AI era started in 2012. Yes, the AI era started in 2012. And part of it was the cat paper. The other part of it was what Jensen at NVIDIA always calls the big bang moment for AI, which was AlexNet. Yes. So we talked about Geoff Hinton. Back at the University of Toronto, he's got two grad students
46:57who he's working with in this era. Alex Krizhevsky and Ilya Sutskever. Of course. Future co-founder and chief scientist of OpenAI. And the three of them are working with Geoff's deep neural network ideas and algorithms to create an entry for the famous ImageNet competition in computer science. This is Fei-Fei Li's thing from Stanford. It is an annual machine vision algorithm competition. And what it was, was Fei-Fei had assembled
47:36a database of 14 million images that were hand labeled. Famously, she used Mechanical Turk on Amazon, I think, to get them all hand labeled. Yes, I think that's right. And so then the competition was, what team can write the algorithm that without looking at the labels, so just seeing the images, could correctly identify the largest percentage? The best algorithms that would win the competitions year over year were still getting more than a quarter of the images wrong. So like 75% success rate. Great. Way worse than a human.
48:10Can't use it for much in a production setting when a quarter of the time you're wrong. So then the 2012 competition. Along comes AlexNet. Its error rate was 15%. Still high, but a 10-point leap from the previous best being a 25% error rate all the way down to 15 in one year. A leap like that had never happened before. It's 40% better than the next best. Yes. On a relative basis. Yes. And why is it so much better, David? What did they figure out that would create a $4 trillion
48:44company in the future?
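A quick worked check of the two ways that improvement is being described, as a small Python sketch. The 25% and 15% error rates are the rounded figures quoted in the conversation, and the helper name `error_reduction` is a hypothetical one chosen for illustration.

```python
def error_reduction(previous_error, new_error):
    """Return (absolute drop in percentage points, relative drop as a fraction)."""
    absolute_points = previous_error - new_error
    relative = absolute_points / previous_error
    return absolute_points, relative

# Rounded error rates as quoted: roughly 25% before, 15% for AlexNet.
points, relative = error_reduction(25.0, 15.0)
print(f"{points:.0f} percentage points, {relative:.0%} relative improvement")
```

So the drop is 10 percentage points in absolute terms, and 10 out of 25 is the 40% relative improvement mentioned above.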
48:46So what Geoff and Alex and Ilya did is they knew, like we've been talking about all episode, that deep neural networks had all this potential and Moore's law had advanced enough that you could use CPUs to create a few layers. They had the aha moment of what if we re-architected this stuff, not to run on CPUs, but to run on a whole different class of computer chips that were by their very
49:20nature, highly, highly, highly parallelizable: video game graphics cards made by the leading company in the space at the time, NVIDIA. Not obvious at the time, and especially not obvious that this highly advanced cutting edge academic computer science research that was usually being done on supercomputers with incredible CPUs would use these toy video game cards.
49:50That retail for $1,000. Yeah, less at that point in time, a couple hundred bucks. So the team in Toronto, they go out to like the local Best Buy or something. They buy two NVIDIA GeForce GTX 580s, which were NVIDIA's top of the line gaming cards at the time. The Toronto team rewrites their neural network algorithms in CUDA, NVIDIA's programming language. They train it on these two off-the-shelf GTX 580s, and this is how
50:23they achieve their deep neural network and do 40% better than any other entry in the ImageNet competition. So when Jensen says that this was the big bang moment of artificial intelligence, A, he's right. This shows everybody that, holy crap, if you can do this with two off-the-shelf GTX 580s, imagine what you could do with more of them or with specialized chips. And B, this event is what sets NVIDIA on the path from a somewhat struggling PC gaming accessory maker to the leader of the AI
50:56wave and the most valuable company in the world today. And this is how AI research tends to work, is there's some breakthrough that gets you this big step change function. And then there's actually a multi-year process of optimizing from there, where you get these kind of diminishing returns curves on breakthroughs, where the first half of the advancement happens all at once, and then the second half takes many years after that to figure out. But it's rare and amazing and must be so cool when you have an idea, you do it, and then you
51:27realize, oh my god, I just found the next giant leap in the field. It's like I unlocked the next level, to use the video game analogy. I leveled up. So after AlexNet, the whole computer science world is abuzz. People are starting to stop doubting neural networks at this point. Yes. So after AlexNet, the three of them from Toronto, Geoff Hinton, Alex Krizhevsky, and Ilya Sutskever, do the natural thing. They start a company called DNN Research, Deep Neural Network Research. This company does not have any products. This company has AI researchers.
52:03Who just won a big competition. And, predictably, as you might imagine, it gets acquired by Google almost immediately. Oh, are you intentionally shortening this? That's what I thought the story was. Oh, it is not immediately. Oh, okay. There's a whole crazy thing that happens where the first bid is actually from Baidu. Oh, I did not know that. So Baidu offers $12 million. Geoff Hinton doesn't really know how to value the company and doesn't know if that's fair. And so he does what any academic would do to best determine the market
52:39value of the company. He says, thank you so much. I'm going to run an auction now. And I'm going to run it in a highly structured manner where every time anybody wants to bid, the clock resets and there's another hour where anybody else can submit another bid. No way. So. I didn't know this. This is crazy. He gets in touch with everyone that he knows from the research community who is now working at a big company who he thinks, hey, this would be a good place for us to do our research. That includes Baidu. That includes Google. That includes Microsoft. And there's one other.
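The auction rule described here, where each new bid resets a one-hour clock, can be sketched in a few lines of Python. This is only an illustration of that rule as stated in the conversation: the function name `run_auction`, the bidder labels, and the amounts are all made up, and the sketch does not model the sellers choosing to stop the bidding themselves.

```python
def run_auction(bids, quiet_period=60):
    """Resolve an auction that closes after `quiet_period` minutes with no new bid.

    `bids` is a list of (minute, bidder, amount) tuples sorted by minute.
    Returns (winner, amount, closing_minute), or None if there were no bids.
    """
    best = None
    deadline = None
    for minute, bidder, amount in bids:
        if deadline is not None and minute > deadline:
            break  # the clock ran out before this bid arrived
        if best is None or amount > best[1]:
            best = (bidder, amount)
            deadline = minute + quiet_period  # every accepted bid resets the clock
    if best is None:
        return None
    return best[0], best[1], deadline

# Made-up bidders and amounts, purely for illustration.
example_bids = [(0, "A", 12), (30, "B", 15), (80, "A", 20), (200, "B", 25)]
print(run_auction(example_bids))
```

In the made-up example, the last bid arrives after the clock has expired, so the auction closes with bidder A's earlier offer. The design choice worth noticing is that the deadline moves with every accepted bid, so the auction only ends once nobody is willing to go higher within the quiet period.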
53:16Facebook, of course. It's a two-year-old startup. Oh, wait. So it does not include Facebook? It does not include Facebook. Think about the year. This is 2012. So Facebook's not really in the AI game yet. They're still trying to build their own AI lab. Yeah, yeah. Because Yann LeCun and FAIR would start in 2013. Is it Instagram? Nope. It is the most important part of the end of this episode. Wait, well, it can't be Tesla because Tesla is older than that. Nope.
53:47Well, OpenAI wouldn't get founded for years. Wow. Okay. You really got me here. What company slightly predated OpenAI? Doing effectively the same mission. Oh. Of course. Of course. Hiding in plain sight. DeepMind. Wow. DeepMind, baby. They are the fourth bidder in a four-way auction for DNN Research. Now, of course, right after the bidding starts, DeepMind has to drop out. They're a startup. They don't actually have the cash to be able to buy.
54:19Yeah. Didn't even cross my mind because my first question was like, where the hell would they get the money? Because they had no money. But Geoff Hinton already knows and respects Demis, even though he's just doing this, at the time, startup called DeepMind. That's amazing. Wait, how is DeepMind in the auction, but Facebook is not? Isn't that wild? That's wild. So the timing of this is concurrent with the, it was then called NIPS. Now it's called NeurIPS conference. So Geoff Hinton actually runs the auction from his hotel room at the
54:51Harrah's Casino in Lake Tahoe. Oh my God. Amazing. So the bids all come in and we got to thank Cade Metz, the author of Genius Makers. Great book on the whole history of AI that we're actually going to reference a lot in this episode. The bidding goes up and up and up. At some point, Microsoft drops out. They come back in, told you DeepMind drops out. So it's Baidu and Google really going at the end. And finally, at some point, the researchers look at each other and they say, where do we actually want to land? We want to land at Google. And so they stop the bidding at $44 million and just say,
55:22Google, this is more than enough money. We're going with you. Wow. I knew it was about $40 million. I did not know that whole story. Right. It's almost like Google itself and the Dutch auction IPO process. Right. How fitting. That's kind of a perfect DNA. Yes. Wow. And the three of them were supposed to split it 33 each. And Alex and Ilya go to Geoff and say, I really think you should have a bigger percent. I think you should have 40 percent and we should each have 30. And that's how it ends up breaking down. Ah, wow. What a team. Well, that leads to the three of them joining Google Brain directly and
55:59turbocharging everything going on there. Spoiler alert, a couple of years later, Astro Teller, who would take over running Google X after Sebastian Thrun left, he would get quoted in the New York Times in a profile of Google X that the gains to Google's core businesses in search and ads and YouTube from Google Brain have way more than funded all of the other bets that they have made within Google X and throughout the company over the years.
56:30Oh, it's one of these things that if you make something a few percent better that happens to do tens of billions of dollars or hundreds of billions of dollars in revenue, you find quite a bit of loose change in those couch cushions. Yes, quite, quite a bit of loose change. But that's not where the AI history ends within Google. There is another very important piece of the Google AI story that is an acquisition from outside of Google, the AI equivalent of Google's acquisition of YouTube. That's what we talked about a minute ago. DeepMind.
57:02Okay, listeners, now is a great time to tell you about a new friend of the show, we are very excited about, WorkOS. Yes, WorkOS is the enterprise-ready platform used by OpenAI, Cursor, Perplexity, Vercel, Plaid, and literally hundreds of other winning companies. So what are all these companies using WorkOS for? Imagine you're a fast-growing startup, you've got product market fit, and you're getting inbound interest from big enterprise customers. Very exciting. But then they send you their security questionnaire.
57:33Yep. And it's like 47 pages long with requirements that kind of sound like alphabet soup. Do you support SAML 2.0? Can you integrate with our Okta? Do you have SCIM provisioning, S-C-I-M? What about RBAC? And you're thinking, I have no idea what these acronyms even mean, let alone how to implement them. So here's the thing. These are not nice-to-haves. These are deal blockers. Without SSO, without SCIM, without RBAC, without audit logs, you simply cannot close enterprise deals, period.
58:08But none of these features make your core product better. They don't make your beer taste better, to use our favorite analogy here on Acquired. So if you're building like a design tool, spending six months building SAML authentication doesn't make your design tool more powerful. So this is where WorkOS comes in. They've built Stripe for enterprise features. WorkOS turns enterprise authentication requirements into drop-in APIs, abstracting away as much unnecessary complexity as possible. So instead of your team spending months reading SAML specs, you can implement enterprise SSO in minutes.
58:42WorkOS handles user provisioning, permissions, audit logs, all the checkbox items that enterprise IT requires. So whether you are a seed stage company trying to land your first enterprise customer, or already big and expanding globally, WorkOS is the fastest path to becoming enterprise-ready. Just visit WorkOS.com or just message their Slack support. They have real engineers in there who answer questions fast. And when you get in touch, just tell them Ben and David sent you. All right, David. DeepMind. I kind of like your framing, the YouTube of AI.
59:16The YouTube of AI for Google. They bought this thing for, we'll talk about the purchase price, but it's worth, what, $500 billion today? I mean, this is as good as Instagram or YouTube in terms of greatest acquisitions of all time. 100%. So I remember when this deal happened, just like I remember when the Instagram deal happened. Because the number was big at the time. It was big, but I remember it for a different reason. It was like when Facebook bought Instagram, like, oh my God, this is, wow, what a tectonic shift in the landscape of tech.
59:49In January 2014, I remember reading on TechCrunch this random news. Right. You're like, deep what? That Google is spending a lot of money to buy something in London that I've never heard of. That's working on artificial intelligence, question mark? Right. This really illustrates how outside of mainstream tech AI was at the time. Yeah. And then you dig in a little further and you're like, this company doesn't seem to have any products. And it also doesn't even really say anything on its website about what DeepMind is.
1:00:24It says it is a, quote unquote, cutting edge artificial intelligence company. Wait, did you look this up on the Wayback Machine? I did. I did. Oh, nice. To build general purpose learning algorithms for simulations, e-commerce, and games. This is 2014. This does not compute, does not register. Simulations, e-commerce, and games. It's kind of a random spattering of... Exactly. It turns out, though, not only was that description of what DeepMind was fairly
1:00:55accurate, this company and this purchase of it by Google was the butterfly flapping its wings equivalent moment that directly leads to OpenAI, ChatGPT, Anthropic, and basically everything. Certainly Gemini. That we know. Yeah. Gemini directly in the world of AI today. And probably xAI, given Elon's involvement. Yeah, of course xAI. In a weird way, it sort of leads to Tesla self-driving, too, with Karpathy.
1:01:26Yeah, definitely. Okay, so what is the story here? DeepMind was founded in 2010 by a neuroscience PhD named Demis Hassabis. Who previously started a video game company? Oh, yeah. And a postdoc named Shane Legg at University College London. And a third co-founder who was one of Demis' friends from growing up, Mustafa Suleyman. This was unlikely, to say
1:02:00the least. This would go on to produce a Knight and Nobel Prize winner. Yes. So Demis, the CEO, was a childhood chess prodigy turned video game developer who, when he was age 17 in 1994, he had gotten accepted to the University of Cambridge, but he was too young and the university told him, hey, take a, you know, gap year, come back. He decided
1:02:31that he was going to go work at a video game developer, at a video game studio called Bullfrog Productions for the year. And while he's there, he created the game Theme Park, if you remember that. It was like a theme park version of SimCity. This was a big game. This was very commercially successful. Rollercoaster Tycoon would be sort of a clone of this that would have many, many sequels over the years. Oh, I played a ton of that. Yeah. It sells 15 million copies in the mid-90s. Wow. Wild. Then after this, he goes to Cambridge,
1:03:04studies computer science there. After Cambridge, he gets back into gaming, founds another game studio called Elixir. That would ultimately fail. And then he decides, you know what, I'm going to go get my PhD in neuroscience. And that is how Demis ends up at University College, London. There he meets Shane Legg, who's there as a postdoc. Shane is a self-described, at the time, member of the lunatic fringe in the AI community. In that he believes, this is 2008, 9, 10. He believes
1:03:41that AI is going to get more and more and more powerful every year. And that it will become so powerful that it will become more intelligent than humans. And Shane is one of the people who actually popularizes the term artificial general intelligence, AGI. Oh, interesting. Which, of course, lots of people talk about now, and approximately zero people were afraid of that. I mean, you had like the Nick Bostrom type folks, but very few people were
1:04:12thinking about superintelligence or the singularity or anything like that. For what it's worth, not Elon Musk. He's not included in that list because Demis would be the one who tells Elon about this. Yes, we'll get to it. So Demis and Shane hit it off. They pull in Mustafa, Demis's childhood friend who is himself extremely intelligent. He had gone to the University of Oxford and then dropped out, I think, at age 19 to do other startup-y type stuff. So the three of them
1:04:42decided to start a company, DeepMind. The name, of course, being a reference to deep learning, Geoff Hinton's work and everything coming out of the University of Toronto, and the goal that the three of these guys have of actually creating an intelligent mind with deep learning. Like, Geoff and Ilya and Alex aren't really thinking about this yet. As we said, this is lunatic fringe type stuff. Yes. AlexNet, the cat paper, that whole world is about better classifying data. Can we better sort into patterns? It's a giant leap from there to say, oh, we're going to create
1:05:13intelligence. Yes. I think probably some people, almost certainly at Google, were thinking, oh, we can create narrow intelligence that'll be better than humans at certain tasks. I mean, a calculator is better than humans at certain tasks. Right. But I don't think too many people were thinking, oh, this is going to be general intelligence smarter than humans. Right. So, they decide on the tagline for the company is going to be solve intelligence and use it to solve everything else. Ooh, I like it.
1:05:44I like it. Yeah, yeah, yeah. I mean, they're good marketers too, these guys. So, there's just one problem. To do what they want to do. Money. Just say it. Yeah, money. Money is the problem. Right, right, right. Money is the problem for lots of reasons, but even more so than any other given startup in the 2010 era, it's not like they can just go spin up an AWS instance and build an app and deploy it to the App Store. They want to build really, really, really, really, really big
1:06:18deep learning neural networks that require Google-sized levels of compute. Well, it's interesting. Actually, they don't require that much funding yet. The AI of the time was go grab a few GPUs. We're not training giant LLMs. That's the ambition eventually. But right now, what they just need to do is raise a few million bucks. But who's going to give you a few million bucks when there's no business plan? When you're just trying to solve intelligence, you need to find some lunatics. It's a tough sell to VCs. Except for the exact right-
1:06:49As you say, they need to find some lunatics. Oh, I chose my words carefully, dude. Yeah. We use the term lunatic in- It's endearing-ish. Most endearing possible way here, given that they were all basically right. So, in June 2010, Demis and Shane managed to get invited to the Singularity Summit in San Francisco, California. Because they're not raising money for this in London. Yeah, definitely not. I think they tried for a couple months and learned that that was not going to be a viable path.
1:07:19Yes. The summit, the Singularity Summit, organized by Ray Kurzweil, a future Google employee, I think, chief futurist, noted futurist, Eliezer Yudkowsky, and Peter Thiel. Yes. So, Demis and Shane are excited about getting this invite. They're like, this is probably our one chance to get funded. But we probably shouldn't just walk in, guns blazing, and say, Peter, can we pitch you?
1:07:51Yeah. So, they finagle their way into Demis getting to give a talk on stage at the summit. Always the hack. They're like, this is great. This is going to be the hack. The talk is going to be our pitch to Peter and Founders Fund. Peter has just started Founders Fund at this point. You know, obviously, member of the PayPal mafia, very wealthy. I think he had a big Roth IRA at this point, is the right way to frame it. Big Roth IRA. That he had invested in Facebook, first investor in Facebook. He is the perfect target. They architected the presentation at the summit to be a pitch directly to Peter,
1:08:26essentially a thinly veiled pitch. Shane has a quote in Parmy Olson's great book, Supremacy, that we used as a source for a lot of this DeepMind story. And Shane says, we needed someone crazy enough to fund an AGI company. Somebody who had the resources not to sweat a few million and liked super ambitious stuff. They also had to be massively contrarian because every professor that he would go talk to would certainly tell him, absolutely do not even think about funding this.
1:08:57That Venn diagram sure sounds a lot like Peter Thiel. So, they show up at the conference. Demis is going to give the talk. Goes out on stage. He looks out into the audience. Peter is not there. Turns out, Peter wasn't actually that involved in the conference. Yeah, he's a busy guy. He's a co-founder, co-organizer, but he's a busy guy. Yes. The guys are like, shoot. Oh, we missed our chance. What are we going to do? And then, fortune turns in their favor. They find out that Peter is hosting an after party that night
1:09:30at his house in San Francisco. They get into the party. Demis seeks out Peter. And Demis is very, very, very smart. As anybody who's ever listened to him talk would immediately know. He's like, rather than just pitching Peter head on, I'm going to come about this obliquely. He starts talking to Peter about chess because he knows, as everybody does, that Peter Thiel loves chess. And Demis had been the second highest ranked player in the world as a teenager in the under 14
1:10:00category. Good strategy. Great strategy. The man knows his chess moves. So Peter's like, hmm, I like you. You seem smart. What do you do? And Demis explains. He's got this AGI startup. They were actually here. He gave a talk on stage as part of the conference. People are excited about this. And Peter says, oh, okay. All right. Come back to Founders Fund tomorrow and give me the pitch. So they do. They make the pitch. It goes well. Founders Fund leads DeepMind's seed round of about $2 million.
1:10:30My, times have changed for AI company seed rounds these days. Oh, yes. Imagine leading DeepMind's seed round with a less than $2 million check. And through Peter and Founders Fund, they get introduced to another member of the PayPal mafia, Elon Musk. Yes. So it's teed up in a pretty low-key way. Hey, Elon, you should meet this guy. He's smart. He's thinking about artificial intelligence. So Elon says, great. Come over to SpaceX. I'll
1:11:03give you the tour of the place. So Demis comes over for lunch and a tour of the factory. Of course, Demis thinks it's very cool, but really he's trying to reorient the conversation over to artificial intelligence. And I'll read this great excerpt from an article in The Guardian. Musk told Hassabis his priority was getting to Mars as a backup planet in case something went wrong here. I don't think he'd thought much about AI at this point. Hassabis pointed out a flaw in his plan. I said, what if AI was the thing that went wrong here? Then being on Mars
1:11:35wouldn't help you because if we got there, then it would obviously be easy for an AI to get there through our communication systems or whatever it was. He hadn't thought about that. So he sat there for a minute without saying anything, just sort of thinking, hmm, that's probably true. Shortly after, Musk too became an investor in DeepMind. Yes. Yes, yes. I think it's crazy that Demis is sort of the one that woke Elon up to this idea of we might not be safe from the AI on Mars either. Right, right. I hadn't considered that.
1:12:09So this is the first time the bit flips for Elon of we really need to figure out a safe, secure AI for the good of the people. That sort of seed being planted in his head. Yep. Which of course is what DeepMind's ambition is. We are here doing research for the good of humanity like scientists in a peer-reviewed way. Yep. I think all that is true. Also, in the intervening months to year after this meeting
1:12:40between Demis and Elon and Elon investing in DeepMind, Elon also starts to get really, really excited and convinced about the capabilities of AI in the near term. And specifically the capabilities of AI for Tesla. Yes. Like with everything else in Elon's world, once the bit flips and he becomes interested, he completely changes the way he views the world, completely sheds all the old ways and actions that he was taking. And it's all about what must I do to embrace this new worldview that I
1:13:14have. And other people have been working on for a while already by this point, AI driving cars. Yep. That sounds like it would be a pretty good idea for Tesla. It does. So Elon starts trying to recruit as many AI researchers as he possibly can and machine vision and machine learning experts into Tesla. And then AlexNet happens. And man, AlexNet's really, really, really good at identifying and classifying images and cat videos on YouTube and the YouTube
1:13:49recommender feed. Well, is that really that different from a live feed of video from a car that's being driven and understanding what's going on there? Can we process it in real time and look at differences between frames? Perhaps controlling the car? Not all that different. So Elon's excitement, channeled initially through DeepMind and Demis about AI and AI for Tesla, starts ratcheting up big time. Yep. Meanwhile, back in London, DeepMind is getting to work. They're hiring researchers. They're
1:14:24getting to work on models. They're making some vague noises about products to their investors. Maybe we could do something in shopping, maybe something in gaming, like the description on the website at the time of acquisition said. But mostly what they really, really want to do is just build these models and work on intelligence. And then one day in late 2013, they get a call from Mark Zuckerberg. He wants to buy the company. Mark has woken up to everything that's going on at
1:14:57Google after AlexNet and what AI is doing for social media feed recommendations at YouTube, the possibility of what it can do at Facebook and for Instagram. He's gone out and recruited Yann LeCun, Geoff Hinton's old postdoc, who is, together with Geoff, one of the sort of godfathers of AI and deep learning. And really popularized the idea of convolutional neural networks, the next hot thing in the field of AI at this point in time. And so with Yann, they have created FAIR, Facebook AI Research, which is a Google Brain rival within
1:15:33Facebook. And remember who the first investor in Facebook was, who's still on the board. Peter Thiel. And is also the lead investor in DeepMind. Where do you think Mark learned about DeepMind? Peter Thiel. Was it, do you know for sure that it was from Peter? No, I don't know for sure. But like, how else could Mark have learned about this startup in London? I've got a great story of how Larry Page found out about it. Oh, okay. Well, we'll get to that in one second. So Mark calls and offers to buy the company. And there are various rumors of how much Mark offered. But according to Parmy Olson in her book, Supremacy, the reports are that it was up to $800
1:16:08million, a company with no products and a long way from AGI. That squares with what Cade Metz has in his book, that the founders would have made about twice as much money from taking Facebook's offer versus taking Google's offer. Yep. So, Demis, of course, takes this news to the investor group. Which, by the way, is kind of against everything the company was founded on. The whole aim of the company and what he's promised the team is that DeepMind is going to stay independent, do research, publish in the scientific community. We're not going to be sort of captured and told what to do by the
1:16:43whims of a capitalist institution. Yep. So, definitely some deal point negotiating that has to happen with Mark and Facebook if this offer is going to come through. But Mark is so desperate at this point. He is open to these very large deal point negotiations, such as Yann LeCun gets to stay in New York. Yann LeCun gets to stay operating his lab at NYU. Yann LeCun is a professor. He's flexible on some things. Turns out, Mark is not flexible on letting Demis keep control of DeepMind if he buys it. Demis sort of argued for we need to stay
1:17:15separate and carved out and we need this independent oversight board with the ability to intervene if the mission of DeepMind is no longer being followed. And Mark's like, no, you'll be a part of Facebook. Yeah. And you'll make a lot of money. So, as this negotiation is going on, of course, the investors in DeepMind get wind of this. Elon finds out about what's going on. He immediately calls up Demis and says, I will buy the company right now with Tesla stock. This is late 2013, like early 2014.
1:17:50Tesla's market cap is about $20 billion. So, Tesla stock from then to today is about a 70X run up.
1:18:02Demis and Shane and Mustafa are like, wow, okay, there's a lot going on right now. But to your point, they have the same issues with Elon and Tesla that they had with Mark. Elon wants them to come in and work on autonomous driving for Tesla. They don't want to work on autonomous driving. Right. Or at least exclusively. At least exclusively. Yep. So, then Demis gets a third call from Larry Page. Do you want my story of how Larry knows about the company?
1:18:34I absolutely want your story of how Larry knows about the company. All right. So, this is still early in DeepMind's life. We haven't progressed all the way to this acquisition point yet. Apparently, Elon Musk is on a private jet with Luke Nosek, who's another member of the PayPal mafia and an angel investor in DeepMind. And they're reading an email from Demis with an update about a breakthrough that they had where DeepMind's AI figured out a clever way to win at the Atari game Breakout. Yes. And the strategy it figured out with no human training was that you could bounce the ball up
1:19:08around the edges of the bricks and then without needing to intervene, it could bounce around along the top and win the game faster without you needing to have a whole bunch of interactions with the paddle down at the bottom. They're watching this video of how clever it is and flying with them on the same private plane is Larry Page. Of course, because Elon and Larry used to be very good friends. Yes. And Larry is like, wait, what are you watching? What company is this? And that's how he finds out.
1:19:39Wow. Yes. Elon must have been so angry about all this. And the crazy thing is this kinship between Larry and Demis is, I think, the reason why the deal gets done at Google. Once the two of them get together, they are like peas in a pod. Larry has always viewed Google as an AI company. Yeah. Demis, of course, views DeepMind so much as an AI company that he doesn't even want to make any products until they can get to AGI.
1:20:10And Demis, in fact, we should share with listeners, Demis told us this when we were talking to him to prep for this episode, just felt like Larry got it. Larry was completely on board with the mission of everything that DeepMind was doing. And there's something else very convenient about Google. They already have Brain. So Larry doesn't need Demis and Shane and Mustafa and DeepMind to come work on products within Google. Brain is already working on products within Google. Demis can really believe Larry when
1:20:42Larry says, nah, stay in London. Keep working on intelligence. Do what you're doing. I don't need you to come work on products within Google. Brain is like actively going and engaging with the product groups, trying to figure out, hey, how can we deploy neural nets into your product to make it better? That's like their reason for being. So they're happy to agree to this. And it's working. Brain and neural nets are getting integrated into search, into ads, into Gmail, into everything. It is the perfect home for DeepMind. Home away from home, shall we say. Yes. And, and there's a third reason why Google's the perfect
1:21:16fit for DeepMind. Infrastructure. Google has all the compute infrastructure you could ever want right there on tap. Yes. At least with CPUs so far. Yes. So how's the deal actually happen? Well, after buying DNNresearch, Alan Eustace, who David, you spoke with, right? Yep. Was Google's head of engineering at the time. He makes up his mind that he wanted to hire all the best deep learning research talent that he possibly could. And he had a clear path to do so. A few months earlier, Larry Page held a strategy meeting on an island in the South Pacific. In Cade Metz's book,
1:21:51it's an undisclosed island. Of course he did. Larry thought that deep learning was going to completely change the whole industry. And so he tells his team, this is a quote, let's really go big, which effectively gave Alan a blank check to go secure all the best researchers that he possibly could. So in 2013, he decides I'm going to get on a plane in December before the holidays and go meet DeepMind. Crazy story about this. Geoff Hinton, who's at Google at the time, had a thing with his back where he couldn't sit down. He either has to stand or lay. And so a long flight across the ocean is not
1:22:23doable, but he needs to be there as a part of the diligence process. You have Geoff Hinton, you need to use him to figure out if you're going to buy a deep learning company. And so Alan Eustace decides he's going to charter a private jet and he's going to build this crazy custom harness rig so that Geoff Hinton won't be sliding around when he's laying on the floor during takeoff and landing. Wow. I was thinking the first part of this, I'm pretty sure Google has planes. They could just get on a Google plane. For whatever reason, this was a separate charter. But it's not solvable just with
1:22:56a private plane. You need also a harness. Right. And Alan is the guy who set the record for jumping out of the world's highest, was it a balloon? I actually don't know. The highest free fall jump that anyone has