How Google (almost) squandered generative AI
Loonshots, tech leviathans and risk aversion
TL;DR
This newsletter is about Google's slow start in the generative AI race and how it came about. It looks at the invention of the transformer, how OpenAI ended up taking advantage of it before Google, and the future of Google and search in the wake of generative AI.
Here are the key takeaways:
Peter Thiel argues that companies stuck in fierce competition with other companies lack the ability to be truly innovative. Only when they achieve monopoly status can they afford to think beyond what their competitors are doing and come up with new things.
Conversely, Safi Bahcall argues that when companies get too big they lose the ability to pursue such innovations. Bigger organisations inevitably hit a phase transition after which the incentive to take risks and pursue new breakthroughs is significantly weaker.
The story of how the transformer was invented perhaps supports Bahcall's thesis more than Thiel's. Although a team of Google researchers created the transformer, it was OpenAI that realised its potential for generative AI.
This slow start has put Google in a position of playing catch-up in the current AI race. And with the advent of large language models, some may question how much longer the company will be able to maintain its dominance in search.
"Competition is for losers."
In Zero to One, Peter Thiel presents the idea that businesses too concerned with competition are losing businesses. A monopoly, by contrast, does not have to worry about its competitors, and can afford to think about things other than generating revenue.
He cites Google as an example:
Since it doesn't have to worry about competing with anyone, it has wider latitude to care about its workers, its product, and its impact on the wider world.1
Therefore, in Thiel's view, perfect competition should not be the aim:
In perfect competition, a business is so focused on today's margins that it can't possibly plan for a long-term future. Only one thing can allow a business to transcend the daily brute struggle for survival: monopoly profits.2
In essence, Thiel believes that, if you are stuck in fierce competition with others, you predominantly focus on those you are competing with. This leaves little room for the real innovation that takes you from '0 to 1'; instead, you are limited to iterations that merely beat the next best company.
Only when you are a monopoly can you afford to be truly innovative.
Loonshots
In his book Loonshots, Safi Bahcall provides his theory on the conditions required for organisations to come up with innovations. His basic argument is as follows:
The most important breakthroughs come from loonshots, which are widely dismissed ideas.
History has shown that large groups are needed to turn these loonshots into technologies that win wars, products that save lives or strategies that change industries.
By applying the science of phase transitions to the behavioural dynamics of teams, we can identify practical rules for nurturing loonshots faster and better.
When Bahcall talks about phase transitions, he is referring to a phenomenon whereby an organisation goes from having an environment of innovation to one of stagnation. Innovative organisations are fluid like water, whereas stagnant organisations are rigid like ice.
I have written in a previous post about the factors that, according to Bahcall, determine when this transition from innovation to stagnation takes place:
Salary growth. As an organisation grows, the salary jumps that come with promotions also tend to grow. The bigger the salary growth, the greater the incentive to focus on career progression.
Management span. Bigger organisations usually come with structures where the gap between junior levels and senior levels is wider. They also tend to have narrower management spans, meaning each manager has fewer direct reports.
Equity fraction. This is about compensation being closely tied to the quality of work. If you do good work, your stake in the future income generated by the company grows, incentivising greater focus on loonshots rather than individual career progression.
Organisational fitness. High fitness organisations match employee skills and roles very well. Low fitness organisations do not do this well and politics strongly influences promotion decisions.
So, in general, an organisation with lower salary growth, wider management spans, higher equity fractions and better organisational fitness is more innovative. Stagnant organisations are the opposite.
Smaller organisations tend to exhibit more of the innovative qualities, as Bahcall explains:
When groups are small, for example, everyone's stake in the outcome of the group project is high. At a small biotech, if the drug works, everyone will be a hero and a millionaire. If it fails, everyone will be looking for a job. The perks of rank — job titles or the increase in salary from being promoted — are small compared to those high stakes.3
Conversely, bigger organisations tend to be more stagnant:
As teams and companies grow larger, the stakes in outcome decrease while the perks of rank increase. When the two cross, the system snaps. Incentives begin encouraging behaviour no one wants. Those same groups — with the same people — begin rejecting loonshots.4
Bahcall argues that as organisations grow, they inevitably experience this phase transition. They may start out churning out lots of loonshots, but eventually they slow down, become less fluid and produce fewer loonshots over time.
Google and the transformer
Thiel and Bahcall present different ideas on innovation within companies.
Thiel argues that the real key to innovation is to become a monopoly. Bahcall, on the other hand, argues that the bigger a company gets, the more likely it is to create an environment that is not conducive to continued innovation.
The story of how the transformer was invented, and how OpenAI took advantage of it at Google's expense, suggests that Bahcall might be right. Maybe Google has entered a phase transition that limits its ability to take advantage of loonshots.
In her new book, Supremacy: AI, ChatGPT and the Race That Will Change the World, Parmy Olson describes how a group of Google researchers invented the transformer, and how it was OpenAI that was able to realise its potential for generative AI.
The story begins with Illia Polosukhin, Ashish Vaswani and Jakob Uszkoreit, whose work as AI researchers at Google led to the development of the transformer. The spark for this came from the poor performance of Google Translate pre-2017, and the question of whether it could be improved.
At that time, Google Translate was using a model architecture called a recurrent neural network (RNN). This model would take input sentences and process the words one at a time in a sequential manner to produce the output.
For example, when translating an English sentence, the model would read and translate one English word at a time, repeating the process until it reached the end of the sentence.
By processing the input in this way, the model was able to pick up on word order in a given sentence. However, it had its limitations, and did not perform particularly well with longer passages.
This limitation was due to the fact that, for longer inputs, "the information at the start of the sequence might be lost in the process of compressing everything to a single, fixed representation."5
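To make the bottleneck concrete, here is a minimal NumPy sketch of an RNN-style encoder (illustrative only, nothing like the actual Google Translate model; all names and sizes are my own). However long the input is, everything gets folded, one word at a time, into a single fixed-size vector:

```python
import numpy as np

def rnn_encode(embeddings, W_h, W_x):
    """Toy RNN encoder: reads word vectors one at a time and folds
    them all into a single fixed-size hidden state."""
    h = np.zeros(W_h.shape[0])
    for x in embeddings:              # strictly sequential: word by word
        h = np.tanh(W_h @ h + W_x @ x)
    return h                          # the whole sentence, squeezed into one vector

rng = np.random.default_rng(0)
d = 8                                 # hidden/embedding size (arbitrary)
W_h = rng.normal(size=(d, d))
W_x = rng.normal(size=(d, d))
sentence = rng.normal(size=(50, d))   # a "sentence" of 50 word vectors
summary = rnn_encode(sentence, W_h, W_x)
print(summary.shape)                  # (8,) — same size no matter how long the input
```

The early words have to survive 49 further update steps to influence the final vector, which is exactly why information from the start of long inputs tends to get lost.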
In light of this, Polosukhin and others wondered if products like Google Translate could be improved by using a different architecture. They wondered if the RNN could be replaced with something else.
This line of thinking eventually led them to the concept of "attention". Over several months, they ran experiments in which they removed the 'recurrent' part of the neural network and replaced it with other mechanisms, including the attention mechanism.
Attention in machine learning is conceptually quite simple. It is essentially about looking at every word in the given text data to understand the relevant context.
By processing the input in this way, rather than looking at one word at a time, the model is able to gauge the context of the whole sentence, which in turn helps it to identify the correct meaning.
The attention mechanism assigns a score (attention score) to each word. This score represents the level of attention that each word should be given to understand the context conveyed by the input.
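A toy version of this scoring idea, sketched in NumPy (a simplified single-query illustration, not the full multi-head mechanism from the paper): every word gets a score against a query, the scores are turned into weights that sum to 1, and the output is a weighted mix of all the words at once.

```python
import numpy as np

def attention(query, keys, values):
    """Score every word against the query, softmax the scores into
    attention weights, and return a weighted mix of the value vectors."""
    scores = keys @ query / np.sqrt(query.size)   # one score per word
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                      # attention weights sum to 1
    return weights, weights @ values

rng = np.random.default_rng(1)
d, n = 4, 6                       # vector size, number of words
query = rng.normal(size=d)        # "what am I looking for?"
keys = rng.normal(size=(n, d))    # one key vector per word
values = rng.normal(size=(n, d))  # one value vector per word
weights, context = attention(query, keys, values)
print(weights)                    # high weight = word gets more attention
```

Because all n words are scored in one matrix operation, nothing has to be squeezed through a single fixed vector the way it does in an RNN.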
The work of Polosukhin and others would eventually attract Noam Shazeer, another well-known researcher at Google. The group grew to eight in number as they continued the development of a new architecture that they would call the transformer (an architecture that can 'transform' any input into any output).
Replacing the RNN with the transformer produced much better results. The team noticed how the model was able to correctly interpret the meaning of certain words in a given sentence based on the context of the whole sentence, something that RNNs had previously struggled with.
In writing up their findings, the team settled on the now-famous title for their paper: 'Attention is All You Need'. This was coined by Llion Jones, a Welsh programmer on the research team.
But even after the publication of the paper, Google was reluctant to build on top of the transformer and improve its existing products and services with it:
Some of [the researchers] tried to show Google the bigger possibilities of what the transformer was capable of. Not long after their paper was published, Shazeer started working with a colleague to use the technology on a new chatbot called Meena. They trained it on the public internet and eventually came to believe it would revolutionize how people searched the web and used computers. Meena was so sophisticated that it could improvise puns or banter with a human just as easily as it could hold a philosophical debate.
Shazeer and his colleagues were excited about what they had just created, and they tried sending details of the bot to external researchers, hoping to launch a public demo and improve the clunky Google Assistant that people had in their homes in the form of a speaker with something much more sophisticated. But Google executives stopped those efforts. They worried that the bot would make outlandish remarks that hurt Google's reputation or, more specifically, its $100 billion digital advertising business. According to a report in the Wall Street Journal, they thwarted every attempt by Shazeer to launch Meena to the public or build it into Google products.6
As Olson reveals here, Google's problem was that it seemed too focused on preserving its ad business. It considered this to be "sacrosanct":
You didn't mess with the algorithms that powered it unless you really had to.7
So although pockets of innovation were taking place within Google, whether those innovations were treated with sufficient priority was a different matter:
...when a group of researchers at Google made one of the most important discoveries in artificial intelligence in the past decade, the search company left it to languish internally. Their story in a nutshell showed how the monopolistic scale of Big Tech firms crimped their ability to invent, forcing them to react to other people's innovations by copying and buying them outright. But this particular negligence was worse for Google. In the end, OpenAI not only capitalized on Google's big invention, it used that invention to launch the first viable threat to the search giant in years.8
Despite what Thiel may argue, it seems that Google's monopoly status made it complacent and too risk-averse. It lacked the incentive to pursue loonshots:
The company's sheer size and fixation on growth meant that its researchers or engineers often had to wade through several layers of management to sign off on even small ideas. And with virtually no competition, since Google controlled about 90 percent of all the world's searches, there was no urgency to innovate.9
Shazeer would eventually leave, frustrated by Google's reluctance to build on the transformer. Google would, however, continue his work on Meena, leading to the creation of LaMDA (Language Model for Dialogue Applications).
But even when this research resulted in better models, Google kept a lid on it:
As exciting as these advancements were, Google needed to keep everything confined to its internal bubble - LaMDA was probably the world's most advanced chatbot, but only a few people inside Google could use it. Google was loath to release any new technology that could end up disrupting the success of its search business.10
The publication of 'Attention is All You Need', and Google's apprehension about transformer-based language models, presented OpenAI with the opportunity to trigger the next AI hype cycle. The lack of competition ended up making Google a loser.
Within OpenAI, the Google transformer research piqued the interest of Ilya Sutskever and Alec Radford. Both saw the potential of the transformer not only for building models that understood text, but also models that could generate text:
When the transformer came out, [Radford] saw it at first as a crushing blow from Google. Clearly the bigger company had more expertise in AI. But after a while, it looked like Google didn't have any big plans for its new invention, and Radford and Sutskever realized they could use the architecture to OpenAI's advantage. They would just have to put their own spin on it.11
The 'spin' that OpenAI would come up with was to remove the encoder from the original transformer architecture, leaving only the decoder. In addition, they trained the model on huge datasets of text, guided by Sutskever's belief that greater performance could be achieved by scaling everything up.
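A rough sketch of what the decoder-only design involves (illustrative NumPy only, not OpenAI's code): a causal mask stops each position from attending to later positions, so the model can only look backwards and must generate text left to right.

```python
import numpy as np

def masked_attention(Q, K, V):
    """Decoder-style self-attention: each position may only attend
    to itself and earlier positions (causal masking)."""
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)
    mask = np.tril(np.ones((n, n), dtype=bool))  # lower triangle = visible past
    scores = np.where(mask, scores, -np.inf)     # hide all future positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights, weights @ V

rng = np.random.default_rng(2)
n, d = 5, 4                        # 5 positions, 4-dimensional vectors
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
weights, out = masked_attention(Q, K, V)
print(weights[0])  # first word attends only to itself: [1. 0. 0. 0. 0.]
```

Training then amounts to asking the model to predict the next word at every position at once, which is what makes the "generate text" framing possible.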
This work eventually led to the creation of the company's first generative language model, which it called the 'Generative Pre-trained Transformer' (GPT):12
Thanks to the transformer, Radford was making more progress with his language model experiments in two weeks than over the previous two years. He and his colleagues started working on a new language model they called a "generatively pre-trained transformer" or GPT for short. They trained it on an online corpus of about seven thousand mostly self-published books found on the internet, many of them skewed toward romance and vampire fiction. Plenty of AI scientists had used this same dataset, too, known as BooksCorpus, and anyone could download it for free. Radford and his team believed they had all the right ingredients to ensure that this time, their model would also be able to infer context.13
Since the launch of OpenAI's ChatGPT, Google has been churning out language models of its own despite its slow start. And its dominance in search remains intact, even after Microsoft's efforts to incorporate LLMs into its Bing search engine using OpenAI's IP (which have not been all that successful).
But Microsoft is continuing to integrate LLMs across a wide suite of its products. Its AI-powered Copilot service now forms a major part of its software offering.
Google has also been making similar moves, though not all has gone so smoothly. For example, Google had to pull the image-generation capabilities of Gemini after it was accused of being too 'woke' with its outputs (which I wrote about previously).
Perhaps the biggest threat to Google is that LLMs could provide an alternative way to navigate the digital world, reducing the reliance on search engines. People may turn to products like ChatGPT as some sort of advanced digital assistant that delivers the information they need in a more convenient fashion.
This is, however, contingent on such LLMs proving sufficiently reliable, and remaining problems such as hallucination show that this is not yet the case. But if (and this is a big if) such problems can be resolved, what becomes of search engines?
If AI makes it possible to give search users direct answers to more of their questions, it could reduce the need to click on the links and adverts that Google displays. This problem helps explain why Pichai has been cautious in adopting big changes to its search engine, analysts and challengers say.
[...]
But the longer it takes to wholeheartedly adopt generative AI, the greater the risk that internet users will turn to rival chatbots or other AI-powered services.
Peter Thiel, Zero to One: Notes on Startups, or How to Build the Future (Penguin Random House 2014), p.31.
Peter Thiel, Zero to One: Notes on Startups, or How to Build the Future (Penguin Random House 2014), p.32.
Safi Bahcall, Loonshots: How to Nurture the Crazy Ideas That Win Wars, Cure Diseases, and Transform Industries (St Martin's Press 2019), p.13.
Safi Bahcall, Loonshots: How to Nurture the Crazy Ideas That Win Wars, Cure Diseases, and Transform Industries (St Martin's Press 2019), p.13.
'Chapter 1: Hello Transformers' in Lewis Tunstall et al, Natural Language Processing With Transformers: Building Language Applications with Hugging Face (O'Reilly Media 2022).
Parmy Olson, Supremacy: AI, ChatGPT and the Race That Will Change the World (St Martin's Press 2024) pp.144-145.
Parmy Olson, Supremacy: AI, ChatGPT and the Race That Will Change the World (St Martin's Press 2024), p.137.
Parmy Olson, Supremacy: AI, ChatGPT and the Race That Will Change the World (St Martin's Press 2024), p.137.
Parmy Olson, Supremacy: AI, ChatGPT and the Race That Will Change the World (St Martin's Press 2024), p.146.
Parmy Olson, Supremacy: AI, ChatGPT and the Race That Will Change the World (St Martin's Press 2024), pp.147-148.
Parmy Olson, Supremacy: AI, ChatGPT and the Race That Will Change the World (St Martin's Press 2024), p.153.
See also their paper titled 'Improving Language Understanding by Generative Pre-Training'.
Parmy Olson, Supremacy: AI, ChatGPT and the Race That Will Change the World (St Martin's Press 2024), pp.154-155.