The landscape of technology has experienced a profound transformation over the past decade. In 2011, Marc Andreessen famously declared that "software is eating the world."
In recent years, we have witnessed another transformative wave: generative AI. Some of the fastest-adopted software products in history now fall under the umbrella of generative AI.
To put this into perspective, it took Facebook roughly four and a half years to reach 100 million users, and it took cell phones a staggering 16 years to achieve the same milestone. In contrast, ChatGPT, a prime example of generative AI, achieved the same feat in just two months. This adoption rate underscores the meteoric rise of generative AI.
Foundations of Artificial Intelligence
But what exactly is generative AI, and why is it causing such a stir? Before diving deeper into this revolutionary technology, let's explore the foundational concepts that underpin AI.
At its core, artificial intelligence (AI) is not just about code and computations. It's a fusion of mathematics, data, and algorithms that endow machines with cognitive abilities. Much like human intelligence, AI enables systems to recognize patterns, draw inferences, make decisions, and even exhibit rudimentary emotional understanding. The ubiquity of AI in our modern world, from smart homes to business analytics, attests to its significance.
Machine Learning, Deep Learning and Neural Networks: The Cornerstone, Profound Depths, and Computational Backbone of AI
Machine learning (ML) stands as one of the most celebrated subsets of AI. At its heart, ML seeks to enable machines to learn from experience rather than relying on explicitly defined rules. This is akin to how a child learns to differentiate between cats and dogs, refining their understanding with each encounter. The fundamental shift from rigid programming to adaptable learning forms the essence of ML.
Within the vast ocean of ML, deep learning dives deep. It harnesses artificial neural networks, inspired by the interconnected neurons in the human brain, to explore datasets' nuances that often elude other methods. Deep learning has found immense success in tasks like image and speech recognition, where context and fine details are critical.
At the computational core of AI lies the neural network. Just as neurons in the human brain process and transmit information, a network's artificial nodes identify patterns, make decisions, and tackle tasks ranging from identifying shapes to translating languages.
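The shift from explicit rules to learning from examples can be illustrated with a single artificial neuron. The following is a minimal sketch, assuming a classic perceptron trained on the logical AND function; the function names, the integer learning rate, and the epoch count are illustrative choices and do not come from any particular library.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(examples, epochs=20, learning_rate=1):
    """Learn weights from (inputs, label) pairs using the perceptron rule."""
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for inputs, label in examples:
            # The error is 0 when the guess is right, +1 or -1 when it is wrong.
            error = label - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# No rule for AND is written anywhere; the neuron only sees labeled examples.
AND_EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_EXAMPLES)
predictions = [predict(weights, bias, inputs) for inputs, _ in AND_EXAMPLES]
print(predictions)  # [0, 0, 0, 1]
```

Deep learning stacks many such units into layers and trains them with gradient-based methods rather than this simple update rule, but the principle of adjusting weights in response to errors is the same.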
Generative AI: A Wave Moving at Hyperspeed, but Why Now?
Generative AI, often referred to as Gen AI, represents AI's creative frontier. From art and music to textual narratives and lifelike videos, Gen AI empowers machines to produce imaginative content. It's akin to giving a brush to a machine and marveling at the masterpieces it creates.
Gen AI's current momentum can be attributed to the same factors driving progress in AI as a whole: better models, more data, and more computing power. To understand the present moment better, let's look at recent history in generative AI development:
I. Wave 1: Small Models Reign Supreme (Pre-2015): Over five years ago, small models were considered cutting-edge for language understanding. While excelling in analytical tasks like delivery time prediction and fraud classification, these models fell short in generative capabilities. Producing human-level writing or code seemed like a distant goal.
II. Wave 2: The Race to Scale (2015 - 2021): A significant breakthrough came with Google Research's introduction of transformers, a new neural network architecture for natural language understanding. Transformers allowed the creation of high-quality language models that were more parallelizable and quicker to train. These models, known as few-shot learners, could be customized for specific domains with relative ease. The computing power used to train these models increased dramatically between 2015 and 2020, and the models achieved superhuman results in tasks like handwriting recognition, speech processing, and language understanding. Notably, OpenAI's GPT-3 marked a substantial leap in performance compared to GPT-2.
Despite these advances, these models faced limitations in accessibility, availability, and cost, making them less widespread.
III. Wave 3: Better, Faster, Cheaper (2021 - present): In this phase, compute costs decreased, and new techniques like diffusion models reduced the expenses associated with training and inference. The research community continued to refine algorithms and develop larger models. Developer access expanded from closed beta to open beta or open source, making these models more accessible.
With the platform layer stabilizing, models improving in quality and efficiency, and broader access for developers, the application layer can begin testing and optimizing against consumer behavior. As in earlier platform shifts, we anticipate the emergence of killer applications in generative AI, marking a new frontier of possibilities predicated on wider use of the technology around the world.
Large Language Models (LLMs): the Linguistic Powerhouse of Gen AI
Trained on vast textual datasets, LLMs not only understand language but also generate it. They go beyond syntactic correctness; they grasp context, sentiment, and nuances, akin to a seasoned writer or poet. But how do LLMs differ from neural networks in general?
LLMs, while a subset of neural networks, differ from neural networks at large in scope, functionality, architecture, and training paradigms:
I. Scope and Purpose: LLMs are specifically designed for comprehending and generating human language (e.g., GPT-4, PaLM 2), while neural networks serve various applications, from image recognition to speech processing.
II. Functionality: LLMs excel in tasks like text generation, translation, and sentiment analysis, whereas neural networks are versatile for a wide range of applications.
III. Architecture: LLMs primarily employ transformer architectures with self-attention mechanisms, while neural networks offer diverse architectures suited to different tasks.
IV. Training Paradigms: LLMs undergo extensive pre-training on massive text datasets and can be fine-tuned, whereas neural networks are typically trained specifically for assigned tasks, without always requiring broad pre-training.
Large language models have made astonishing progress, akin to an astounding acceleration in air travel speed. If we draw an analogy, it's as if the average flight speed went from 600 mph in 2018 to a staggering 900,000 mph in 2020, a 1,500-fold increase in just two years. Such a leap would reduce the London to New York travel time from eight hours to a mere 19 seconds.
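The figures in the flight analogy can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the numbers quoted above; the variable names are illustrative.

```python
BASE_SPEED_MPH = 600      # average flight speed in the analogy (2018)
SPEEDUP_FACTOR = 1_500    # fold increase claimed by the analogy
BASE_TRIP_HOURS = 8       # London to New York at the base speed

new_speed_mph = BASE_SPEED_MPH * SPEEDUP_FACTOR
new_trip_seconds = BASE_TRIP_HOURS * 3600 / SPEEDUP_FACTOR

print(new_speed_mph)       # 900000
print(new_trip_seconds)    # 19.2
```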
The monumental progress in LLMs is attributed in large part to their enormous parameter counts. For example, GPT-4 reportedly has about 1.7 trillion parameters, a figure OpenAI has not confirmed. That would make it roughly 1,000 times larger than GPT-2 and nearly 10 times larger than GPT-3, which had 1.5 billion and 175 billion parameters respectively.
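The size comparison can be reproduced directly from the parameter counts quoted in the text. This is a minimal sketch; the GPT-4 count is the figure cited above and should be treated as an estimate, not a published specification.

```python
GPT2_PARAMS = 1.5e9     # GPT-2, as quoted in the text
GPT3_PARAMS = 175e9     # GPT-3, as quoted in the text
GPT4_PARAMS = 1.7e12    # GPT-4 figure quoted in the text (an estimate)

ratio_vs_gpt2 = GPT4_PARAMS / GPT2_PARAMS
ratio_vs_gpt3 = GPT4_PARAMS / GPT3_PARAMS

print(round(ratio_vs_gpt2))      # 1133
print(round(ratio_vs_gpt3, 1))   # 9.7
```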
These rapid developments are made possible in part by Moore’s Law-style gains in computing power, a reminder that LLMs are not bound by the same physical constraints as an airplane.
To truly grasp LLMs, we need to delve into their inner mechanisms. LLMs are fundamentally language models that comprehend language by representing words as numerical vectors. These vectors capture relationships between words and are essential for understanding context and meaning.
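To make the idea of word vectors concrete, here is a minimal sketch that compares words by the angle between their vectors (cosine similarity). The three-dimensional vectors are invented for illustration; real models learn vectors with hundreds or thousands of dimensions from data.

```python
import math

# Hypothetical toy vectors, chosen by hand so that "cat" and "dog"
# point in similar directions and "car" points elsewhere.
WORD_VECTORS = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


cat_dog = cosine_similarity(WORD_VECTORS["cat"], WORD_VECTORS["dog"])
cat_car = cosine_similarity(WORD_VECTORS["cat"], WORD_VECTORS["car"])

# Related words score closer to 1 than unrelated words.
print(cat_dog > cat_car)  # True
```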
LLMs employ transformer models, composed of layers that progressively enhance their understanding of text. Initial layers focus on syntax and ambiguity resolution, while later layers build a more holistic comprehension. The most powerful LLMs, like GPT-4, have numerous layers with word vectors spanning thousands of dimensions.
As LLMs process text, they adapt word vectors at each layer to accumulate information and context. For instance, if a character named "Rex" is the main character in a story, the model may adjust the "Rex" vector in each layer to include information like "main character," "lives in New York," "from Arizona," and so forth.
LLMs transform words into numerical representations and adapt these representations at each layer to understand context and meaning. Their ability to decipher nuanced language concepts is evident in tasks like word disambiguation and theory of mind reasoning.
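The layer-by-layer adjustment of word vectors can be sketched with a stripped-down form of self-attention. This is an illustration under strong simplifying assumptions: the vectors are invented two-dimensional values, and each vector serves as its own query, key, and value, whereas a real transformer uses learned projections, multiple attention heads, and feed-forward sublayers.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_layer(vectors):
    """Replace each vector with an attention-weighted mix of all vectors."""
    dim = len(vectors[0])
    updated = []
    for query in vectors:
        # Scaled dot product between this token and every token in the text.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        mixed = [sum(w * value[i] for w, value in zip(weights, vectors))
                 for i in range(dim)]
        updated.append(mixed)
    return updated


# Hypothetical vectors for three tokens from the "Rex" example.
tokens = ["Rex", "lives", "New York"]
vectors = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]

layer_1 = attention_layer(vectors)
layer_2 = attention_layer(layer_1)

for token, before, after in zip(tokens, vectors, layer_1):
    print(token, before, after)
```

After one layer, the "Rex" vector is no longer purely `[1.0, 0.0]`: its second component is now positive because it has absorbed information from the "New York" vector, which is the mechanism the paragraph above describes in words.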
Beyond their mechanics, LLMs offer four potential new frameworks for applications:
I. Making Impossible Problems Possible: LLMs can tackle tasks that are challenging for humans by translating them into solvable language problems.
II. Making Frustrating Problems Easy and Convenient: LLMs simplify cumbersome tasks, automating complex interactions and decision-making processes.
III. Vertical AI Leading to Vertical SaaS: Specialized LLMs focus on specific industry problems, providing tailored solutions and improving data quality.
IV. Enhancing Productivity and Creativity: LLMs act as AI co-pilots, boosting human productivity and creativity across various fields, transforming industries and pricing models.
The Chatbot Arena serves as a premier benchmarking platform for LLMs. After 27,000 contests, OpenAI's GPT-4 emerged as the champion. Intriguingly, Anthropic, a company established by former OpenAI members, secured the second and third positions with its LLM named Claude. Vicuna-13B, an open-source model, ranked seventh, highlighting the competitive potential of open-source models.
Gen AI Applications Today
The most actionable applications today reinforce the idea that “AI is here to help, not to do.” Microsoft’s Copilot product illustrates this most clearly: the technology is meant to be additive, not distracting. The same holds for the use cases below:
I. Text Generation: Gen AI leads in advanced natural language generation, with models excelling in producing short-to-medium-form content. Anticipate higher quality and longer content generation with tailored industry applications.
II. Code Generation: Notable advancements boost developer productivity, exemplified by GitHub Copilot, while making code creation accessible to a wider audience.
III. Image Generation: Recent developments have made image generation popular, especially on platforms like Twitter, featuring diverse aesthetic styles and image editing techniques.
IV. Speech Synthesis: Proficient consumer and enterprise applications, including film and podcasts, set high standards for human-like speech. Current models serve as starting points for further refinement.
V. Video and 3D Models: Rapidly advancing in creative sectors such as cinema, gaming, virtual reality, architecture, and product design, with research organizations driving exciting developments.
VI. Expansive Domains: Beyond text, code, images, and speech, Gen AI extends to audio, music, biology, and chemistry, with ongoing research in proteins and molecules.
Gen AI Frontiers for Tomorrow
Gen AI frontier use cases lean toward the more utopian view that “AI is here to do.” These use cases tie more directly to the sensationalized headlines about job destruction and to research such as McKinsey’s “Generative AI and the future of work in America.”
I. Copywriting: Gen AI meets the rising demand for personalized web and email content to drive sales, marketing, and customer support efficiently.
II. Vertical-Specific Writing Assistants: Tailored solutions for industries like legal contract drafting or screenwriting by fine-tuning models and workflows.
III. Democratizing Coding: Potential to make coding accessible to consumers by using prompt-based programming languages alongside current productivity tools.
IV. Art Generation: Access to vast knowledge of art history and styles empowers users to explore various themes with ease.
V. Gaming: While aiming for natural language-driven game scenes, immediate applications include generating textures and skybox art for enhanced gaming experiences.
VI. Media/Advertising: Automation and real-time optimization for persuasive advertisements combining messages and visuals.
VII. Design Revolution: Gen AI rapidly creates high-fidelity designs from rough sketches, impacting industries from digital products to physical objects.
VIII. Social Media and Digital Communities: Generative tools foster new self-expression avenues, as seen in emerging applications like Midjourney, reshaping online interactions.
Market Mapping: Existing Infrastructure vs. Bespoke Architecture
Exhibit 3 below provides an overview of the AI landscape in 2023, highlighting key players and sectors. We see a broad range of AI-driven applications, spanning from everyday entertainment and productivity tools to specialized enterprise solutions.
A crucial component is the robust infrastructure that supports AI, encompassing everything from deployment platforms to data storage and labeling. Notably, certain organizations like OpenAI and AWS stand out as major contributors. It also underscores the importance of industry-specific AI solutions and the pivotal role of open-source platforms in advancing the field.
AI companies, targeting either consumers or enterprises, frequently weigh the benefits of using existing infrastructure. Available platforms offer a proven, efficient, and scalable foundation. However, in a saturated market, relying on common infrastructure prompts the question: how can a company's solution truly differentiate?
When assessing the value of applications built on existing infrastructures as an investor, it's vital to determine where the competitive advantage or "moat" lies. While building on existing platforms can speed up development and reduce upfront costs, it also raises questions about differentiation, long-term sustainability, and barriers to entry:
I. Differentiation and Value Proposition: If many applications are built on the same underlying infrastructure, how does a particular application stand out? The moat could be in the unique datasets they possess, the domain-specific expertise they offer, the seamless user experience they provide, or other value-added features not intrinsic to the base platform.
II. Scalability and Adaptability: Using existing infrastructure often means scalability is baked in. However, investors should assess how adaptable these applications are. Can they pivot or expand their offerings quickly, or are they limited by the constraints of the base platform?
III. Monetization and Revenue Models: While the commoditization of applications on shared platforms can be a concern, it's essential to assess their monetization strategy. Is their revenue model robust, diversified, and sustainable? Or are they heavily reliant on a single revenue stream that could be disrupted?
IV. Network Effects: Some applications can create a moat by leveraging network effects. As more users adopt the application, its value and utility grow, making it harder for new entrants to compete. For example, a chatbot built on GPT that gains a large user base can collect more data, refine its algorithms, and provide better services than newer competitors.
V. Customizations and Integrations: Even if an application is built on an existing platform, the degree of customization and integration with other systems can be a significant competitive advantage. An application that seamlessly integrates with a broader ecosystem of services can be more valuable than standalone solutions.
VI. Data Privacy and Ethical Considerations: As ML and AI applications come under scrutiny for ethical concerns, those that prioritize data privacy, unbiased algorithms, and transparent operations can create a moat through trust and reliability.
VII. Brand and Community: A strong brand identity and an engaged community can serve as barriers to entry. Applications that foster active user communities, encourage feedback, and continuously evolve based on user needs can create lasting loyalty and deter competitors.
While the underlying infrastructure of an application provides a foundation, the real value and competitive advantage often lie in what is built on that foundation. As an investor, it's crucial to look beyond the surface and assess the depth of innovation, the strength of the business model, and the potential for sustainable growth. While existing infrastructure offers a head start, the long-term success of an application hinges on its ability to carve out a unique space in the market and continuously evolve to meet user needs.
Business Strategies: The OpenAI Paradigm vs. Custom Models
The AI landscape features both industry giants like OpenAI and agile challengers from the open-source community. Businesses face a choice: harness established giants' advanced AI via APIs or create bespoke models for data privacy and tailored solutions.
This decision hinges on a company's operational maturity, resource availability, strategic priorities, and data sensitivity.
Building your own model offers several advantages. First, it allows for customization, ensuring that the model precisely fits your specific requirements by tailoring the architecture, data, and training process to your problem. Additionally, you retain complete intellectual property rights, enhancing ownership and control over the technology. For sensitive data, in-house development grants complete control over data access and storage, ensuring data privacy and compliance. Moreover, a proprietary model can become a Unique Selling Proposition (USP), especially if it offers distinct advantages over generic models.
However, there are drawbacks to consider when building your own model. It can be costly in terms of time, money, and resources, demanding a team of AI experts that may be challenging to recruit and retain. Developing a model from scratch can also be time-consuming, potentially delaying time-to-market. Furthermore, it may sometimes be redundant as many existing models have already solved similar problems. Additionally, training deep learning models requires substantial computational resources.
Alternatively, building on an established infrastructure presents its own set of advantages. Leveraging existing models allows for quicker development and deployment. Established models are typically well-tested and benchmarked, ensuring their reliability. Moreover, it can be cost-effective to use pre-trained models or platforms, saving time and resources. Popular platforms often come with a large community of users, offering support, tools, and extensions. They also provide continuous updates, improvements, and bug fixes, ensuring that the technology remains state-of-the-art.
However, there are some downsides to relying on established infrastructure. Pre-built models might lack the flexibility required to perfectly tailor them to specific needs. There's also a dependency risk; relying on third-party platforms can pose issues if the provider changes its terms or pricing, or discontinues the service. Data privacy concerns may arise when external platforms require data sharing. Lastly, while many platforms offer free or low-cost options initially, expenses can escalate significantly as usage scales up.
The choice between building a bespoke model versus leveraging established platforms depends on the specific requirements of the project, available resources, and strategic considerations.
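The cost side of this trade-off can be framed as a simple break-even calculation. The sketch below is purely illustrative: every price and cost in it is a hypothetical placeholder rather than a quote from any provider, and the model ignores quality, latency, and staffing risk.

```python
import math


def monthly_api_cost(thousands_of_requests, price_per_thousand):
    """Pay-as-you-go cost: grows linearly with usage."""
    return thousands_of_requests * price_per_thousand


def monthly_inhouse_cost(thousands_of_requests, fixed_cost, marginal_per_thousand):
    """In-house cost: a fixed base (team, hardware) plus a smaller usage cost."""
    return fixed_cost + thousands_of_requests * marginal_per_thousand


def break_even_volume(price_per_thousand, fixed_cost, marginal_per_thousand):
    """Monthly volume (in thousands of requests) where both options cost the same."""
    saving_per_thousand = price_per_thousand - marginal_per_thousand
    if saving_per_thousand <= 0:
        return None  # the API is never more expensive per request
    return math.ceil(fixed_cost / saving_per_thousand)


# Hypothetical figures, in dollars, chosen only to show the mechanics.
API_PRICE_PER_THOUSAND = 10
INHOUSE_FIXED_COST = 50_000
INHOUSE_MARGINAL_PER_THOUSAND = 2

volume = break_even_volume(API_PRICE_PER_THOUSAND,
                           INHOUSE_FIXED_COST,
                           INHOUSE_MARGINAL_PER_THOUSAND)
print(volume)  # 6250 (thousand requests per month)
```

Below the break-even volume the pay-as-you-go option is cheaper; above it, the fixed investment starts to pay off, which is the sense in which API expenses can escalate as usage scales.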
Comparing Hype Cycles
"The reasonable man adapts himself to the world: the unreasonable one persists in trying to adapt the world to himself. Therefore, all progress depends on the unreasonable man." - George Bernard Shaw, Man and Superman
Throughout the 18th, 19th, 20th, and 21st centuries, technology revolutions have followed predictable boom-and-bust cycles. In "Technological Revolutions and Financial Capital: The Dynamics of Bubbles and Golden Ages," Carlota Perez highlights the phases of a cycle and the basic, nearly identical chronology across technology revolutions dating back to the 18th century with the Industrial Revolution in Britain.
Each period has the same phases (despite different innovations), and each "bust" downplays the tangible value of the period's formative innovation. It is a reminder that human psychology and behavior drive irrational thinking. We distinguish a "technology revolution," which spans roughly 50 years, from the hype cycles of mini revolutions within a technological era. The distinction matters because many in technology view the present as the beginning of the "Age of AI."
Technology Revolution Cycles: From Start to Finish
Technology revolutions follow cyclical and predictable boom-and-bust cycles.
Many of these revolutions start from the same pillars that are mixed in their own unique cocktail for each new revolution:
I. Technological Innovation: All progress and growth start from the idea and creation of new technologies and ideas that enhance the overall quality of life and boost productivity.
II. Economic Cycles: Economies constantly go through cycles of expansion, and the last 20 years of persistently low interest rates were no exception.
III. Financial Cycles: Low interest rates provide abundant access to capital across risk assets, specifically high-growth sectors; the last 20 years were characterized by Marc Andreessen's slogan, "Software is eating the world," as noted in our introduction.
IV. Social & Institutional Changes: Society quickly assimilates to new cultural norms, including people's preferences, regulations, and societal structures.
V. Creative Destruction: Older industries and practices become less relevant as the new technology proliferates.
Generative AI differs radically from the smaller revolutions within the "Information Era," such as mobile; it marks a far broader shift across society, businesses, and longer-term economic growth.
These characteristics are evident when looking back at the Age of Information and Telecommunications, and despite AI's adolescence, we are already seeing many of these traits in this new age. In fact, we asked ChatGPT (GPT-4) to provide its response for each pillar below:
These pillars are replicated across every previous technological revolution, from the Industrial Revolution (1771 – 1829) to the Age of Information (1971 - 2021). Once a cycle starts, they all go through similar troughs and peaks.
Any new technological revolution begins with a "big bang" moment that kicks off the first major phase: the installation phase. The installation phase covers emergence and early adoption, characterized by high levels of excitement, speculation, and a rapid inflow of financial capital.
This phase starts a flywheel that spins ever faster until it ultimately malfunctions and bursts.
The installation phase typically builds until it reaches "the turning point," frequently a severe market crash. While there are more minor crashes along the way, these salient events typically stand out in history, like the collapse of the railway mania in the 1840s or of the dot-com boom of the late 1990s. These crashes usually occur roughly halfway through the cycle, which historically has meant 20-25 years in.
While, in hindsight, these crashes seem like screaming buys and opportunistic trades, the consensus is frequently clouded. Following the crash in 2002, the consensus view was that the internet era was over.
Amazon's stock in December 1999 was $106.69; by April 2, 2001, the stock was down 92% to $8.37 per share. According to Brad Stone in The Everything Store:
"Early in 2000… Ruth Porat, co-head of Morgan Stanley's global-technology group, advised him (Warren Jenson, Amazon's new CFO) to tap into the European market, and so in February, Amazon sold $672 million in convertible bonds to overseas investors. This time, with the stock market fluctuating and the global economy tipping into recession, the process wasn't as easy as the previous fundraising had been. Amazon was forced to offer a far more generous 6.9 percent interest rate and flexible conversion terms."
The consensus across Wall Street and the public markets was clouded and mixed for Amazon, and its business model was questioned for years following the crash, even though Amazon is widely viewed today as one of the most valuable and best-run companies in the world.
Following the crash, the technological era enters a deployment phase, fully integrating with society and the economy. This is the period when consumers realize the full potential and synergies of the technology throughout society. For the information age, this meant the widespread adoption of the internet in the 2000s and the arrival of ancillary technologies like mobile and the cloud. These fundamental shifts become clear by just highlighting the most valuable companies in any given era.
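The scale of Amazon's decline, and of the rebound needed to undo it, follows from the two share prices quoted above. This is a minimal arithmetic sketch; the variable names are illustrative.

```python
PEAK_PRICE = 106.69     # December 1999, as quoted in the text
TROUGH_PRICE = 8.37     # April 2, 2001, as quoted in the text

# Fraction of value lost from peak to trough (roughly 92%).
drawdown = (PEAK_PRICE - TROUGH_PRICE) / PEAK_PRICE

# Gain required from the trough just to get back to the peak.
gain_to_recover = PEAK_PRICE / TROUGH_PRICE - 1

print(f"Drawdown: {drawdown:.1%}")
print(f"Gain needed to recover: {gain_to_recover:.0%}")
```

The asymmetry is the point: a loss of about 92% requires a gain of more than eleven times the trough price to recover, which helps explain why consensus stayed skeptical for years.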
Gartner Hype Cycle: Microcosms Within a Larger Cycle
The Gartner Hype Cycle was introduced by Gartner in 1995 and is credited to Jackie Fenn and Mark Raskino. Unlike the technological revolution cycles highlighted earlier, it aims to trace a predictable pattern of stages for new technologies that have shown promise in any given year.
Each hype cycle drills down into five key phases consisting of 1) innovation trigger, 2) peak of inflated expectations, 3) trough of disillusionment, 4) slope of enlightenment, and 5) plateau of productivity.
In its 2023 hype cycle report, Gartner placed generative AI at the “peak of inflated expectations” for the first time.
According to Gartner’s research leader Arun Chandrasekaran,
“The main reason AI has hit the peak of the hype cycle is the sheer number of products claiming to have generative AI baked into them - a real proliferation in generative AI models, both in the closed-source ecosystem as well as in the open-source ecosystem.”
This viewpoint is reinforced by company commentary across not just technology companies but all major sectors. In Q1 2023, 75% of commercial services companies mentioned AI on their earnings calls. AI was also mentioned across Information Technology (66%), Industrials (23%), Consumer Discretionary (22%), Financials (19%), Healthcare, Real Estate, Materials, and Consumer Staples. The only sectors that did not cite AI were Energy and Utilities.
Companies across industries face outsized pressure to showcase a generative AI strategy. However, many management teams have said that the monetary results of these strategies will not be material in the near term.
When sell-side analysts asked Yamini Rangan, CEO of HubSpot, in Q2 2023 how generative AI will speed up internal development of enterprise-grade features, she responded,
"Now, Gen AI certainly has the potential to accelerate productivity, not just in front-office functions, but also for engineering and certainly something that we are looking at, but it's pretty early days in terms of driving the pace of innovation even further."
While other companies, like C3.ai and Palantir Technologies, have taken a different tack by hyping their "AI" product suites, these businesses have not shown any additional revenue from new AI product lines, and their full-year guidance has not been revised up despite "insatiable demand" from their customers. The reality is that AI "product" revenue (vs. infrastructure) is still in its adolescence.
Tying Hype Cycles to Venture Capital Investment
In tracing the evolution of venture capital investment, it becomes evident that the technology industry has been subject to a series of hype cycles, each characterized by waves of euphoria and substantial financial inflows. These cycles, in their distinctive narratives, have significantly impacted the entrepreneurial landscape and venture capital deployment over the years:
I. Biotech and Genomics (Late 1990s-Early 2000s): The late 1990s witnessed a remarkable confluence of scientific breakthroughs and financial fervor. Sequencing the human genome and the promise of innovative drug development spurred massive venture capital investments. However, the initial optimism collided with the complexities of scientific research and regulatory challenges, resulting in a sobering correction in valuations.
II. CleanTech (Mid-2000s - Late 2000s): The mid-2000s marked a pivotal moment in the venture capital landscape, driven by a heightened environmental consciousness. Leading venture capitalists such as John Doerr of Kleiner Perkins ventured into CleanTech, directing substantial resources toward clean energy and sustainability startups. Yet, as these startups grappled with the formidable challenges of scaling and technological maturity, the euphoria subsided, and the CleanTech sector faced a period of introspection.
III. Social Media (Mid-2000s - Early 2010s): The advent of social media platforms during this era redefined the way people interacted and communicated. Facebook, Twitter, LinkedIn, and others became cultural phenomena, with investors pouring funds into these platforms. This period witnessed exponential growth, global user adoption, and unprecedented valuations, fundamentally altering the digital landscape.
IV. Mobile Apps (Late 2000s - Early 2010s): The late 2000s saw the rise of the smartphone, transforming how individuals accessed information and services. This shift catalyzed a gold rush in mobile app development, where startups raced to provide innovative solutions. The winners in this race not only revolutionized user experiences but also set the stage for a mobile-dominated ecosystem that persists today.
V. Big Data and Analytics (Mid-2010s): Amidst the proliferation of data, investors rallied behind the idea that "software is eating the world." The mid-2010s witnessed a proliferation of SaaS companies harnessing big data and analytics to offer novel insights. These companies played a pivotal role in reshaping industries and ushering in an era of data-driven decision-making.
VI. Blockchain and Cryptocurrency (Late 2010s): The late 2010s were marked by a paradigm shift in financial technology. The belief that cryptocurrency could disrupt traditional financial systems catalyzed a wave of enthusiasm and innovation. Blockchain technology emerged as a cornerstone of decentralized applications, challenging centralization. This daring thesis led to a proliferation of startups and projects, accompanied by intense speculation and regulatory scrutiny.
While there were early winners in prior hype/compute cycles, a technology revolution takes time to produce its "killer" applications. The result is a graveyard of earlier applications and companies that failed to gain adoption or were ill-equipped to adapt to rapid advancements and deployments of the technology.
Over the course of previous hype cycles, total cumulative valuation grew across a 6-7-year period by an average CAGR of 161%. The result is an endless wave of new startups each year with various developments that best reflect the full capabilities of the compute technology.
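To see what a 161% CAGR implies over such a period, the growth rate can be compounded year by year. This is a minimal sketch using the standard compound-growth formula; the function names are illustrative, and the only inputs taken from the text are the 161% rate and the 6-7-year span.

```python
def cumulative_multiple(cagr, years):
    """Total growth multiple after compounding `cagr` for `years` years."""
    return (1 + cagr) ** years


def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate implied by a start and end value."""
    return (end_value / start_value) ** (1 / years) - 1


CAGR = 1.61  # 161% per year, as quoted in the text

multiple_6y = cumulative_multiple(CAGR, 6)
multiple_7y = cumulative_multiple(CAGR, 7)

print(f"6 years: about {multiple_6y:.0f}x")
print(f"7 years: about {multiple_7y:.0f}x")
```

In other words, a 161% CAGR sustained for six to seven years corresponds to cumulative valuation growing by several hundred times, which is why these periods attract wave after wave of new startups.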
The smartphone/mobile computing cycle started with the Apple iPhone release in 2007. However, the first wave of category-defining applications, like Uber (2009), Instagram (2010), and WhatsApp (2009), was founded 2-3 years later, once the consumerization of the cycle had taken hold. This consumerization was directly correlated to consumers' broad usage of the technology.
While ChatGPT has been well publicized and hyped, according to Pew Research, only 18% of Americans have used it despite 54% of the population having heard of it. The usage data among that 18% also indicates that utility has yet to fully materialize, with monthly traffic and unique visitors showing declines in June 2023.
While the product growth has been impressive, it also indicates that broad usage and applicability on the product side still need time to evolve and fully develop.
Current Fundraising Environment: Public vs. Private
Like previous hype cycles, venture capital has been looking to deploy capital rapidly, despite a severe, systemic pullback across the venture capital asset class. Despite rising interest rates, frozen capital markets, and declines of over 85% in total invested capital in other technology verticals, exuberance has not abated in these early stages of AI development.
On the public market side, the only exposure for investors is through the infrastructure layer: cloud service providers (Google, AWS, Azure), hardware (Nvidia, Intel, AMD), and open-source frameworks (LLaMA / Meta).
Given the clear distinction we have already drawn between the infrastructure and application layers, we delineate between the two in our funding analysis.
Private Market Investment Trends
We have analyzed numerous research reports and fundraising highlights regarding “generative AI,” and we believe many of these reports contain more noise than real insight that can bring clarity to investment within the space.
Exhibit 11 represents the generative AI market landscape summary according to Pitchbook. The information is scattered when not filtered appropriately. We looked to define and divide the venture universe into three distinct company categories: infrastructure, applications, and legacy “data-first” startups that are well positioned for this new era of GenAI.
I. Infrastructure: “Picks & Shovels”
As we alluded to in earlier sections, the infrastructure layer is critical for inference and training today. This includes sub-verticals ranging from full-stack LLMs (OpenAI, Anthropic, Cohere, and Inflection) to training, store & compute, and deploy & monitoring.
When analyzing this cohort specifically, you will see in Exhibit 12 that the lion's share of investment is concentrated here.
This infrastructure layer will continue to develop as various pain points and use cases materialize. Not unlike cloud infrastructure / service providers, this is extraordinarily capital intensive and requires significant upfront capital from any new entrant and has strong first-mover advantages. This was evident in the “cloud compute” era with AWS versus Azure, Google Cloud, and Oracle’s cloud solutions.
Unsurprisingly, this is why Google, Microsoft, Snowflake, and Databricks are quickly building out an AI strategy to best situate themselves as a clear-cut infrastructure layer for this next wave of computing.
II. Consumer Applications & Legacy “data-first” Applications
Once you remove the infrastructure investment from the equation, the investments on the consumer side are more modest on a relative basis, but still represent a Cambrian explosion beginning in 2021. From 2021 to 2023 YTD, $11.8 billion was deployed into the generative AI application layer, representing 75% of total invested capital since 2012.
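As a quick check on what that share implies, the sketch below backs out the total capital invested in the application layer since 2012 from the two figures quoted above. The variable names are illustrative, and the results are only as precise as the rounded inputs.

```python
# Figures quoted in the text, in $ billions.
invested_2021_to_2023_ytd_b = 11.8
share_of_total_since_2012 = 0.75

# Total since 2012 implied by the quoted share, and the remainder
# that would have been invested before 2021.
implied_total_since_2012_b = invested_2021_to_2023_ytd_b / share_of_total_since_2012
implied_pre_2021_b = implied_total_since_2012_b - invested_2021_to_2023_ytd_b

print(f"Implied total since 2012: ${implied_total_since_2012_b:.1f}B")
print(f"Implied 2012-2020 investment: ${implied_pre_2021_b:.1f}B")
```

On these inputs, roughly $16 billion has gone into the application layer since 2012, with only about $4 billion of it arriving in the nine years before 2021.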
The other cohort worth mentioning is “data-first” private companies. This includes data warehouse companies like Databricks (which acquired MosaicML in June 2023) but also businesses like Gong (sales & customer support), Ironclad (general & administrative support), and Grammarly (general & administrative support). Many of these businesses structured their business model as a true “man and machine symbiosis” solution, creating strong data feedback loops to best assist and solve problems leveraging machine learning. With new infrastructure to rapidly improve these loops, these companies effectively become generative AI companies in themselves.
This shows that many established companies now recognized as generative AI companies have been building their technologies for the last decade. Their journey to success has been a gradual one, mirroring the evolution of generative AI technology over time.
As a prime illustration, Lightricks, the creative force behind the renowned Facetune app, commenced its journey in 2013. Tabnine, specializing in code completion, laid its foundation in 2012, while Replika, an AI chatbot tailored for personal conversations, saw its inception in 2013.
Fast forward to 2017, and the AI landscape witnessed the ascent of enterprises like Woebot Health, dedicated to mental health chatbots; sales-focused firms such as Cresta; and providers of deep learning software tools, such as Weights & Biases. Remarkably, each of these companies has succeeded in amassing substantial investments in the subsequent years.
III. Private VC-Backed Company Case Studies
Given our general perspective that the consumer application layer is still immature, and that today’s winners have a high probability of not being tomorrow’s winners, we focus on highlighting critical AI infrastructure businesses today.
Other Full-Stack Large Language Models (LLMs)
There are a number of competing full-stack LLM companies that have sprung up behind OpenAI. Like OpenAI, these companies have received sizable investments from current technology incumbents like Microsoft, Google, Nvidia, Oracle, and SAP.
These founding teams also include many of the early employees from Google’s DeepMind / Google Brain research teams (Anthropic, Cohere, Inflection) and OpenAI’s development team. The situation is again analogous to the “mafias” of previous technological revolutions, like Fairchild’s “Traitorous Eight” in 1957 and the "PayPal mafia" born in 1999.
Other Key Infrastructure Stack Private Companies
Public Market Generative AI
While private market investors have the capability to invest aggressively in an endless array of new startups branding themselves as generative AI, public markets have had limited inventory to aggressively bet on over the course of the last 24 months.
The major beneficiary has been Nvidia, which is the leader in the semiconductor / chip hardware that powers LLMs. The other businesses that have meaningfully benefited are Microsoft, Google, Amazon, and Meta. Given the capital intensity of generative AI and the level of engineering talent needed, these incumbents have been able to articulate the clearest monetization strategy to Wall Street in the near term.
Nvidia has cited roughly $1 trillion of data center spend, of which the company captures only a mid-single-digit percentage in revenue today. Management has stated that it is in a position to capture 45% of that total serviceable market ($300B Chips & Systems Serviceable Addressable Market + $150B AI Software).
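The 45% figure follows directly from the two serviceable-market components. A minimal sketch of the arithmetic, using only the numbers quoted above (variable names are illustrative, and "about $1 trillion" is treated as exactly $1,000B for simplicity):

```python
# Figures quoted in the text, in $ billions.
total_data_center_spend_b = 1_000.0  # "about $1 trillion" (rounded assumption)
chips_and_systems_sam_b = 300.0      # Chips & Systems serviceable addressable market
ai_software_sam_b = 150.0            # AI Software

# Combined serviceable market and its share of total data center spend.
serviceable_market_b = chips_and_systems_sam_b + ai_software_sam_b
serviceable_share = serviceable_market_b / total_data_center_spend_b

print(f"Serviceable market: ${serviceable_market_b:.0f}B "
      f"({serviceable_share:.0%} of data center spend)")
```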
This massive capital cycle has already become apparent in the company’s earnings results for the first and second quarters of 2023, with the company disclosing data center revenue of $10.3B in Q2, representing a 171% increase year-over-year (YoY), and beating consensus estimates by over $2 billion ($10.3B vs. $8B consensus).
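The figures above can be unpacked with two lines of arithmetic: the prior-year base implied by a 171% increase, and the size of the beat against consensus. This is a sketch under the rounded numbers quoted in the text; the helper name is illustrative, and the $8B consensus is taken as stated.

```python
def prior_period_value(current: float, yoy_growth: float) -> float:
    """Back out the prior-period value from a current value and YoY growth.

    yoy_growth is a decimal, so a 171% increase is 1.71.
    """
    return current / (1.0 + yoy_growth)


# Figures quoted in the text, in $ billions.
q2_data_center_revenue_b = 10.3
yoy_growth = 1.71
consensus_b = 8.0

implied_prior_year_b = prior_period_value(q2_data_center_revenue_b, yoy_growth)
beat_b = q2_data_center_revenue_b - consensus_b
beat_pct = beat_b / consensus_b

print(f"Implied prior-year quarter: ${implied_prior_year_b:.1f}B")
print(f"Beat vs. consensus: ${beat_b:.1f}B ({beat_pct:.0%})")
```

On these inputs, the prior-year quarter works out to roughly $3.8B, and the beat to about $2.3B, or close to 29% above consensus.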
However, the company's stock has largely been flat since this earnings report, as the next wave of the capital cycle remains an overhanging question among investors. A rapidly ramping CAPEX investment cycle - as evidenced in the 1H of 2023 by Nvidia's data center customers - can produce a relatively short period of elevated growth followed by an extended “growth hangover” unless the capital cycle continues to accelerate. Put more succinctly, compounding at the level the company experienced in the 1H of 2023 becomes harder and harder to replicate as scale increases, leading to a higher risk of decaying growth rates.
The result is that public markets have been sidelined until the data has crystallized for forward looking projections.
The combination of high stakes, consumer exuberance, and broad accessibility of the technology has driven a Goldilocks effect for a new technology era. This perfect storm has led to a race to the bottom, with the technology becoming ubiquitous faster than in previous technology cycles.
In previous cycles, once the technology became commoditized, it shifted from being a differentiator for new entrants to becoming table stakes over time. During the mobile revolution, building out an application on the app store allowed you to differentiate your product; now a mobile application is required just to remain competitive (“table stakes”).
This idea of racing for technological differentiation is best encapsulated by Reid Hoffman's theory of “Blitzscaling.” The core idea is to prioritize speed and growth to maintain a market advantage in the modern, tech-driven economy.
The commoditization speed for this cycle – based on the broad technological availability of generative AI already – challenges this very theory. Incumbents have shown they can offer and aggressively market the same AI technology to their existing user base, posing clear-cut challenges to many “AI” startups that have yet to establish a user base and/or qualified customers. Distribution is costly and takes time, creating - what feels like - daunting business plans for many companies. This was seen in cloud applications with Slack vs. Microsoft Teams and is materializing already in this cycle with Midjourney vs. Adobe Firefly. Midjourney was launched in July 2022, and Adobe launched a directly competing product in March 2023.
While we do not deny that the era of AI has begun, the current landscape suggests that utilization has not yet reached the level needed to dictate application use cases, leaving a world of unproven exuberance.
Christopher Walsh, Philip Koch, and Chat GPT-4
Mullany, M. (n.d.). 8 Lessons from 20 Years of Hype Cycles. LinkedIn. https://www.linkedin.com/pulse/8-lessons-from-20-years-hype-cycles-michael-mullany/
Dixon, C. (2019, January 8). Strong and Weak Technologies. cdixon.org. https://cdixon.org/2019/01/08/strong-and-weak-technologies
Evans, B. (n.d.). Presentation Slides 80-92. ben-evans.com. https://www.ben-evans.com/presentations
Woodbury, R. (2023). The Mobile Revolution vs. the AI Revolution. Digital Native. https://www.digitalnative.tech/p/the-mobile-revolution-vs-the-ai-revolution
Eugenie Park, Risa Gelles-Watnick. 2023, August 28. Most Americans Haven't Used ChatGPT; Few Think It Will Have a Major Impact on Their Job. Pew Research Center. https://www.pewresearch.org/short-reads/2023/08/28/most-americans-havent-used-chatgpt-few-think-it-will-have-a-major-impact-on-theirjob/#:~:text=Who%20has%20used%20ChatGPT%3F,adults%20to%20have%20used%20ChatGPT
Ashish Vaswani and Noam Shazeer. 2017. Attention is All You Need. https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
Michael Mignano. 2023. Startups vs. Incumbents: The Battle for AI's Application Layer. Lightspeed Venture Partners. https://lsvp.com/startups-v-incumbents-the-battle-for-ais-application-layer/
Gartner. 2023. What’s New in the 2023 Gartner Hype Cycle for Emerging Technologies. Gartner. https://www.gartner.com/en/articles/what-s-new-in-the-2023-gartner-hype-cycle-for-emerging-technologies
Goldman, S. 2023. Gartner Hype Cycle places generative AI on the ‘Peak of Inflated Expectations’. VentureBeat. https://venturebeat.com/ai/gartner-hype-cycle-places-generative-ai-on-the-peak-of-inflated-expectations/
Butters, J. 2023, May 26. Highest Number of S&P 500 Companies Citing “AI” on Q1 Earnings Calls in over 10 Years. FactSet. https://insight.factset.com/highest-number-of-sp-500-companies-citing-ai-on-q1-earnings-calls-in-over-10-years
Nvidia. Glossary Term: Data Science - Generative AI. NVIDIA. https://www.nvidia.com/en-us/glossary/data-science/generative-ai/
Sonya Huang, Pat Grady. 2022. Generative AI: A Creative New World. Sequoia Capital. https://www.sequoiacap.com/article/generative-ai-a-creative-new-world/
Sonya Huang, Pat Grady. 2023. Generative AI: Act Two. Sequoia Capital. https://www.sequoiacap.com/article/generative-ai-act-two/
Goutham, G. 2023. The Landscape of Generative AI. Medium. https://ramsrigoutham.medium.com/the-landscape-of-generative-ai-landscape-reports-615a417b15d
Woodbury, R. 2023. Generative AI Landscape. Digital Transformation in Infrastructure. https://dgtlinfra.com/generative-ai-landscape/
Martin Casado, Sarah Wayne. 2023. The Economic Case for Generative AI and Foundation Models. a16z. https://a16z.com/the-economic-case-for-generative-ai-and-foundation-models/
2023. The Evolution of Machine Learning Infrastructure. Sequoia Capital. https://www.sequoiacap.com/article/the-evolution-of-machine-learning-infrastructure/
Buhler, K. 2023. Title of the Article. Sequoia Capital. https://www.sequoiacap.com/article/ai-50-2023/
Matt Bornstein, Rajko Radovanovic. 2023. Emerging Architectures for LLM Applications. a16z. https://a16z.com/emerging-architectures-for-llm-applications/