The State Of AI Infrastructure: Demand, Costs, And Custom Silicon

Spending On AI Infrastructure Has Exploded

Demand for accelerated compute has exploded in the three years since the launch of ChatGPT. Nvidia’s annual revenue has soared nearly 8-fold, from $27 billion in 2022 to $216 billion in 2025,[1] with consensus estimates up another 62% to $350 billion in 2026.[2] Global growth in data center systems investment (the compute, networking, and storage hardware) has accelerated from 5% at an annual rate in the ten years ended 2022 to 30% in the last three years, and is likely to increase more than 30% to $653 billion in 2026.[3]

Note: “CAGR” = Compound Annual Growth Rate. Source: ARK Investment Management LLC, 2026, based on data from Gartner 2026 and TheNextPlatform 2025. For informational purposes only and should not be considered investment advice or a recommendation to buy, sell, or hold any particular security.

ARK’s research suggests that accelerated computing, powered by graphics processing units (GPUs) and AI application-specific integrated circuits (ASICs) as opposed to general purpose central processing units (CPUs), now dominates server investment, representing 86% of compute server sales, as shown below.

Source: ARK Investment Management LLC, 2026, based on data from TheNextPlatform 2025 and company filings. In addition to those sources, certain information presented may be the result of ARK’s internal analyses, which draw on various additional sources of information.
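As a rough check on the growth figures quoted in this section, the arithmetic can be sketched in a few lines of Python. The revenue inputs are the rounded numbers from the text; the function names and the three-year window (2022 to 2025) are illustrative assumptions, not ARK’s methodology.

```python
def growth_multiple(start: float, end: float) -> float:
    """How many times larger `end` is than `start`."""
    return end / start


def cagr(start: float, end: float, years: float) -> float:
    """Compound annual growth rate between two values, as a fraction."""
    return (end / start) ** (1 / years) - 1


# Rounded figures quoted in the text, in billions of US dollars.
nvidia_2022 = 27.0
nvidia_2025 = 216.0
nvidia_2026_consensus = 350.0

multiple = growth_multiple(nvidia_2022, nvidia_2025)
annualized = cagr(nvidia_2022, nvidia_2025, years=3)  # assumed 3-year window
implied_2026_growth = growth_multiple(nvidia_2025, nvidia_2026_consensus) - 1

print(f"2022 to 2025 multiple: {multiple:.1f}x")
print(f"Annualized growth over 3 years: {annualized:.0%}")
print(f"Implied 2026 growth: {implied_2026_growth:.0%}")
```

With these rounded inputs, an 8-fold rise over three years is equivalent to revenue doubling each year, which puts the 30% annual growth in overall systems investment in perspective.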

Plummeting Costs Are Fueling Adoption

Two trends are supercharging spending on the accelerated compute infrastructure necessary to run AI models: the growing adoption of generative AI for both consumer and business use cases, and the demand to train ever-smarter foundation models in the quest for “superintelligence.”[4] Rapidly falling costs are turbocharging the demand. According to our research, AI training costs have been falling 75% per year.[5] Inference costs are falling faster: the median cost decline for models that score better than 50% on benchmarks tracked by Artificial Analysis has been 95% at an annual rate, as shown below.

Source: ARK Investment Management LLC, 2026, based on data from Artificial Analysis 2025.

Two forces have combined to drive steep cost declines: (1) generation-on-generation hardware improvements led by industry leaders like Nvidia, which are releasing new products every year, and (2) algorithmic improvements in the software layer that make training and inference on the same hardware more efficient.
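To see how quickly annual declines of this size compound, consider the following minimal sketch. It assumes a constant rate of decline each year, which is a simplification; the 75% and 95% rates are the figures quoted above, while the function name and the multi-year horizons are illustrative.

```python
def cost_after(initial_cost: float, annual_decline: float, years: int) -> float:
    """Cost remaining after `years` of a constant fractional decline per year."""
    return initial_cost * (1 - annual_decline) ** years


# Normalize today's cost to 1.0 and apply the annual declines from the text.
training_after_3y = cost_after(1.0, annual_decline=0.75, years=3)
inference_after_2y = cost_after(1.0, annual_decline=0.95, years=2)

print(f"Training cost after 3 years: {training_after_3y:.2%} of today's cost")
print(f"Inference cost after 2 years: {inference_after_2y:.2%} of today's cost")
```

Under that constant-rate assumption, a 75% annual decline leaves under 2% of the original cost after three years, and a 95% annual decline leaves a quarter of one percent after two.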

Both Consumers And Enterprises Are Sending Strong Demand Signals

Consumers have adopted AI significantly faster than they did the internet. AI penetration scaled to ~20% in three years, more than twice as fast as consumers gravitated to the internet, as shown below.

Source: ARK Investment Management LLC, 2026, based on data from SimilarWeb 2025, SensorTower 2025, and The World Bank 2025.

Enterprise demand is also growing at a torrid pace. As measured by OpenRouter and shown below, token demand has risen 28x since December 2024.

Source: ARK Investment Management LLC, 2026, based on data from OpenRouter 2026.

In the past two years, Anthropic, the AI lab most associated with enterprise demand, has scaled revenues an astonishing ~100-fold, from $100 million in ending run-rate revenue in 2023 to an estimated $8-$10 billion in 2025.[6] Anthropic’s meteoric rise has continued in 2026: in February, the company announced that it had reached $14 billion in run-rate revenue, along with a $30 billion fundraise at a $380 billion valuation.[7] OpenAI, competing on both consumer and enterprise fronts, has also seen strong adoption among business users, reaching 1 million business customers as of November 2025.[8] According to CFO Sarah Friar, OpenAI’s enterprise revenue is growing faster than its consumer revenue and is expected to reach 50% of company revenue in 2026.[9] Illustrating the justification for further infrastructure investment, Friar outlined in a January 2026 blog post that OpenAI’s revenue has scaled directly in line with its compute capacity over the last three years.[10]

Private Markets Are Financing The AI Buildout

Significant investment in infrastructure has been necessary to satisfy the strong demand signals. According to Crunchbase, in 2025 private AI labs raised more than $200 billion,[11] ~$80 billion of which went to foundation model builders like OpenAI, Anthropic, and xAI. In the public markets, hyperscaler[12] companies are eating into their cash hoards and seeking alternative forms of financing to fund their AI capital expenditure (CAPEX) plans, which could reach $700 billion in 2026.[13] Reportedly, Meta’s $30 billion deal with Blue Owl was the largest private capital transaction ever completed.[14] Structured as a joint venture and funded primarily with debt, the deal uses a special purpose vehicle (SPV) to keep the project’s debt off Meta’s balance sheet,[15] an arrangement that has attracted significant scrutiny.

AMD And Others Emerge As Credible Challengers To Nvidia

Beyond physical data centers, compute has dominated AI CAPEX. Nvidia has been at the vanguard of the accelerated compute age, but now the largest purchasers of AI chips are trying to increase their AI capability per dollar of investment. Since its acquisition of ATI Technologies in 2006, Advanced Micro Devices, Inc. (AMD) has been selling GPUs alongside Nvidia in the consumer market and now is an emerging competitor in the enterprise space. Since launching its EPYC line of processors in 2017, it has also gained share in the server CPU market, from nearly zero to 40% in 2025.[16]

AMD GPUs now are competitive with Nvidia on small model inference based on performance relative to total cost of ownership (TCO), as shown below. TCO incorporates both the upfront cost of a chip (CAPEX) and the cost of operating the chip over its useful life (OPEX). The performance benchmark is SemiAnalysis’s InferenceMax, measured in tokens per second processed per GPU when optimized for throughput. The cost benchmark is SemiAnalysis’s hourly CAPEX and OPEX estimates.

Source: ARK Investment Management LLC, 2026, based on data from SemiAnalysis as of December 2025. Past performance is not indicative of future results.

While AMD has “caught up” in small model performance, Nvidia maintains a significant lead in large model performance, as shown below.

Source: ARK Investment Management LLC, 2026, based on data from SemiAnalysis 2025.

Nvidia’s rack-scale Grace Blackwell solution (GB200 NVL72) networks 72 Blackwell GPUs to function as a single massive GPU with shared memory. This tight chip-to-chip network bolsters large model inference, which splits model weights across multiple GPUs and requires more inter-GPU communication than small model inference. In hopes of closing the gap ahead of Nvidia’s Vera Rubin release, AMD’s rack-scale solution is scheduled to hit the market in the second half of 2026. Thus far, AMD has scored wins at Microsoft, Meta, OpenAI, xAI, and Oracle.
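The performance-per-TCO comparison described in this section divides throughput by hourly cost. The sketch below shows that arithmetic under stated assumptions: the class name, field names, and every number are hypothetical placeholders chosen for illustration, not SemiAnalysis data or actual AMD or Nvidia figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuProfile:
    """Hypothetical per-GPU inputs for a performance-per-TCO comparison."""
    name: str
    tokens_per_second: float  # throughput per GPU when optimized for throughput
    capex_per_hour: float     # USD: purchase cost amortized over useful life
    opex_per_hour: float      # USD: power, hosting, and other operating costs

    @property
    def tco_per_hour(self) -> float:
        """Hourly total cost of ownership: CAPEX plus OPEX."""
        return self.capex_per_hour + self.opex_per_hour

    @property
    def tokens_per_dollar(self) -> float:
        """Tokens processed per dollar of TCO."""
        return self.tokens_per_second * 3600 / self.tco_per_hour


# Placeholder values only; they do not describe any real chip.
chip_a = GpuProfile("hypothetical chip A", 1000.0, capex_per_hour=1.50, opex_per_hour=0.50)
chip_b = GpuProfile("hypothetical chip B", 800.0, capex_per_hour=1.00, opex_per_hour=0.40)

for chip in sorted((chip_a, chip_b), key=lambda c: c.tokens_per_dollar, reverse=True):
    print(f"{chip.name}: {chip.tokens_per_dollar:,.0f} tokens per dollar of TCO")
```

In this made-up example the slower chip wins on tokens per dollar because its hourly cost is lower, which is the sense in which a challenger can be competitive on TCO without leading on raw throughput.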

Hyperscalers Have Launched The Custom Silicon Revolution

In addition to merchant GPU vendors, hyperscalers and AI labs are hoping to keep Nvidia in check by building in-house chips to lower their AI compute costs. For more than a decade,[17] Google has designed its own AI ASIC, the tensor processing unit (TPU), to run the recommender models for its search business and, recently, has optimized the performance of its latest generation, the TPU v7, for generative AI. SemiAnalysis estimates that Google could cut its cost per computation relative to Nvidia by 62% with its own TPU for internal workloads.[18] Suggesting that the 62% estimate might be close to the mark, Anthropic and Meta are using Google’s TPUs to expand their compute footprint.[19]

Amazon’s Trainium chips seem to be the next most advanced solutions. After its acquisition of Annapurna Labs in 2015, Amazon pioneered custom silicon for its cloud business and scaled its ARM-based Graviton CPU and Nitro Data Processing Unit (DPU) to power a significant portion of Amazon Web Services (AWS). Recently, it announced that in 2025, for the third consecutive year, Graviton powered more than half the CPU capacity added to AWS.[20] In addition to its use of TPUs, Anthropic uses AWS and Trainium as its preferred training platform.[21]

Meanwhile, Microsoft was late to the custom silicon movement: it announced its Maia 100 AI accelerator in 2023 without a generative AI focus, and is just now rolling out its second version, focused on AI inference.[22]

Broadcom Is Dominating Custom Silicon Enablement

While Google and Amazon have focused on front-end chip design (architecture and functionality), back-end design partners have translated their logic into silicon, managed advanced packaging, and coordinated manufacturing with a foundry partner like Taiwan Semiconductor Manufacturing Company (TSMC). TSMC has been the go-to foundry for most major AI silicon projects in the face of Intel’s foundry challenges, and Broadcom has become the leading back-end design partner, working on Google’s TPU, Meta’s MTIA, and OpenAI’s custom chip, expected in 2026. Even Apple, which famously handles the full front-to-back design process for its phone and PC chips on its own, might be working with Broadcom on its AI chips.[23] Citi suggests that Broadcom’s AI revenue could grow five-fold in the next two years, from $20 billion in 2025 to $100 billion in 2027.[24]

Amazon’s Trainium journey appears unique among peers: Amazon reportedly partnered with Marvell for Trainium2 and then, in response to Marvell’s poor execution, with Alchip for Trainium3 and Trainium4.[25] That Amazon could swap out back-end partners suggests that vertical integration is a risk for companies in Broadcom’s position. Notably, Apple and Tesla work directly with their foundry partners. Google might do so as well with its TPU v8, which has two SKUs, one co-designed by Broadcom and the other designed and controlled by Google with support from MediaTek.

Chip Startup Activity Is Heating Up

Our research suggests that a long tail of startups experimenting with new architectural paradigms could further challenge the incumbent base of chip vendors. Cerebras, famous for its wafer scale engine (a giant chip made of a single silicon wafer the size of a pizza box), offers the fastest token generation on the market, measured in tokens per second, and reportedly is looking to go public this year. The company recently announced Codex Spark, a high-speed coding model, in partnership with OpenAI on the back of a deal they struck in January.[26]

Groq, also with superior performance based on tokens per second, recently signed a $20 billion non-exclusive licensing deal for its intellectual property (IP) with Nvidia.[27] The deal covered 90% of Groq’s employees, including CEO and TPU co-founder Jonathan Ross. Effectively a buy-out of Groq’s team and technology, this deal structure is gaining popularity in mergers and acquisitions (M&A) as big tech tries to avoid delays associated with regulatory oversight.

Elsewhere on the acquisition front, Intel recently partnered with SambaNova after acquisition talks reportedly failed.[28] Infamously, Intel has failed to bring a widely successful AI product to market despite four acquisitions in the space dating back to 2014.

Looking Ahead: $1.4 Trillion By 2030

According to our research, demand and performance gains during the next five years will drive growth in AI software and cloud services,[29] tripling the spending on AI infrastructure from $500 billion in 2025[30] to nearly $1.5 trillion in 2030, as shown below.

Source: ARK Investment Management LLC, 2026, based on data from Gartner 2025, IDC 2025, TheNextPlatform 2025. Forecasts are inherently limited and cannot be relied upon.

We arrive at this forecast through our observations of data center systems investment relative to software revenue over time. Systems investment was ~50% of global software spending during the early 2010s as the cloud scaled. By 2021, over-investment and customer optimizations post-COVID dropped systems investment down to the low 20% range relative to software.[31] Our forecast of nearly $1.5 trillion assumes 2030 investment at 20% of our midpoint scenario for global software spend of $7 trillion in 2030, which we detailed in a blog last year.[32] We feel the 20% level adequately accounts for potential over-investment by 2030, or for the chance that software revenue uptake is slower than our midpoint scenario would suggest, in which case we believe infrastructure investment would continue at high rates relative to software, as it did in the early 2010s.

As AI-driven compute demands grow, we expect custom silicon to grow as a share of compute spend, as the time and money required to design workload-specific chips will yield performance-per-dollar advantages that matter more at scale. We believe custom ASICs could grow to over a third of the compute market by 2030, as shown below.

Source: ARK Investment Management LLC, 2026, based on data from TheNextPlatform 2025 and company filings.

In conclusion, our research suggests that the infrastructure buildout underway today is not a bubble waiting to burst, but rather the foundation of a once-in-a-generation platform shift. ARK’s forecast of nearly $1.5 trillion in annual AI infrastructure spending by 2030 reflects a market driven by genuine, accelerating demand from both consumers and enterprises, validated by falling costs that continuously unlock new use cases. We believe the companies that win the next five years will be those that can design the most efficient silicon, build the most capable models, and deploy both at scale. As Jensen Huang explained on Nvidia’s fourth-quarter fiscal year 2026 earnings call, useful AI agents have just begun to roll out in the last several months.[33] They are token hungry, but much more capable than what most AI users are used to. Scaling these agents to millions of businesses will be compute intensive, and, in our view, the resulting productivity gains will be well worth the investment.
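The forecast logic in this section (infrastructure investment as a fixed share of projected software spending) reduces to simple arithmetic, sketched below. The inputs are the rounded figures from the text; the variable names and the five-year compounding window are illustrative assumptions, and the sketch is not ARK’s model.

```python
# Rounded inputs quoted in the text, in US dollars.
software_spend_2030 = 7.0e12    # midpoint scenario for global software spend
infrastructure_share = 0.20     # systems investment as a share of software spend
infrastructure_2025 = 500.0e9   # AI infrastructure spending in 2025
years = 5                       # assumed window: 2025 to 2030

infrastructure_2030 = software_spend_2030 * infrastructure_share
multiple = infrastructure_2030 / infrastructure_2025
implied_cagr = multiple ** (1 / years) - 1

print(f"Implied 2030 infrastructure spend: ${infrastructure_2030 / 1e12:.1f} trillion")
print(f"Multiple on 2025 spend: {multiple:.1f}x")
print(f"Implied compound annual growth: {implied_cagr:.1%}")
```

With these rounded inputs the product comes to $1.4 trillion, matching the section heading and sitting just under the "nearly $1.5 trillion" quoted in the body, so the remaining difference presumably reflects rounding in the published inputs.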