Nvidia's AI accelerator H200./Courtesy of Nvidia

Supply of Nvidia's artificial intelligence (AI) accelerator H200 is failing to keep up with demand, pointing to a sharp supply crunch in the global AI infrastructure market. Orders for graphics processing units (GPUs) are expected to reach as many as 2 million units this year, but immediately available supply is only around 700,000. If GPUs are not delivered on time, data center construction schedules could be disrupted, prompting big tech firms to look to application-specific integrated circuits (ASICs) as alternatives.

On the 5th, market research firm TrendForce said that in this year's AI server market, demand for in-house ASICs from cloud service providers is expected to outpace the growth of GPU demand. ASIC demand from cloud firms is forecast to grow 44.6%, versus 16.1% for GPUs. This is read as a structural signal that GPU supply constraints are accelerating ASIC adoption.

The industry expects GPU supply chain risk to peak this year, even though GPUs offer versatility, high performance, and rapid AI infrastructure deployment. GPU production is intertwined across wafer processing, packaging, and high-bandwidth memory (HBM), so a bottleneck at any stage can choke overall supply. In particular, demand for AI semiconductors is concentrated on Nvidia and surging, but the combined production capacity of Nvidia and TSMC, the world's largest foundry, cannot meet even 50% of it.

TSMC is expanding advanced packaging process lines essential for producing AI accelerators, such as CoWoS (chip on wafer on substrate). However, because capacity expansion investment takes time, a gap is inevitable this year between the rapidly increasing order volume and the quantity that can actually be shipped.

Against this backdrop, attention is turning to ASICs, a trend that Google's tensor processing unit (TPU) set in motion last year. Because ASICs are designed for specific AI workloads, initial development costs are high, but over the long term they offer better power efficiency relative to performance and lower total cost of ownership (TCO). Google already handles a significant portion of its AI training and inference in-house on TPUs, and Amazon is reshaping its cloud cost structure with dedicated chips such as Trainium and Inferentia.

Market research firm Mordor Intelligence said in a semiconductor market report that the ASIC segment of the AI accelerator market is expected to post a compound annual growth rate (CAGR) of about 28% through 2030. Another market research firm, Credence Research, likewise projected that the generative AI ASIC market will grow from about $24.9 billion in 2024 to about $186.7 billion in 2032, an average annual growth rate of about 28.6%.
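The Credence Research figures are internally consistent; a quick calculation (a sketch using only the numbers cited above, not data from the firm itself) confirms that growing from $24.9 billion in 2024 to $186.7 billion in 2032 implies roughly a 28.6% compound annual growth rate:

```python
# Verify the implied CAGR from the market-size figures cited in the report.
start, end = 24.9, 186.7      # market size in billions of USD
years = 2032 - 2024           # 8-year span between the two estimates

# CAGR formula: (end / start) ** (1 / years) - 1
cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")  # roughly 28.6%, matching the forecast
```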

The industry sees this year as a critical turning point for growth in the ASIC market. An Amazon Web Services (AWS) official said, "This supply shortage is a short-term phenomenon, but it leaves a long-term mark on decision-making," adding, "From big tech's perspective, GPUs are no longer a stable 'basic good' but a strategic asset that can be shaken by external variables. Accordingly, scenarios that lower GPU dependence and raise the ASIC share in new data center investment plans are emerging as realistic alternatives."

※ This article has been translated by AI.