Google is challenging Nvidia's dominance by expanding the supply of its own chips specialized for AI inference.
Google Cloud said on the 6th (local time) that it will officially launch the 7th-generation tensor processing unit (TPU), "Ironwood," unveiled in April, within a few weeks. Ironwood delivers four times the performance of last year's 6th-generation "Trillium" and more than ten times that of the 5th-generation TPU v5p released in 2023.
The TPU, which Google designed and built in-house, is a chip optimized for deep learning. Google said up to 9,216 Ironwood chips can be connected in a single system, which helps eliminate data-processing bottlenecks.
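Google has not published code for Ironwood pods, but the idea of one program spanning many chips can be sketched with JAX, Google's own TPU-facing framework. The sketch below is illustrative only: the array sizes and axis name are arbitrary, and the same code runs on however many TPU chips happen to be attached, whether 8 on a small TPU VM or thousands on a full pod.

```python
# Illustrative sketch: spreading one computation across every TPU chip
# JAX can see. Sizes and the axis name are arbitrary, not Ironwood specifics.
import functools

import jax
import jax.numpy as jnp

n = jax.device_count()         # number of attached accelerator chips
x = jnp.ones((n, 1024, 1024))  # one 1024x1024 shard per chip

@functools.partial(jax.pmap, axis_name="chips")
def shard_sum(block):
    partial = jnp.sum(block)                         # local work on one chip
    return jax.lax.psum(partial, axis_name="chips")  # all-reduce over the interconnect

out = shard_sum(x)  # the same global total, replicated on every device
print(out.shape)    # (n,)
```

The all-reduce step (`psum`) is where chip-to-chip communication happens; keeping that exchange fast across thousands of chips is what the bottleneck claim refers to.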
The Ironwood TPU is optimized for everything from training large-scale models built on tensor operations to complex reinforcement learning (RL) and large-scale, low-latency AI inference. Google named the chip a TPU to convey that it is more specialized for tensor operations than general-purpose graphics processing units (GPUs) or neural processing units (NPUs).
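"Tensor operations" here mostly means large matrix multiplies, which dominate both training and inference in deep learning. A minimal JAX sketch of the kind of operation a TPU's matrix units accelerate (the dimensions and the bfloat16 dtype are illustrative assumptions, not Ironwood specifics):

```python
# Illustrative tensor operation: a large matrix multiply, the workload
# TPU matrix units are built around. Dimensions are arbitrary; bfloat16
# is a numeric format commonly used on TPUs.
import jax
import jax.numpy as jnp

key_a, key_b = jax.random.split(jax.random.PRNGKey(0))
a = jax.random.normal(key_a, (4096, 4096), dtype=jnp.bfloat16)
b = jax.random.normal(key_b, (4096, 4096), dtype=jnp.bfloat16)

matmul = jax.jit(jnp.dot)             # XLA compiles this for the attached accelerator
c = matmul(a, b).block_until_ready()  # wait for the async computation to finish
print(c.shape)                        # (4096, 4096)
```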
Google is currently engaged in fierce competition with big tech companies such as Microsoft, Amazon, and Meta to build next-generation AI infrastructure. Until now, most large language models and AI workloads have relied on Nvidia's GPUs. Google stresses that its TPU is custom silicon with advantages in price, performance, and efficiency, positioning it as a replacement for Nvidia GPUs.
At the same time, Google says it is already receiving positive feedback on Ironwood from major customers. Notably, Anthropic, maker of the AI chatbot "Claude," has agreed to a deal for up to 1 million TPUs, and Lightricks and Essential AI are also using Ironwood.
Google Chief Executive Officer Sundar Pichai also said on a conference call following the company's recent earnings release that "demand for TPU- and GPU-based AI infrastructure is significant."