After Google recently unveiled its tensor processing unit (TPU), Amazon Web Services (AWS), the world's largest cloud provider, has also introduced its own custom artificial intelligence (AI) chip, "Trainium3." The aim is to reduce reliance on Nvidia's graphics processing units (GPUs) and increase the share of its own hardware. But unlike Google, which presented objective performance figures for its in-house TPU, AWS withheld all of the key information, and the consensus is that the chip still falls short of GPUs in absolute performance and ecosystem compatibility.
AWS announced on the 2nd (local time) at its annual cloud computing conference, re:Invent 2025, in Las Vegas that it will launch Trainium3, an in-house chip that boosts computing performance while reducing power consumption. AWS stressed that the product delivers more than four times the computing performance of its predecessor, Trainium2, while cutting energy consumption by about 40%.
Still, AWS did not disclose specific FLOPS (floating-point operations per second) figures, large language model (LLM) benchmark results, or comparisons with Nvidia's flagship GPU lineup (H100/H200/GB200), drawing criticism that the announcement was "half-baked." That contrasts with Google, which recently unveiled its TPU along with detailed performance, energy-efficiency, and speed figures from LLMs trained on the device.
AWS said the new product is four times faster than the previous-generation Trainium2 and can cut operating expenses by up to 50%, but because these are comparisons against its own product, it is unclear whether the chip can compete in today's AI chip market. Moreover, given that the previous generation also lagged significantly behind GPUs in performance, analysts expect the new product will likewise be hard-pressed to replace them.
The biggest hurdle is the synchronization and data-communication technology that binds chips into clusters, which is essential for large-scale AI training. AWS said at the event that Trainium3 can form clusters of up to 100,000 chips, but it did not present measures to resolve the bottlenecks that arise in clusters of that scale. AWS also said performance per watt improved over the previous generation, but it did not disclose comparisons with Nvidia GPUs or Google TPUs, leaving a question mark over absolute performance.
AWS also withheld objective figures on key points such as training performance, inference performance, and latency improvements. Unlike Google, which recently disclosed specific performance figures for its TPU, AWS did not reveal LLM inference performance or latency. The power-efficiency improvements AWS emphasized likewise came with no indication of where the chip stands relative to GPUs and TPUs, and there was no real-world data on thermal management (thermal throttling), which accounts for a large share of data center operating expenses.
There is also criticism that the chip falls short of Google's TPU in optimization for the company's own services. Replacing GPUs with AWS's in-house chips could degrade AI service quality in large-scale training models. Unlike Google's TPU, which is optimized for the in-house AI model Gemini and has achieved industry-leading training performance, AWS's Trainium series is known to be cheaper but significantly slower in training speed.
A cloud industry official said, "The compute capability of the Trainium3 that AWS unveiled this time is estimated to be lower than Nvidia's previous-generation H100, and while its training speed is relatively high thanks to a design specialized for the cloud, it falls well short of Nvidia's Blackwell," adding, "The product may replace GPUs in some data centers, but only where cost savings matter, and it will still be difficult to replace GPUs in areas that require high performance."