At the AWS re:Invent 2025 event on the 3rd, Amazon Web Services (AWS) announced the launch of the Amazon EC2 Trn3 UltraServer based on the Trainium3 chip.
The company said the server meets the surging compute demands of AI systems and helps customers run AI applications more efficiently and economically. In particular, the Trainium3 chip, built on a 3-nanometer (nm) process, can train larger AI models faster and serve more users at lower cost.
The Trn3 UltraServer scales to up to 144 Trainium3 chips and delivers up to 362 FP8 petaflops (PFLOPS) of compute with a 4x reduction in latency over the previous generation. AWS says this shortens model training from months to weeks while handling more inference requests, cutting both time to market and operating costs.
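For context, a back-of-the-envelope division of the stated figures (assuming the 362 PFLOPS applies at full 144-chip scale) suggests the approximate per-chip FP8 throughput:

\[
\frac{362\ \text{PFLOPS}}{144\ \text{chips}} \approx 2.5\ \text{FP8 PFLOPS per chip}
\]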
In tests using OpenAI's GPT-OSS model, the Trn3 UltraServer achieved 3x higher per-chip throughput and 4x faster response times than the Trn2 UltraServer, indicating that companies can scale AI applications with less infrastructure while improving the user experience.
AWS has also begun developing Trainium4 to support next-generation frontier training and inference. Trainium4 targets at least 6x FP4 compute performance, 3x FP8 compute performance, and 4x memory bandwidth over Trainium3, among other improvements. Notably, Trainium4 is designed to support NVIDIA's NVLink Fusion high-speed chip interconnect technology.
The company said, "Ongoing hardware and software optimizations will lay the groundwork for training AI models even faster and handling more inference requests, and are expected to deliver a flexible, high-performance platform optimized for demanding AI training and inference workloads."