Jensen Huang, CEO of NVIDIA, introduces the next-generation AI server Vera Rubin NVL144 at GTC 2025 in San Jose, California, in March. /Courtesy of NVIDIA

Nvidia, which leads the artificial intelligence (AI) semiconductor market, has fully unveiled Rubin, its next-generation AI chip slated for release next year. Analysts say competitors will now find it even harder to catch up, and not simply because the chip's raw performance has been boosted. What startled the industry is that, rather than merely improving how existing memory is used to patch over inefficiencies in how AI works, Nvidia introduced an entirely new system.

Until now, the industry's rule of thumb was that top-tier AI accelerator performance was determined by high-bandwidth memory (HBM). But Nvidia opted for a disruptive approach: for its top-tier AI platform, it did not rely solely on HBM and, for the first time, attempted a strategic combination of HBM and GDDR7. HBM is memory specialized for supplying large amounts of data at once by making its data lanes (the memory bus) extremely wide, while GDDR7 is high-speed memory that uses relatively narrow lanes but pushes the per-lane transfer rate to the extreme.
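To make the "wide lanes versus fast lanes" contrast concrete, the back-of-envelope sketch below simply multiplies bus width by per-pin data rate to get peak bandwidth. The specific widths and speeds used are illustrative ballpark figures, not official Rubin, HBM4, or GDDR7 specifications.

```python
# Back-of-envelope sketch of the "wide vs. fast" trade-off described above.
# All figures are illustrative ballpark values, not official product specs.

def peak_bandwidth_gb_s(bus_width_bits: int, per_pin_gbps: float) -> float:
    """Peak bandwidth = bus width (bits) x per-pin data rate (Gbit/s), converted to GB/s."""
    return bus_width_bits * per_pin_gbps / 8

# HBM: an extremely wide bus (e.g. ~1024 bits per stack) at a moderate per-pin rate.
hbm_stack = peak_bandwidth_gb_s(bus_width_bits=1024, per_pin_gbps=9.6)

# GDDR7: a narrow bus (e.g. 32 bits per device) driven at a much higher per-pin rate.
gddr7_chip = peak_bandwidth_gb_s(bus_width_bits=32, per_pin_gbps=32.0)

print(f"One HBM stack:    ~{hbm_stack:,.0f} GB/s")   # ~1,229 GB/s
print(f"One GDDR7 device: ~{gddr7_chip:,.0f} GB/s")  # ~128 GB/s; many devices are ganged together
```

The point of the sketch is only the shape of the trade-off: HBM buys its bandwidth with width, GDDR7 with speed, and the two come at very different costs.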

◇ Stop wasting HBM… the solution is division of labor

Nvidia judged that using only HBM for AI inference leads to significant waste. If you compare the AI response process to a restaurant kitchen, it splits into a stage of prepping ingredients in bulk (prefill) and a core stage of laying out the prepped ingredients and delicately finishing the final dish (decode). In technical terms, prefill reads the entire prompt in one pass and is limited mainly by raw computation, while decode produces the answer one token at a time and leans heavily on memory bandwidth. Until now, AI chips had the star chef (HBM) handle everything from bulk prep to final cooking, so efficiency relative to expense was bound to suffer.

Nvidia chose division of labor as the solution. Just as a head chef leaves ingredient prep to a dedicated kitchen team, the prefill stage, where fast computation matters most, is handled exclusively by Rubin CPX, a chip fitted with GDDR7 and specialized for speed. The final cooking, decode, which continually refers back to earlier results to produce the next piece of the answer, is handled by Rubin R200 equipped with HBM4. In this way, two specialized chips were designed from the outset to work as a team on a single AI inference task.
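For readers who want the division of labor in more concrete terms, the toy sketch below separates one inference request into a prefill routine (the whole prompt processed at once) and a decode loop (one token generated at a time). The class and function names are hypothetical stand-ins for the idea, not Nvidia's actual software stack.

```python
# Toy illustration of the prefill/decode split described above.
# ToyModel and the routing comments are hypothetical, not Nvidia's real APIs.

class ToyModel:
    EOS = -1

    def attend_all(self, prompt_tokens):
        # Prefill: compute-heavy, processes the whole prompt in one parallel pass
        # and builds the KV cache that decode will reuse.
        return list(prompt_tokens)

    def next_token(self, kv_cache):
        # Decode: bandwidth-heavy, each step rereads the growing KV cache
        # (and the model weights) to produce just one new token.
        return self.EOS if len(kv_cache) > 8 else len(kv_cache)

def prefill(prompt_tokens, model):
    return model.attend_all(prompt_tokens)      # would run on the GDDR7-based chip (CPX)

def decode(kv_cache, model, max_new_tokens=16):
    output = []
    for _ in range(max_new_tokens):             # would run on the HBM-based chip (R200)
        token = model.next_token(kv_cache)
        if token == model.EOS:
            break
        kv_cache.append(token)
        output.append(token)
    return output

model = ToyModel()
kv = prefill([101, 102, 103], model)
print(decode(kv, model))   # one request, two specialized stages
```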

The numbers show how dramatic the results of this strategic shift are. According to analysis by semiconductor market research firm SemiAnalysis, GDDR7 costs less than half as much per gigabyte (GB) as HBM. By using it, Nvidia cut Rubin CPX's total memory expense to one-fifth that of Rubin R200 while keeping compute performance at about 60% of the R200's. The effect goes beyond the chip level: it translates into data centers being able to deliver more AI services at lower cost.
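A rough reading of those figures, using only the ratios quoted above, shows why the trade-off pays off. The sketch below is a back-of-envelope illustration based on those ratios, not SemiAnalysis's actual cost model.

```python
# Back-of-envelope check using only the ratios quoted above; real pricing and
# performance depend on configuration and are not modeled here.

r200_memory_cost = 1.0      # Rubin R200 memory expense, normalized
r200_compute     = 1.0      # Rubin R200 compute performance, normalized

cpx_memory_cost = 1.0 / 5   # "one-fifth the memory expense"
cpx_compute     = 0.6       # "about 60% of the R200"

print(cpx_compute / cpx_memory_cost)   # ~3.0x compute per unit of memory spend
```

In other words, on the article's own figures, each unit of memory spending on Rubin CPX buys roughly three times as much compute as it does on the R200.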

Graphic by Son Min-gyun /Courtesy of Son Min-gyun

Industry watchers say the competitive landscape of the AI chip market itself has changed. Until now, rivals including AMD focused on developing products to take on Nvidia's single top accelerator. But Nvidia has shifted the frame from a one-on-one bout to a team match fought by a pair of chips. SemiAnalysis said the gap with competitors has widened into a canyon and that rivals will have to redraw their roadmaps from a blank slate.

◇ Nvidia's new demands… mixed fortunes for Samsung Electronics, SK hynix, and Micron

Nvidia's new strategy is shaking up the memory semiconductor market as well. The arrival of Rubin CPX has sparked data-center demand for GDDR7. Until now, GDDR memory was used mainly in PC graphics cards, but a higher value-added market has now opened up, and Samsung Electronics, SK hynix, and Micron are competing for it. Based on its track record of reliably filling Nvidia's large GDDR7 orders, Samsung Electronics is expected to hold a favorable position.

The arms race over next-generation HBM4, triggered by the main chip Rubin R200, is intensifying as well. According to industry sources, Nvidia recently raised HBM4's target data transfer rate to about 10–11 gigabits per second (Gbps), roughly 20% above the industry standard. That is fast enough to transfer dozens of 4K high-definition movies in a single second. The first to announce it had met the tougher bar was SK hynix: on the 12th, it officially said it had completed HBM4 development and built a mass-production system to supply customers. Not to be outdone, Samsung Electronics said it, too, has already secured an HBM4 mass-production system.
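A quick sanity check of the movie comparison is possible with a few assumptions the article does not state: that the 10–11 Gbps figure is a per-pin rate, that an HBM4 stack exposes a 2048-bit interface, and that a 4K movie weighs in around 50 GB. Under those assumptions, a single stack alone already lands in the "dozens of movies per second" range.

```python
# Sanity check of the "dozens of 4K movies per second" comparison.
# Assumptions (not stated in the article): the 10 Gbps figure is a per-pin rate,
# an HBM4 stack exposes a 2048-bit interface, and one 4K movie is roughly 50 GB.

per_pin_gbps   = 10        # lower end of the quoted 10-11 Gbps range
bus_width_bits = 2048      # assumed HBM4 interface width per stack
movie_gb       = 50        # assumed size of one 4K movie

stack_gb_per_s = bus_width_bits * per_pin_gbps / 8     # ~2,560 GB/s per stack
print(stack_gb_per_s / movie_gb)                        # ~51 movies per second
```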

A semiconductor industry source said that while SK hynix has gained the upper hand in the initial supply race, Micron, which stood out in the HBM3E market, is struggling to meet Nvidia's speed criteria in the HBM4 generation. Samsung Electronics, the source added, is expected to resolve a few remaining issues against Nvidia's requirements and then seek final quality approval. The industry expects Samsung Electronics, which has secured high yields on its latest process and is a generation ahead in logic-die technology (the bottommost chip that acts as HBM's brain), to aim for a comeback over the medium to long term. Micron, constrained by technical limits in its base-die design approach (the foundational die on which HBM layers are stacked), is seen delaying some HBM investments and focusing more on DDR5-based server DRAM instead.

Will Nvidia hold on to the top spot once again through strategic innovation? Whatever the outcome, the AI industry's landscape is already shifting rapidly.

※ This article has been translated by AI.