
NVIDIA enters the "Inference Era": the new Rubin CPX GPU promises "USD 100 million of investment, USD 5 billion of inference revenue, a roughly 50x return"!

Citi said that NVIDIA has launched the Rubin CPX GPU, designed specifically for long-context inference, promising customers roughly a 50x return on investment, far exceeding the roughly 10x return of the GB200 NVL72. The chip delivers up to 3x the attention-mechanism performance of the GB300 NVL72, and the NVIDIA GB300 NVL72 system has set new records in the MLPerf inference benchmarks. Together, these releases signal that the inference era has arrived.
Author: Dong Jing
Source: Hard AI
NVIDIA has launched the new Rubin CPX GPU, designed specifically for long-context inference, promising customers unprecedented returns on investment!
On September 9, Citigroup said in its latest research report that NVIDIA introduced the new Rubin CPX GPU, designed for long-context inference, at the AI Infrastructure Summit, promising customers unprecedented returns on investment: for every $100 million invested, $5 billion in inference revenue can be generated, an investment return of approximately 50x, far exceeding the approximately 10x return of the GB200 NVL72.
Ian Buck, NVIDIA's Vice President of Hyperscale and High-Performance Computing, reiterated at the AI Infrastructure Summit that the company is committed to accelerating the adoption of generative AI through GPU-driven data centers. The newly announced Rubin CPX is purpose-built for the highest performance on massive-context processing, delivering up to a 3x improvement in attention-mechanism performance over the GB300 NVL72 system.
In addition to the new GPU, NVIDIA announced that its GB300 NVL72 rack-scale system set a new record in the latest MLPerf inference benchmarks. According to Citigroup's research, by inserting the Rubin CPX into its product roadmap, NVIDIA is accelerating its annual release cadence amid intensifying ASIC competition, marking the arrival of the "inference era."
Revolutionary Rubin CPX: A Profit Engine Built for the Inference Era
The NVIDIA Rubin CPX represents a new category of GPU, optimized specifically for long-context inference. The chip handles million-token workloads such as software coding and generative video, with breakthrough gains in speed and efficiency.
Citigroup pointed out in its research report that the most notable aspect is the chip's economics:
The Rubin CPX delivers up to a 3x improvement in attention processing over the NVIDIA GB300 NVL72 system. More importantly, the chip works alongside the NVIDIA Vera CPU and Rubin GPU to form the new NVIDIA Vera Rubin NVL144 CPX platform, enabling enterprises to monetize their investments at unprecedented scale: for every $100 million invested, $5 billion in inference revenue can be generated, an investment return of approximately 50x, far exceeding the approximately 10x return of the GB200 NVL72.
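To make the return figures concrete, here is a minimal back-of-the-envelope sketch. The $100 million investment, $5 billion revenue, and 10x GB200 NVL72 comparison are the figures cited in the report; the ratios themselves are simple arithmetic, not an NVIDIA pricing model.

```python
# Back-of-the-envelope check of the ROI figures cited in the Citi note.
# The dollar amounts are taken from the report; everything below is
# illustrative arithmetic only.

capex_usd = 100e6            # cited platform investment ($100 million)
inference_revenue_usd = 5e9  # cited inference revenue ($5 billion)

roi_multiple = inference_revenue_usd / capex_usd
print(f"Implied return multiple: {roi_multiple:.0f}x")  # -> 50x

# Comparison point cited for the prior GB200 NVL72 generation
gb200_roi_multiple = 10
print(f"Improvement vs. GB200 NVL72: {roi_multiple / gb200_roi_multiple:.0f}x")  # -> 5x
```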
Citigroup stated that NVIDIA is accelerating its annual update cadence by inserting the Rubin CPX into its product roadmap, a move clearly aimed at responding to the increasingly fierce ASIC competition.
The NVIDIA GB300 NVL72 rack-scale system set a new record in the latest MLPerf inference benchmarks, delivering up to 1.4x the DeepSeek-R1 inference throughput of the GB200 NVL72 system. The platform also set performance records on all data center benchmarks newly added in the MLPerf Inference v5.1 suite, including DeepSeek-R1, Llama 3.1 405B Interactive, Llama 3.1 8B, and Whisper.
Citigroup stated that these system-level results build on the single-GPU records NVIDIA had already set in the MLPerf data center benchmarks.
Citigroup analysts said these releases signal the arrival of the inference era, echoed by Google's recent statement that the number of tokens it processes for inference has grown more than 50-fold year over year. NVIDIA is accelerating the adoption of generative AI through GPU-driven data centers, and this positioning allows the company to capture the explosive growth of the inference market.
Citigroup Research maintains a "Buy" rating on NVIDIA with a target price of $200, based on a 30x price-to-earnings multiple applied to expected fiscal 2026 earnings per share. Analysts noted that the 30x multiple is in line with the company's average over the past 3-5 years and implies an expected stock-price return of 17.1%.
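As a rough illustration of the valuation math, the sketch below back-calculates what the $200 target, the 30x multiple, and the 17.1% expected return imply; the derived EPS and price figures are not stated in the report and are shown only for orientation.

```python
# Rough valuation arithmetic implied by the figures in the Citi note.
# Target price, P/E multiple, and expected return are from the report;
# the implied EPS and implied current price are back-calculated here.

target_price = 200.0     # Citi price target (USD)
pe_multiple = 30.0       # multiple applied to expected FY2026 EPS
expected_return = 0.171  # expected stock-price return cited by Citi

implied_eps = target_price / pe_multiple
implied_price_at_note = target_price / (1 + expected_return)

print(f"Implied FY2026 EPS estimate: ${implied_eps:.2f}")           # ~ $6.67
print(f"Implied share price at time of note: ${implied_price_at_note:.2f}")  # ~ $170.79
```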
Analysts believe that by releasing the Rubin CPX and setting new MLPerf records, NVIDIA demonstrates continued innovation in AI infrastructure. As AI inference demand grows rapidly, and long-context inference demand in particular takes off, the new product portfolio should create significant revenue growth opportunities for the company.