Morgan Stanley models the "AI Inference Factory": profitable whether built on NVIDIA or Huawei chips, with average profit margins exceeding 50%

Wallstreetcn
2025.08.16 07:32

NVIDIA takes the crown, Google, Amazon, and Huawei are "guaranteed profits," and AMD unexpectedly posts losses. AI inference is not only a technological revolution but also a business that can be precisely calculated and offers substantial returns

AI inference is an incredibly profitable business.

Morgan Stanley's latest blockbuster report has, for the first time, calculated the economics of the global AI computing power race through a precise financial model. The conclusion: a standard "AI inference factory," regardless of which giant's chips it runs on, typically achieves a profit margin above 50%.

Among them, NVIDIA's GB200, with an impressive profit margin of nearly 78%, undoubtedly takes the crown, while Google's and Huawei's chips also "guarantee profits." However, AMD, on which the market has pinned high hopes, recorded significant losses in AI inference scenarios.

Profit Ranking: A Tale of Two Extremes

Morgan Stanley's model reveals sharply differentiated profitability among the AI hardware giants in realistic commercial scenarios, a clear tale of fire and ice.

The flames belong to NVIDIA, Google, Amazon, and Huawei.

The report shows that an "AI factory" built on NVIDIA's flagship GB200 NVL72 achieves a striking 77.6% profit margin, far ahead of the competition. This is due not only to its unmatched compute, memory, and networking performance, but also to continuous innovation in areas like FP4 precision and the strong moat of the CUDA software ecosystem, demonstrating absolute market dominance.

Google's self-developed TPU v6e pod follows closely with a profit margin of 74.9%, proving that top cloud providers can indeed build highly economical AI infrastructure through hardware-software collaborative optimization.

Similarly, AWS's Trn2 UltraServer achieved a profit margin of 62.5%, and Huawei's Ascend CloudMatrix 384 platform also recorded a profit margin of 47.9%.

The cold water, however, unexpectedly lands on AMD.

The most disruptive conclusion of the report is AMD's financial performance in inference scenarios. Morgan Stanley's calculations show that the "AI factory" using its MI300X and MI355X platforms has profit margins of -28.2% and -64.0%, respectively.

The core reason for the losses is a severe imbalance between high costs and output efficiency. The report indicates that the annual total cost of ownership (TCO) of an MI300X platform reaches $774 million, on par with NVIDIA's GB200 platform at $806 million. This means the upfront investment and ongoing expenses of the AMD solution are at the top of the market, yet the revenue generated by its token output in inference tasks (which the model assumes will account for 85% of the future AI market) falls far short of covering those costs.

“100MW AI Factory Model”: Modeling AI Factories, Quantifying Investment Returns

Supporting these conclusions is a standardized analytical framework pioneered by Morgan Stanley: the "100MW AI Factory Model." It quantitatively evaluates AI solutions across different technological paths along the same business dimensions, resting on three pillars:

1. Standardized “Compute Unit”: The model uses 100 megawatts (MW) of power consumption as the benchmark unit for the “AI factory.” This is a typical power consumption for a medium-sized data center, sufficient to drive approximately 750 high-density AI server racks.

2. Detailed “Cost Ledger”: The model comprehensively accounts for the total cost of ownership (TCO), primarily including:

  • Infrastructure Costs: Approximately $660 million in capital expenditure for every 100MW, used for building data centers and supporting power facilities, depreciated over 10 years.

  • Hardware Costs: Ranging from $367 million to $2.273 billion for server systems (including AI chips), depreciated over 4 years.

  • Operating Costs: Ongoing electricity costs calculated based on power usage efficiency (PUE) from different cooling solutions and global average electricity prices.

In summary, the annual average TCO for a 100MW “AI factory” ranges from $330 million to $807 million.

3. Market-Oriented "Revenue Formula": Revenue is tied directly to token output. The model calculates tokens-per-second (TPS) throughput from publicly available performance data for each hardware platform and sets a fair price of $0.20 per million tokens, referencing mainstream API pricing from OpenAI, Gemini, and others. It also applies a 70% equipment utilization rate to bring the revenue forecast closer to commercial reality.
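The three pillars above reduce to simple arithmetic: annual token revenue (throughput × utilization × price) set against annual TCO. A minimal sketch using the article's published parameters, noting that the factory-wide tokens-per-second figure and the example TCO below are hypothetical placeholders, not numbers from the report:

```python
# Illustrative sketch of the "100MW AI Factory" economics described above.
# Price and utilization come from the article; the throughput and TCO
# inputs in the example are assumed for illustration only.

SECONDS_PER_YEAR = 365 * 24 * 3600
PRICE_PER_MILLION_TOKENS = 0.20   # USD, the report's fair-price assumption
UTILIZATION = 0.70                # 70% equipment utilization

def annual_revenue(tokens_per_second: float) -> float:
    """Annual token revenue for the whole factory at the assumed price."""
    tokens_per_year = tokens_per_second * SECONDS_PER_YEAR * UTILIZATION
    return tokens_per_year / 1e6 * PRICE_PER_MILLION_TOKENS

def profit_margin(revenue: float, annual_tco: float) -> float:
    """Profit margin defined as (revenue - cost) / revenue."""
    return (revenue - annual_tco) / revenue

# Hypothetical example: 400 million tokens/s factory-wide, against an
# annual TCO of $800 million (within the report's $330M-$807M range).
rev = annual_revenue(400e6)
margin = profit_margin(rev, 800e6)
print(f"revenue = ${rev / 1e9:.2f}B, margin = {margin:.1%}")
```

Under these assumed inputs the sketch lands in the ballpark the report describes, a margin above 50%, which shows how sensitive the result is to the throughput-to-TCO ratio that separates the winners from AMD's losses.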

Future Battlefield: Ecosystem Competition and Product Roadmaps

Behind profitability lies a deeper strategic game. The report reveals that the future AI battlefield will focus on building technological ecosystems and laying out next-generation products.

In the non-NVIDIA camp, a war over "connection standards" has already begun. Manufacturers led by AMD are promoting UALink, arguing that its strict low-latency specifications are crucial for AI performance, while forces represented by Broadcom advocate a more open and flexible Ethernet solution. The outcome of this debate will determine who can build an open ecosystem capable of competing with NVIDIA's NVLink.

At the same time, NVIDIA is consolidating its lead with a clear roadmap. The report notes that its next-generation platform, "Rubin," is progressing as planned, with mass production expected to begin in the second quarter of 2026 and related servers ramping up in the third quarter of the same year. This sets a continuously moving, ever-higher target for competitors to chase.

In summary, this Morgan Stanley report injects a dose of "business rationality" into the fervent AI market. It eloquently demonstrates that AI inference is not only a technological revolution but also a business that can be precisely calculated and offers substantial returns.

For global decision-makers and investors, the profit figures above will provide considerable reference value for computing power investments in the AI era.