2025 OCP Summit: Supernode "Scale Up" Is the Focus of the Event

Wallstreetcn
2025.10.22 03:03

Morgan Stanley believes that the technology breakthroughs on display at the OCP Summit, from AMD's ultra-wide Helios cabinet, to the 800V DC power architecture aimed at disrupting the power supply landscape, to Google's 2-megawatt liquid cooling unit, all revolve around a common goal: building larger-scale, higher-power, and more efficient gigawatt-level AI data centers.

Author: Bao Yilong

Source: Hard AI

The 2025 OCP Global Summit conveys a clear message: "Scale Up" architecture has become the core theme of AI data center infrastructure construction.

On October 20, Morgan Stanley's Asia-Pacific team released a research report pointing out that, to meet AI's insatiable demand for computing power, the entire industry is racing toward larger-scale, higher-density "Scale Up" architectures.

The report indicates that investment focus needs to shift from general server components to core technology suppliers that can support supernode architecture. The conference clearly identified four major technological trends and key beneficiaries:

  • Larger Cabinets: The AMD Helios ultra-wide cabinet architecture was unveiled, promoting upgrades of components within the cabinet, with Wistron and Wiwynn being the main beneficiaries.
  • Higher Power: The 800V direct current (VDC) power supply solution has become the next-generation standard, which will disrupt the power distribution architecture of data centers, with Delta and BizLink in leading positions.
  • Stronger Cooling: The 2 megawatt (2MW) liquid cooling distribution unit (CDU) has become a focal point, with Google's Deschutes solution attracting significant attention.
  • Faster Networks: AI-optimized Ethernet (ESUN) and co-packaged optics (CPO) switch technologies are emerging, positioning network equipment vendors such as Accton for upgrade opportunities.

In summary, the entire industry is preparing for the gigawatt-level AI data center clusters arriving over the next few years, and companies that can provide higher-density, higher-efficiency solutions will occupy a core position in the next growth phase.

Double-Wide Racks Open a New Era of Scale Up

"Scale Up" is aimed at achieving higher density single-node computing power, which is revolutionizing the form of cabinets.

AMD, in collaboration with Meta, Wiwynn, and other manufacturers, has launched the Helios cabinet. Its key feature is the adoption of the ORW (Open Rack Wide) specification, which is twice the width of the traditional 21-inch ORV3 cabinet.

Today's high-performance chips deliver extremely high floating-point (FLOPS) density. To connect more computing cores with low latency, they must be placed within the same scale-up domain.

Given the current technical limits of copper interconnects, this can only be achieved through larger backplanes or midplanes, which leads to larger cabinets.

Meta believes that decoupling must eventually happen: although rack power density will continue to rise in the short term, it should ultimately fall as optical technology frees designs from the limitations of copper interconnects.

The Helios rack is expected to start shipping in the second half of 2026, with major customers including Meta, Oracle, and OpenAI.

**According to supply chain investigations, Wiwynn is Meta's main ODM partner, while Wistron is the main ODM partner for GPU modules, substrates, and switch trays, with most PCBs requiring M9-level CCL materials.**

At the same time, this ultra-wide heavy-duty cabinet places higher demands on mechanical components such as chassis and rails, benefiting suppliers like Chenbro and King Slide.

800V DC Power Architecture Leads the Next Generation of Efficient Gigawatt AI Factories

As cabinet power density skyrockets, traditional power supply architectures are becoming unsustainable. The 800V direct-current (800 VDC) power supply solution has become the focus, seen as a key technology driving the next generation of gigawatt-level AI factories.

Compared to the traditional 50V architecture, the 800V DC solution can transmit over 150% more power through copper cables of the same specification and can improve power usage effectiveness (PUE) by about 5%.
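
To make the voltage argument concrete, here is a minimal sketch of the underlying arithmetic (deliverable power P = V·I, cable loss I²R). The ampacity and resistance values are assumed round numbers for illustration, not figures from the report; the report's "over 150%" uplift presumably reflects its own baseline and constraints.

```python
# Minimal sketch: why a higher DC bus voltage moves more power through the
# same copper and wastes proportionally less of it.
# ASSUMED values for illustration only (not from the Morgan Stanley report):
AMPACITY_A = 400.0           # current rating of a given busbar/cable
LOOP_RESISTANCE_OHM = 0.002  # round-trip conductor resistance

for bus_voltage_v in (50.0, 800.0):
    deliverable_kw = bus_voltage_v * AMPACITY_A / 1000.0    # P = V * I
    loss_kw = AMPACITY_A**2 * LOOP_RESISTANCE_OHM / 1000.0  # P_loss = I^2 * R at full load
    loss_pct = 100.0 * loss_kw / deliverable_kw
    print(f"{bus_voltage_v:5.0f} V bus: {deliverable_kw:6.1f} kW on the same copper, "
          f"{loss_kw:.2f} kW cable loss ({loss_pct:.2f}% of delivered power)")
```

At the same current the absolute I²R loss is unchanged, but as a share of delivered power it shrinks roughly in proportion to voltage, which is one reason a higher-voltage bus can improve overall efficiency.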

In terms of specific progress, Delta Electronics has showcased mature solutions, including a 1.2MW solid-state transformer (SST, in mass production, with designs for over 3MW underway), 800V electronic fuses (eFuse), a 90kW DC-DC power supply, and a 12kW distribution panel.

The new solution is expected to roughly double power-supply content value per watt compared with current designs. Power interconnect suppliers like BizLink will also benefit from demand for higher-spec components such as liquid-cooled busbars.

The report indicates that the 800V DC solution is expected to debut in the second half of 2027 with NVIDIA's Rubin Ultra platform.

Large-Scale Liquid Cooling Systems Become the Focus

Heat dissipation is the lifeline that determines whether computing power can be delivered reliably. The technology path showcased at the conference is very clear: evolving from today's hybrid cooling toward full liquid cooling. Specifically:

Current Status of GB300: The GB300 computing tray, which has entered mass production, adopts a hybrid cooling solution (85% liquid cooling / 15% air cooling), with only 6 quick-disconnect (QD) connectors per computing tray. Yield rates are no longer a major concern in the market.

Prospects for VR200: The next-generation VR200 platform will be fully liquid-cooled, with the number of quick-disconnect connectors per computing tray increasing to 14. It has currently entered cabinet-level production and testing, with delivery expected by the end of the third quarter of 2026.

Large-Scale CDU: Google has open-sourced its 2 megawatt (MW) cooling distribution unit (CDU) design, supporting pressures of up to 80 PSI, enabling high-end cold plate designs. BOYD, Cooler Master, Delta Electronics, and Envic have all showcased related products.
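
For a sense of scale, the coolant flow a 2MW CDU must sustain follows from the basic heat balance Q = ṁ·c_p·ΔT. The sketch below treats the coolant as water with an assumed 10°C loop temperature rise; these are illustrative assumptions, not specifications of Google's Deschutes design.

```python
# Rough sizing sketch for a 2 MW coolant loop using Q = m_dot * c_p * dT.
# ASSUMPTIONS (not from the Deschutes spec): water-like coolant, 10 C rise.
HEAT_LOAD_W = 2_000_000     # 2 MW CDU capacity
CP_J_PER_KG_K = 4186.0      # specific heat of water
DENSITY_KG_PER_L = 1.0      # approximate density of water
DELTA_T_K = 10.0            # assumed supply-to-return temperature rise

mass_flow_kg_s = HEAT_LOAD_W / (CP_J_PER_KG_K * DELTA_T_K)
volume_flow_l_min = mass_flow_kg_s / DENSITY_KG_PER_L * 60.0

print(f"Mass flow:   {mass_flow_kg_s:.1f} kg/s")
print(f"Volume flow: {volume_flow_l_min:.0f} L/min "
      f"(~{volume_flow_l_min / 3.785:.0f} US gal/min) to move 2 MW at a {DELTA_T_K:.0f} K rise")
```

Halving the assumed temperature rise roughly doubles the required flow, and pushing that much coolant through cold plates is part of why high pressure ratings such as 80 PSI matter.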

The report cites Promersion's forecast, stating that while cold plate technology will remain the market mainstream until 2030, the inflection point for immersion liquid cooling is expected to occur in 2028.

Network Technology Continues to Optimize to Meet AI Demands

In addition to scaling up within nodes, high-speed "Scale Out" interconnects between nodes are also key to unlocking the performance of AI clusters.

The report notes that Ethernet solutions (ESUN) and CPO switches launched to boost network performance have been widely applied in optimizing AI data networks. However, reliability, maintainability, and cost remain the key factors governing their broader adoption. In terms of specific progress:

  • Accton and TianHong both showcased their latest 1.6T network switch products based on Broadcom's Tomahawk 6 ASIC, which are expected to see early deployments in late 2026 or early 2027. Accton also demonstrated a proof of concept for a CPO switch based on the Tomahawk 6 ASIC and IRIS optical wavelength switching.
  • Research results released by Meta show that its 51.2T CPO (co-packaged optics) switch has an annualized link failure rate (ALFR) of only 0.34%, significantly better than the 1.58% of pluggable optical modules, a clear reliability advantage; cost and maintainability, however, remain the keys to broader adoption (see the sketch after this list).
  • Meanwhile, active electrical cables (AECs) are emerging as a cost-effective option and are taking a growing share of scale-out networks. Meta's GB300 cabinet uses AECs, a trend expected to keep benefiting suppliers such as BizLink.
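
To put the ALFR figures above in operational terms, the sketch below simply multiplies them out for a hypothetical cluster. The 100,000-link cluster size is an assumed example; the 0.34% and 1.58% rates are the figures cited from Meta's results.

```python
# Expected annual link failures = number_of_links * annualized_link_failure_rate.
# The 100,000-link cluster size is an ASSUMED example; the ALFR values are the
# figures cited above from Meta's published results.
LINKS_IN_CLUSTER = 100_000

for name, alfr in (("CPO switch links", 0.0034), ("Pluggable optical modules", 0.0158)):
    expected_failures = LINKS_IN_CLUSTER * alfr
    print(f"{name:<26}: ~{expected_failures:,.0f} expected link failures per year")
```

The absolute counts scale linearly with cluster size, but the roughly four-to-five-fold gap between the two rates is what drives the difference in maintenance load.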

In summary, the 2025 OCP Global Summit sent an extremely clear signal: the AI infrastructure arms race has entered its "giant-scale" phase, with "Scale Up" as the core theme running through the event.