China is not only home to DeepSeek; Alibaba launched a major new model on New Year's Eve. Is it time to reassess China's AI assets as a whole?

Wallstreetcn
2025.01.29 04:05

When Alibaba Cloud demonstrates the combined advantages of "powerful models + sufficient computing power + a complete cloud platform," does it confirm an investment logic similar to that of North American cloud service providers last year?

On Lunar New Year's Eve, as Chinese communities around the world celebrated the arrival of the new year, Alibaba's stock swung sharply on the New York Stock Exchange before the close, climbing rapidly from a gain of about 1% to 6.7%.

Behind this market move was a technological surprise attack, fought without gunsmoke.

In the early hours of January 29, Alibaba's Tongyi Qianwen team quietly launched the large model Qwen2.5-Max, which demonstrated performance on par with the world's top models in several authoritative benchmark tests.

Following DeepSeek, the release of Qwen2.5-Max marks another important breakthrough for China's AI camp along the high-performance, low-cost technology route.

Market analysts suggest that the earlier, excessive focus on DeepSeek overlooked the broader catch-up of Chinese AI, including Alibaba's Tongyi. Industry outlet "Information Equality" stated that if Alibaba's Qwen2.5-Max indeed outperforms DeepSeek V3, expectations for its reinforcement-learning (RL) reasoning model could rise further.

Furthermore, when Alibaba Cloud showcases the combined advantages of "powerful models + sufficient computing power + a complete cloud platform," does it validate an investment logic similar to that of last year's North American cloud service providers? And if AI has added some $10 trillion to the overall value of U.S. stocks, is it time for a revaluation of Chinese AI assets?

Benchmarking the World's Top Models, and a Million-Token Milestone

Qwen2.5-Max adopts a large-scale MoE (Mixture of Experts) architecture and was pre-trained on over 20 trillion tokens of data.
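
For readers unfamiliar with the architecture, the sketch below illustrates the basic MoE idea: a small router network sends each token to only a few "expert" sub-networks, so total parameter count can grow without every token paying the full compute cost. This is a minimal, generic PyTorch illustration; the expert count, dimensions, and top-k routing here are illustrative and say nothing about Qwen2.5-Max's actual internals.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Minimal Mixture-of-Experts layer: a router picks top-k experts per token."""
    def __init__(self, d_model=512, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.router = nn.Linear(d_model, n_experts)
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.router(x)                         # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # route each token to k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                   # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, k:k+1] * expert(x[mask])
        return out

x = torch.randn(16, 512)
print(MoELayer()(x).shape)  # torch.Size([16, 512])
```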

In several authoritative evaluations, including MMLU-Pro for university-level knowledge, LiveCodeBench for programming ability, LiveBench for comprehensive capability, and Arena-Hard for approximating human preferences, the model demonstrated performance on par with or even surpassing DeepSeek V3, GPT-4o, and Claude-3.5-Sonnet.

The Alibaba team stated that with continuous advancements in post-training technology, the next version is expected to reach even higher levels.

The Qwen2.5 team also released two innovative models, Qwen2.5-7B-Instruct-1M and Qwen2.5-14B-Instruct-1M. These open-source models support context windows of up to 1 million tokens, the first publicly available models in the industry to reach that scale.
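
Since the 1M-context models are open-source, they can in principle be loaded like any other Hugging Face checkpoint. The sketch below assumes the repository id Qwen/Qwen2.5-7B-Instruct-1M and the standard transformers chat workflow; verify the id and hardware requirements before use, since a true million-token prompt needs far more GPU memory than this toy example.

```python
# Hedged sketch: loading the open 1M-context model with Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct-1M"  # assumed repo id from the release
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Summarize the document: ..."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```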

These models use sparse attention, focusing only on the most important parts of the context. This approach processes million-token inputs 3 to 7 times faster than conventional attention, with output lengths of up to 8,000 tokens. It does, however, require the model to identify the key passages in the context document, a task that current language models often struggle with. In testing, both the 14B model and Qwen2.5-Turbo achieved perfect accuracy at finding hidden numbers in very long documents; the smaller 7B model also performed well, with only minor errors.
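
As a rough illustration of the sparse-attention idea described above (a toy sketch, not Qwen's actual mechanism), the code below groups keys into fixed-size blocks and lets each query attend only to the few blocks whose average key looks most relevant, skipping the rest of the long context. Block size and block count here are arbitrary.

```python
import torch
import torch.nn.functional as F

def block_sparse_attention(q, k, v, block=64, top_blocks=4):
    """Toy block-sparse attention: each query attends only to its top-scoring key blocks."""
    n_blocks = k.shape[0] // block
    k_blocks = k[: n_blocks * block].view(n_blocks, block, -1)
    v_blocks = v[: n_blocks * block].view(n_blocks, block, -1)
    # Score each block by the query's similarity to the block's mean key.
    block_scores = q @ k_blocks.mean(dim=1).T            # (n_q, n_blocks)
    idx = block_scores.topk(top_blocks, dim=-1).indices  # (n_q, top_blocks)
    out = torch.empty_like(q)
    for i in range(q.shape[0]):                          # gather each query's blocks
        ks = k_blocks[idx[i]].reshape(-1, k.shape[-1])
        vs = v_blocks[idx[i]].reshape(-1, v.shape[-1])
        attn = F.softmax(q[i] @ ks.T / ks.shape[-1] ** 0.5, dim=-1)
        out[i] = attn @ vs
    return out

q = torch.randn(8, 128)         # 8 queries
k = v = torch.randn(4096, 128)  # a long context
print(block_sparse_attention(q, k, v).shape)  # torch.Size([8, 128])
```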

In more demanding long-context tests such as RULER, LV-Eval, and LongBench-Chat, the million-token models outperformed their 128K-token counterparts, especially on sequences beyond 64K tokens. The 14B model even scored above 90 on RULER, a first for the Qwen series, and consistently beat GPT-4o mini across multiple datasets.

Is it time to reassess Chinese AI assets as a whole?

If the emergence of DeepSeek V3 showed the cutting edge of Chinese AI, Alibaba's breakthrough reflects the deepening evolution of the industrial ecosystem.

On the day of the Qwen2.5-Max release, Alibaba Cloud's Bailian platform simultaneously opened full toolchain support, allowing developers to call the model directly in the cloud. This triad of "supercomputing cluster + open-source ecosystem + cloud-native platform" mirrors the business models of North America's three major cloud service providers: AWS, Azure, and GCP.
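
In practice, this means the hosted model is callable through an OpenAI-compatible endpoint. The sketch below follows Alibaba Cloud's documented compatible-mode pattern; the base_url, the qwen-max model name, and the API key placeholder should all be verified against the current Bailian (Model Studio) console.

```python
# Hedged sketch of calling the hosted Qwen-Max model via Alibaba Cloud's
# OpenAI-compatible endpoint; verify endpoint and model name before use.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DASHSCOPE_API_KEY",  # issued in the Bailian / Model Studio console
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)
resp = client.chat.completions.create(
    model="qwen-max",  # assumed name of the hosted Qwen2.5-Max endpoint
    messages=[{"role": "user", "content": "Give a one-sentence summary of MoE."}],
)
print(resp.choices[0].message.content)
```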

Moreover, according to the latest Morgan Stanley research report mentioned earlier, low-cost, high-performance models will also reshape the landscape of data centers and the software industry:

  • For Chinese data centers, in the short term, if large tech companies adopt similar technological routes, it may reduce the demand related to AI training. However, in the long run, low-cost models will drive growth in inference demand, benefiting data centers in first-tier cities;
  • For the Chinese software industry, the reduction in AI model costs will lower the threshold for applications to run AI functions, improving the industry environment from the supply side.

If Alibaba's Qwen2.5-Max indeed performs at the expected level, then coupled with its low-cost advantage and complete cloud ecosystem, it may trigger a new round of reassessment of Chinese AI assets, following DeepSeek.