
Morning news! Alibaba releases and open-sources Qwen3: seamlessly integrated thinking modes, multilingual support, and easier Agent calls

Alibaba stated that Qwen3 seamlessly integrates two thinking modes, supports 119 languages, and makes Agent invocation easier. The Qwen3 series comprises two Mixture-of-Experts (MoE) models and six dense models; the flagship Qwen3-235B-A22B delivers highly competitive results against top models such as DeepSeek-R1, o1, o3-mini, Grok-3, and Gemini-2.5-Pro on benchmarks covering code, mathematics, and general capabilities.
On Monday, Alibaba released and open-sourced the Tongyi Qianwen 3.0 (Qwen3) series of models, claiming they can compete with DeepSeek in areas such as mathematics and programming while significantly reducing deployment costs compared with other mainstream models. Alibaba stated that Qwen3 seamlessly integrates two thinking modes, supports 119 languages, and makes Agent invocation easier.
Performance Comparable to DeepSeek-R1 and OpenAI o1, Fully Open Source
The Qwen3 series includes two Mixture-of-Experts (MoE) models and six additional models. Alibaba stated that the latest flagship model Qwen3-235B-A22B shows highly competitive performance in benchmark tests for code, mathematics, and general capabilities compared to top models such as DeepSeek-R1, o1, o3-mini, Grok-3, and Gemini-2.5-Pro.
Additionally, the smaller MoE model Qwen3-30B-A3B activates only about 10% as many parameters as QwQ-32B yet performs even better, and even a small model such as Qwen3-4B can match the performance of Qwen2.5-72B-Instruct. MoE models mimic the way humans break a problem apart: each input is routed to a small group of specialized "experts", each responsible for a different part of the work, so only a fraction of the model's parameters is active at any given time and overall efficiency improves, as sketched below.
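To make the idea concrete, here is a minimal, illustrative routing sketch in PyTorch (not Qwen3's actual implementation; the layer sizes, expert count, and top-k value are arbitrary): a small gating network scores every expert for each token, but only the top-k experts run, so most parameters stay idle on any given forward pass.

```python
import torch
import torch.nn as nn

class ToyMoELayer(nn.Module):
    """Toy Mixture-of-Experts layer: route each token to its top-k experts."""

    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])
        self.gate = nn.Linear(d_model, n_experts)  # router: scores every expert per token
        self.top_k = top_k

    def forward(self, x):                               # x: (tokens, d_model)
        scores = self.gate(x).softmax(dim=-1)           # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # keep only the k best experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                   # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

print(ToyMoELayer()(torch.randn(4, 64)).shape)  # torch.Size([4, 64])
```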
At the same time, Alibaba open-sourced the weights of both MoE models: Qwen3-235B-A22B, with 235 billion total parameters and 22 billion activated parameters, and the smaller Qwen3-30B-A3B, with about 30 billion total parameters and 3 billion activated parameters. Six dense models were also open-sourced: Qwen3-32B, Qwen3-14B, Qwen3-8B, Qwen3-4B, Qwen3-1.7B, and Qwen3-0.6B, all released under the Apache 2.0 license.
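Because the weights are published under Apache 2.0, any of the checkpoints can be pulled with the standard Hugging Face transformers API. A minimal sketch for the smallest model follows; the repository id "Qwen/Qwen3-0.6B" matches the naming of the released models, but check the model card for the exact id and hardware requirements.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-0.6B"  # smallest open-weight checkpoint; verify the id on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

inputs = tokenizer("Explain mixture-of-experts models in one sentence.",
                   return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```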
"Hybrid" Model, Two Thinking Modes
Alibaba stated that the Qwen3 series is a "hybrid" model family that can either spend time "reasoning" through complex problems or answer simple requests quickly, referred to as "thinking mode" and "non-thinking mode." The reasoning performed in thinking mode lets the model effectively fact-check itself, much like OpenAI's o3 model, at the cost of higher latency.
The Qwen team wrote in a blog post:
This flexibility allows users to control the extent to which the model "thinks" based on specific tasks. For example, complex questions can be solved by extending reasoning steps, while simple questions can be answered directly and quickly without delay.
Crucially, the combination of these two modes greatly enhances the model's ability to achieve stable and efficient "thinking budget" control. As mentioned above, Qwen3 demonstrates scalable and smooth performance improvements, which are directly related to the allocated computational reasoning budget.
This design allows users to more easily configure specific budgets for different tasks, achieving a better balance between cost-effectiveness and reasoning quality.
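In practice, the mode switch is exposed through the chat template. The sketch below assumes the `enable_thinking` flag described in Qwen's transformers documentation; the exact parameter name and the repository id used here ("Qwen/Qwen3-4B") should be confirmed against the official model cards.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-4B"  # assumed repo id, for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Is 9.11 larger than 9.9? Explain briefly."}]

# Thinking mode: the model emits an internal reasoning trace before the final answer.
slow_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True)

# Non-thinking mode: skip the reasoning trace for a faster, direct reply.
fast_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False)

for prompt in (slow_prompt, fast_prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=512)
    print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```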
Training Data Volume is Twice that of Qwen2.5, Facilitating Agent Calls
Alibaba stated that the Qwen3 series supports 119 languages and is trained on nearly 36 trillion tokens, with the data volume being twice that of Qwen2.5. Tokens are the basic data units processed by the model, with approximately 1 million tokens equivalent to 750,000 English words. Alibaba claims that the training data for Qwen3 includes various content such as textbooks, Q&A pairs, and code snippets.
It is reported that the pre-training process of Qwen3 is divided into three stages. In the first stage (S1), the model was pre-trained on over 30 trillion tokens, with a context length of 4K tokens. This stage provided the model with basic language skills and general knowledge.
In the second stage (S2), the dataset was refined by increasing the proportion of knowledge-intensive data (such as STEM, programming, and reasoning tasks), and the model was pre-trained on an additional 5 trillion tokens. In the final stage, high-quality long-context data was used to extend the context length to 32K tokens, ensuring that the model can handle longer inputs effectively.
Alibaba stated that, thanks to improvements in model architecture, more training data, and more effective training methods, the Qwen3 dense base models match the overall performance of Qwen2.5 base models with far more parameters: Qwen3-1.7B/4B/8B/14B/32B-Base performs on par with Qwen2.5-3B/7B/14B/32B/72B-Base, and in areas such as STEM, coding, and reasoning the Qwen3 dense base models even outperform the larger Qwen2.5 models. The Qwen3 MoE base models, meanwhile, reach performance similar to the Qwen2.5 dense base models while activating only about 10% as many parameters, yielding significant savings in training and inference costs.
In the post-training phase, Alibaba first fine-tuned the model on diverse long chain-of-thought reasoning data covering tasks and domains such as mathematics, coding, logical reasoning, and STEM problems, equipping it with basic reasoning capabilities. It then applied large-scale reinforcement learning with rule-based rewards to further strengthen the model's exploration and in-depth reasoning capabilities.
Alibaba stated that Qwen3 excels at tool calling, instruction following, and reproducing specific data formats, and recommends using Qwen-Agent to take full advantage of Qwen3's agent capabilities. Qwen-Agent encapsulates tool-calling templates and tool-call parsers internally, significantly reducing coding complexity.
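A minimal Qwen-Agent sketch, adapted from the project's examples, might look like the following; the model name and the local OpenAI-compatible endpoint are placeholders for whatever deployment you actually run.

```python
from qwen_agent.agents import Assistant

# Placeholder deployment details: point these at your own Qwen3 serving endpoint.
llm_cfg = {
    "model": "Qwen3-30B-A3B",
    "model_server": "http://localhost:8000/v1",  # e.g. a local vLLM or SGLang server
    "api_key": "EMPTY",
}

# Qwen-Agent supplies the tool-calling template and parses the model's tool calls;
# 'code_interpreter' is one of its built-in tools.
bot = Assistant(llm=llm_cfg, function_list=["code_interpreter"])

messages = [{"role": "user", "content": "Plot y = x^2 for x from -5 to 5."}]
responses = []
for responses in bot.run(messages=messages):  # streams intermediate tool calls and replies
    pass
print(responses[-1])
```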
In addition to the downloadable weights, Qwen3 can also be accessed through cloud service providers such as Fireworks AI and Hyperbolic.
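Hosted endpoints like these are typically OpenAI-compatible, so access can look roughly like the sketch below. The base URL follows Fireworks AI's documented pattern, but the model slug is an assumption; take the exact name from the provider's model catalog.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",  # Fireworks' OpenAI-compatible endpoint
    api_key="YOUR_FIREWORKS_API_KEY",
)
resp = client.chat.completions.create(
    model="accounts/fireworks/models/qwen3-235b-a22b",  # assumed slug, for illustration only
    messages=[{"role": "user", "content": "Summarize the Qwen3 release in one sentence."}],
)
print(resp.choices[0].message.content)
```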
The Goal Remains Focused on AGI
Recently, OpenAI, Google, and Anthropic have also launched several new models. OpenAI recently announced plans to release a more "open" model in the coming months that mimics human reasoning, marking a shift in its strategy, as DeepSeek and Alibaba have already taken the lead in launching open-source AI systems.
Currently, Alibaba is building its AI landscape with Qwen at its core. In February of this year, CEO Eddie Wu stated that the company's "top priority" is to achieve Artificial General Intelligence (AGI)—that is, to create an AI system with human-level intelligence.
Alibaba stated that Qwen3 represents an important milestone in the company's journey toward Artificial General Intelligence (AGI) and Artificial Superintelligence (ASI). Looking ahead, Alibaba plans to enhance the model from multiple dimensions, including optimizing model architecture and training methods to achieve several key goals: expanding data scale, increasing model size, extending context length, broadening modality range, and utilizing environmental feedback to advance reinforcement learning for long-cycle reasoning.
Excitement in the Open Source Community
The release of Alibaba's Qwen3 has excited the AI community, with netizens sharing classic memes in celebration.
Some netizens said:
In my tests, the 235B model's performance on high-dimensional tensor operations is comparable to Sonnet. This is an outstanding model. Thank you all.
Some netizens praised Qwen3:
If I hadn't seen the tokens being generated in real time on the screen, I wouldn't have believed those benchmark results. It's simply like magic.
Supporters of open-source AI are even more excited. One netizen said:
"With an open-source 32B large model, the performance is on par with Gemini 2.5 Pro."
"We are back with a vengeance!"
Netizens also thanked Alibaba for actively promoting open source.