
Is DeepSeek R2 coming?
In the technology-centered competition between countries, AI has become one of the underlying anchors for valuing a country's assets. That makes the release of R2 very important.
Recently, MS released a report indicating that R2 should be coming soon. Here are the key points:
a. Double the parameters: R2 may have 1.2 trillion parameters, nearly double R1's (R1 has 671 billion; the upgraded DeepSeek-R1-0528 has 685 billion). Of these, 78 billion are active parameters, using an MoE (mixture-of-experts) architecture.
b. Unbeatable cost-effectiveness: The input cost per million tokens is $0.07, compared with R1's $0.15-16, and the output cost per million tokens is $0.27, compared with R1's $2.19.
c. Low hardware requirements: R2 is trained on Huawei's Ascend 910B, whereas R1 was trained on NVIDIA's H100.
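To put the figures in points a and b side by side, here is a minimal arithmetic sketch. It uses only the numbers quoted above, which are reported and not confirmed. The R1 input price is taken as $0.15, the lower end of the quoted range, which is an assumption, and the helper name job_cost is made up for illustration.

```python
# Figures as quoted in the summary above (reported, not confirmed).
R1_PARAMS_B = 671      # R1 total parameters, in billions
R2_PARAMS_B = 1200     # R2 rumored total parameters (1.2 trillion), in billions
R2_ACTIVE_B = 78       # R2 rumored active parameters, in billions

R1_INPUT_USD = 0.15    # assumption: lower end of the quoted R1 input price
R2_INPUT_USD = 0.07    # USD per million input tokens
R1_OUTPUT_USD = 2.19   # USD per million output tokens
R2_OUTPUT_USD = 0.27   # USD per million output tokens


def job_cost(input_tokens, output_tokens, in_price, out_price):
    """Return the USD cost of a job, given per-million-token prices."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# How much larger R2 would be than R1, and what share of R2 is active.
param_growth = R2_PARAMS_B / R1_PARAMS_B
active_share = R2_ACTIVE_B / R2_PARAMS_B

# Fractional price reduction relative to R1.
input_saving = 1 - R2_INPUT_USD / R1_INPUT_USD
output_saving = 1 - R2_OUTPUT_USD / R1_OUTPUT_USD

if __name__ == "__main__":
    print(f"Parameter growth: {param_growth:.2f}x")
    print(f"Active share of R2: {active_share:.1%}")
    print(f"Input price reduction: {input_saving:.0%}")
    print(f"Output price reduction: {output_saving:.0%}")
    # Hypothetical workload: 1M input tokens and 1M output tokens.
    print(f"R1 job cost: ${job_cost(1_000_000, 1_000_000, R1_INPUT_USD, R1_OUTPUT_USD):.2f}")
    print(f"R2 job cost: ${job_cost(1_000_000, 1_000_000, R2_INPUT_USD, R2_OUTPUT_USD):.2f}")
```

On these numbers, R2 would be about 1.79 times the size of R1, with 6.5% of its parameters active, and the output price would fall by roughly 88%.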
Model performance:
1. Multilingual: Previously, R1 mainly focused on English reasoning. R2 can handle multiple languages.
2. Multimodal: R2 can process not only text but also images, voice, and video data.
3. More extensive reinforcement learning: Trained on a larger dataset, the model's logic is stronger and its reasoning is more human-like.
4. Resource investment in the inference stage: Using a General Reward Model (GRM), R2 improves output quality by adding computing resources during the inference stage rather than the training stage.
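One common way to spend extra compute at inference time is best-of-N sampling: generate several candidate answers and keep the one a reward model scores highest. The sketch below illustrates that general idea only. Whether R2 works this way is not stated in the report, and toy_generate and toy_reward are hypothetical stand-ins for a real model and a real reward model.

```python
import random
from typing import Callable


def best_of_n(
    prompt: str,
    generate: Callable[[str, random.Random], str],
    reward: Callable[[str, str], float],
    n: int,
    seed: int = 0,
) -> str:
    """Sample n candidates and return the one with the highest reward score."""
    rng = random.Random(seed)
    candidates = [generate(prompt, rng) for _ in range(n)]
    return max(candidates, key=lambda c: reward(prompt, c))


def toy_generate(prompt: str, rng: random.Random) -> str:
    """Hypothetical model: guesses an integer answer at random."""
    return str(rng.randint(0, 100))


def toy_reward(prompt: str, answer: str) -> float:
    """Hypothetical reward model: prefers answers close to 42."""
    return -abs(int(answer) - 42)


if __name__ == "__main__":
    question = "What is 6 * 7?"
    for n in (1, 4, 16, 64):
        answer = best_of_n(question, toy_generate, toy_reward, n)
        print(n, answer, toy_reward(question, answer))
```

With a fixed seed, a larger n evaluates a superset of the candidates that a smaller n sees, so the best reward score can only stay the same or improve as more inference compute is spent.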