Is the new model from DeepSeek here?

Wallstreetcn
2026.02.11 11:18

DeepSeek is currently running a gray-release test of its next-generation model. Some users saw an update prompt after opening the app; the new version's context length expands from 128K to 1M, and the knowledge cutoff moves to May 2025. Q&A within the app suggests this may be the final gray-release version before the official launch of V4. A report from Nomura Securities argues that the core value of V4 lies in driving the commercialization of AI applications through innovations in the underlying architecture, rather than disrupting the existing AI value chain.

DeepSeek is advancing the gray-release testing of its new model, which may be the final gray-release version before the official debut of V4.

On February 11, some users received a version-update prompt after opening the DeepSeek App. After updating the app (version 1.7.4), users can experience DeepSeek's latest model. With this upgrade, the model's context length expands from 128K to 1M, roughly an eightfold increase; the knowledge cutoff moves to May 2025, with substantial improvements across several core capabilities.

The author's tests found that, in Q&A, DeepSeek stated the current version is likely not V4 but very likely the final evolution of the V3 series, or the last gray-release version before V4's official debut.

Nomura Securities released a report on February 10 stating that the DeepSeek V4 model, expected to launch in mid-February 2026, will not replicate the global AI computing power demand panic triggered by the release of V3 last year. The firm believes that the core value of V4 lies in driving the commercialization of AI applications through underlying architectural innovation, rather than disrupting the existing AI value chain.

According to early evaluations, the new version's complex-task handling is on par with leading models such as Gemini 3 Pro and K2.5. Nomura further pointed out that V4 is expected to introduce two innovations, mHC and Engram, to break through compute-chip and memory bottlenecks from both the algorithmic and engineering sides. Preliminary internal tests show that V4's performance on programming tasks has surpassed contemporaneous models such as Anthropic's Claude and OpenAI's GPT series.

The key significance of this release lies in further compressing training and inference costs, offering large-model and AI-application companies worldwide a feasible path to relieve capital-expenditure pressure.

Innovative Architecture Optimizing Hardware Bottlenecks

Nomura Securities' report pointed out that compute-chip performance and HBM memory bottlenecks have long been hard constraints the domestic large-model industry cannot avoid. The upcoming DeepSeek V4 introduces the mHC (Manifold-Constrained Hyper-Connections) and Engram architectures, which systematically address these shortcomings from both the training and inference dimensions.

mHC:

  • Short for "Manifold-Constrained Hyper-Connections." It targets the information-flow bottlenecks and training instability that arise when a Transformer is stacked extremely deep.

  • In simple terms, it makes the "dialogue" between neural-network layers richer and more flexible, while strict mathematical "guardrails" prevent information from being amplified or destroyed. Experiments have shown that models using mHC perform better on tasks such as mathematical reasoning (a minimal sketch follows this list).
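To make the idea concrete, here is a minimal PyTorch sketch of a hyper-connected residual block with a manifold constraint. Everything in it, the stream count, the Sinkhorn projection onto doubly stochastic mixing matrices, and the class and function names, is an illustrative assumption; the report does not disclose V4's actual mHC formulation.

```python
import torch
import torch.nn as nn


def sinkhorn(logits: torch.Tensor, n_iters: int = 5) -> torch.Tensor:
    """Approximately project a square matrix onto the set of doubly
    stochastic matrices by alternating row/column normalization.
    Rows and columns that each sum to 1 keep the stream mixing from
    amplifying or destroying the residual signal (the "guardrail")."""
    m = logits.exp()
    for _ in range(n_iters):
        m = m / m.sum(dim=-1, keepdim=True)  # normalize rows
        m = m / m.sum(dim=-2, keepdim=True)  # normalize columns
    return m


class HyperConnectedBlock(nn.Module):
    """One Transformer sub-layer wrapped with n parallel residual
    streams; a learnable, constrained matrix mixes the streams in
    place of the single fixed skip connection of a vanilla block."""

    def __init__(self, d_model: int, n_streams: int = 4):
        super().__init__()
        self.mix_logits = nn.Parameter(torch.zeros(n_streams, n_streams))
        self.read = nn.Parameter(torch.full((n_streams,), 1.0 / n_streams))
        self.f = nn.Sequential(  # stand-in for an attention/FFN sub-layer
            nn.LayerNorm(d_model), nn.Linear(d_model, d_model), nn.GELU()
        )

    def forward(self, streams: torch.Tensor) -> torch.Tensor:
        # streams: (n_streams, batch, seq, d_model)
        mix = sinkhorn(self.mix_logits)              # constrained mixing matrix
        mixed = torch.einsum("ij,jbsd->ibsd", mix, streams)
        h = self.f(torch.einsum("i,ibsd->bsd", self.read, mixed))
        return mixed + h.unsqueeze(0)                # write output to all streams
```

In this reading, a full model would replicate the input embedding into the parallel streams and average them back together at the output; the constraint on the mixing matrix is what would distinguish mHC-style mixing from unconstrained hyper-connections.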

Engram:

  • A "conditional memory" module. Its design concept is to decouple "memory" from "computation."

  • Static knowledge in the model (such as entities and fixed expressions) is stored in a sparse memory table that can sit in inexpensive DRAM and be retrieved quickly at inference time. This frees expensive GPU memory (HBM) to focus on dynamic computation (see the sketch after this list).
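The sketch below illustrates the DRAM-resident lookup idea in PyTorch. The table size, the top-k retrieval, the dense scoring, and all names are assumptions made for illustration; the report does not describe Engram's actual interface, and a production system would use an approximate-nearest-neighbor index rather than a full matrix product.

```python
import torch
import torch.nn as nn


class EngramStyleMemory(nn.Module):
    """Sketch of a "conditional memory" lookup: a large, rarely-updated
    table lives in cheap host DRAM (CPU); only the few rows a query
    actually needs are brought to GPU HBM at each step."""

    def __init__(self, n_entries: int, d_model: int, top_k: int = 4):
        super().__init__()
        self.top_k = top_k
        # Static knowledge table: a plain tensor, deliberately kept on the
        # CPU (it is not a Parameter, so .cuda() will not move it).
        self.table = torch.randn(n_entries, d_model)   # DRAM-resident
        self.key_proj = nn.Linear(d_model, d_model)    # small, lives on GPU

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, d_model) activations on the accelerator
        query = self.key_proj(hidden)                  # (batch, d_model)
        # Score against the CPU-side table; a real system would replace
        # this dense matmul with an ANN index over the sparse table.
        scores = query.cpu() @ self.table.t()          # (batch, n_entries)
        top_val, top_idx = scores.topk(self.top_k, dim=-1)
        retrieved = self.table[top_idx]                # (batch, k, d_model)
        weights = torch.softmax(top_val, dim=-1).unsqueeze(-1)
        memory = (weights * retrieved).sum(dim=1)      # (batch, d_model)
        return hidden + memory.to(hidden.device)       # fuse into the stream
```

The design point the bullet describes is visible here: only `top_k` rows ever cross from DRAM to HBM per query, so the GPU's scarce memory holds activations and dynamic weights rather than the full knowledge table.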

By improving training stability and convergence efficiency, mHC partly mitigates the generational gap in the interconnect bandwidth and compute density of domestic chips; the Engram architecture, for its part, reworks the memory-scheduling mechanism, using more efficient access strategies to break through HBM capacity and bandwidth limits while supply remains tight. Nomura believes these two innovations together constitute an adaptation plan for the domestic hardware ecosystem, with clear engineering value.

The report further points out that the most direct commercial impact of the V4 release is a substantial reduction in training and inference costs. Optimizations on the cost side will effectively stimulate downstream application demand, thereby giving rise to a new cycle of AI infrastructure construction. In this process, Chinese AI hardware manufacturers are expected to benefit from the dual pull of increased demand and upfront investment.

Market Landscape Shifts from "Dominance" to "Fragmentation"

Nomura's report reviews how the market landscape has changed in the year since the release of DeepSeek-V3/R1. In early 2025, DeepSeek's two models accounted for more than half of the token usage of open-source models on OpenRouter.

However, by the second half of 2025, as more players entered the market, that share had declined significantly: the market shifted from "dominance" to "fragmentation," and the competitive environment facing V4 is far more complex than a year ago. DeepSeek's combination of compute-efficiency gains and performance improvements accelerated the development of large language models and applications in China, reshaping the global competitive landscape and raising the profile of open-source models.

Software Companies Welcome Opportunities for Value Enhancement

Nomura believes that major global cloud service providers are still racing toward general artificial intelligence and the capital-expenditure contest is far from over, so V4 is unlikely to send shockwaves through the global AI infrastructure market the way last year's release did. Global large-model and application developers, however, carry an increasingly heavy capital-expenditure burden. If V4 can significantly reduce training and inference costs while maintaining high performance, it will help these companies convert technology into revenue faster and ease profit pressure.

On the application side, a more powerful and efficient V4 will give rise to more capable AI agents. The report observes that applications such as Alibaba's Tongyi Qianwen app can already execute multi-step tasks with greater autonomy, with AI agents evolving from "dialogue tools" into "AI assistants" capable of handling complex tasks.

These multi-tasking agents will need to interact more frequently with the underlying large models, consuming more tokens and thereby increasing computing-power demand. The improvement in model efficiency will therefore not "kill software" but rather create value for leading software companies. Nomura emphasizes focusing on software companies that can leverage next-generation model capabilities to build disruptive AI-native applications or agents; their growth ceiling may be raised again by the leap in model capabilities.