
Llama 4 is finally coming this month

Meta plans to release its highly anticipated AI model Llama 4 later this month, although the release could be delayed again. One key reason for the earlier delays is that the model's technical performance fell short of expectations, especially on reasoning and mathematical tasks. To make the model more competitive, Meta will adopt a "Mixture of Experts" (MoE) architecture and is planning new strategies to bring Llama to the enterprise market, possibly including a self-operated API.
Last spring, Meta's Llama 3 large language model received unanimous praise from developers and independent reviewers. However, nearly a year later, its successor Llama 4 has yet to be released.
According to insiders, after at least two delays, Meta plans to release the model later this month, but the date is not set in stone and could be postponed again.
Behind the Release Delay: Technical Performance Did Not Meet Expectations
According to two individuals familiar with the situation, one of the key reasons for the delay is that the model's performance in technical benchmark tests during development did not fully meet Meta's internal expectations, particularly in reasoning and mathematical tasks.
Meta is also concerned internally that its model cannot match OpenAI's top-tier performance in simulating human conversational speech.
Technical Shift and New Commercial Exploration: Embracing MoE and the Llama X Plan
To make Llama 4 more competitive, Meta plans significant adjustments to its technical roadmap.
Media reports, citing two insiders, indicate that at least one version of Llama4 is expected to adopt a "Mixture of Experts" (MoE) architecture, rather than the "Dense" model that the Llama series has consistently adhered to.
The MoE approach divides the model into multiple "expert" sub-networks focused on specific tasks, activating only the relevant parts when processing user requests, which is expected to improve performance while enhancing operational efficiency. DeepSeek and several other leading model developers have already adopted this technical route.
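The routing idea described above can be illustrated with a toy sketch. This is not Meta's or DeepSeek's implementation; it only shows the general top-k gating pattern commonly associated with MoE layers. All names here (`make_expert`, `moe_forward`, `router_weights`, the dimensions, and the choice of `top_k=2`) are hypothetical and chosen for illustration, and each "expert" is reduced to a single random linear layer.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(dim, rng):
    """Build a toy expert: one random dim x dim linear layer (assumption)."""
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def expert(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return expert


def moe_forward(x, router_weights, experts, top_k=2):
    """Route one input vector to its top_k experts and mix their outputs."""
    # The router produces one score per expert for this input.
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    # Only the top_k highest-scoring experts are run; the rest stay idle.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Normalize the selected scores into mixing weights that sum to 1.
    gates = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        expert_out = experts[idx](x)
        output = [o + gate * y for o, y in zip(output, expert_out)]
    return output, chosen


rng = random.Random(0)
DIM, NUM_EXPERTS = 4, 8
experts = [make_expert(DIM, rng) for _ in range(NUM_EXPERTS)]
router_weights = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

output, chosen = moe_forward([0.5, -1.0, 0.25, 2.0], router_weights, experts, top_k=2)
print("experts used:", chosen, "of", NUM_EXPERTS)
```

In this sketch only two of the eight experts do any work for a given input, which is the source of the efficiency gain the paragraph above refers to: total parameter count can grow while the compute per request stays roughly tied to the number of active experts.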
Notably, the question of whether to shift from Dense to MoE was the subject of a fierce technical debate inside Meta that lasted more than a year. The final decision to embrace MoE was undoubtedly influenced by the success of competitors such as DeepSeek.
On the commercial front, Meta is actively planning new strategies to more effectively bring Llama to the enterprise market.
Media reports, citing three insiders, indicate that the company is holding in-depth discussions about offering an API that Meta would operate itself. However, it is still unclear whether this API would run in Meta's own data centers or on servers rented from cloud service providers.
By offering self-operated APIs, Meta may be able to emulate OpenAI's model, providing customers with value-added services such as early access to models and customized technical support.
These discussions are part of an internal project codenamed "Llama X," which originated in Chief Strategy Officer David Wehner's team. Through Llama X, Meta also hopes to recruit engineers, marketers, and sales personnel to expand Llama's enterprise applications.
Organizational Restructuring and Product Dilemma
In order to accelerate its development pace, Meta adjusted the technical leadership of its generative AI team in February this year. The team appointed Loredana Crisan, then head of Messenger, to lead product management for AI products, and replaced engineering heads Ryan Cairns and Ning Li. Subsequently, Meta appointed former Vice President of Mixed Reality Technology Amir Frenkel as the engineering head of the team.
Ahmad Al-Dahle, head of the AI department, stated internally that these changes would enable the team to "act faster and more effectively" and help "work as a team." Notably, Meta's generative AI team has rapidly expanded from about 500 people to over 1,700 in the past year and a half, even as CEO Mark Zuckerberg has been cutting company costs and personnel in recent years.
However, progress on the product front has not been smooth. Meta has committed to transforming Meta View, its smart glasses app, into a standalone Meta AI application, hoping it will better showcase Meta AI's capabilities. In recent weeks, though, the application has performed poorly on analytical and complex tasks, struggling in particular with reviewing large volumes of documents and writing nuanced text.
Meta is also considering a change of approach: releasing Llama 4 first through Meta AI and only later as open-source software, in contrast to its past strategy of simultaneous releases. Such a change could boost Meta AI's usage figures but may also alienate researchers and developers who appreciate the company's open-source approach. It remains unclear whether Meta will go ahead with this plan.
High Investment, Uncertain Returns
For Meta, the investment in the AI field is substantial. The company is building data centers for developing and operating its models, with capital expenditure plans reaching up to $65 billion this year. Even more astonishing, Meta is discussing a potential $200 billion data center project.
So far, Meta's achievements in AI have been mixed. In consumer AI, the Meta AI assistant had over 700 million monthly active users as of January this year, but some of that usage is not considered active engagement, and the company has yet to launch the paid version it discussed as early as last spring. Meta also shut down a group of chatbots that mimicked celebrities and influencers last year. In addition, the company has failed to gain significant traction in selling Llama to customers of cloud service providers.
Despite these challenges, according to a person close to the company, Meta still believes that Llama 4 will become an industry-leading model.