Meta is focusing on AI Agents, and Llama 4 will enable direct voice conversations

Wallstreetcn
2025.03.07 06:44

Reports indicate that Meta plans to introduce improved voice features in its upcoming Llama 4. The model will focus on making conversations between users and its voice model more interactive, approaching natural dialogue rather than just one-way Q&A.

Meta is focusing on AI Agents, especially in edge AI and voice interaction. The company plans to introduce more powerful voice features in Llama 4, which will be launched in the coming weeks.

According to a report by the Financial Times on the 7th, Meta's Chief Product Officer Chris Cox revealed at the Morgan Stanley Technology, Media, and Telecom Conference that Llama 4 will be an "all-purpose model," with voice functionality being native.

This means that Llama 4 will be able to directly process voice information, without the need to first convert speech to text, then input the text into a large language model (LLM) for processing, and finally convert the output text back to speech.
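The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every function below is a hypothetical stand-in (audio is faked as UTF-8 bytes), and none of it reflects Meta's actual implementation or any real Llama API.

```python
# Illustrative sketch only. All functions are hypothetical placeholders,
# not real Meta or Llama APIs. "Audio" is faked as UTF-8 encoded bytes.


def speech_to_text(audio: bytes) -> str:
    """Placeholder speech recognition stage."""
    return audio.decode("utf-8")


def text_llm(prompt: str) -> str:
    """Placeholder text-only language model that echoes its prompt."""
    return f"echo: {prompt}"


def text_to_speech(text: str) -> bytes:
    """Placeholder speech synthesis stage."""
    return text.encode("utf-8")


def cascaded_pipeline(audio: bytes) -> bytes:
    """Traditional approach: speech -> text -> LLM -> text -> speech."""
    transcript = speech_to_text(audio)
    reply_text = text_llm(transcript)
    return text_to_speech(reply_text)


def native_voice_model(audio: bytes) -> bytes:
    """Hypothetical speech-native model: audio in, audio out, one step."""
    return b"echo: " + audio


if __name__ == "__main__":
    question = b"hello"
    print(cascaded_pipeline(question))
    print(native_voice_model(question))
```

In the cascaded version, each conversion is a separate stage between the user and the model; a native model, as described in the report, would remove those intermediate text conversions.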

Cox believes that this native voice functionality is significant for human-computer interaction interfaces, allowing users to converse directly with AI and ask any questions.

“I believe this is a big deal for interface products; you can talk to the internet and ask it anything. I think we are still thinking about how powerful this can be.”

Llama 4: Native Voice Functionality

Zuckerberg has emphasized multiple times that 2025 will be a "decisive year" for Meta's AI products. To commercialize AI technology, Meta is considering various options.

According to reports citing informed sources, Meta has been particularly focused on making conversations between users and its voice models closer to two-way natural dialogue, allowing users to interrupt, rather than a more rigid Q&A format.

Additionally, Meta is exploring launching a premium subscription service for its AI assistant Meta AI, offering features such as booking services and video creation. Meta is also considering introducing paid advertisements or sponsored content in the search results of its AI assistant.

Earlier this year, Zuckerberg also revealed a plan to build an AI engineering agent with the capabilities of a mid-level engineer, a product believed to have "very large market potential."

Clara Shih, Meta's head of business AI, said in an interview with CNBC on the 6th that 200 million small businesses worldwide already use Meta's services and platforms. She expects AI to change every job and every business, including the millions of small businesses that connect with customers through WhatsApp, Instagram, and Facebook.

Analysts believe that Meta's AI voice plan not only highlights Meta's ambitions in the AI field but also indicates that future AI interactions will place greater emphasis on natural dialogue rather than the traditional text-dominated model.

Competitive Landscape: The Voice Battle Between OpenAI, xAI, and Meta

As competition in the AI industry intensifies, Meta is working to answer challenges from its rivals. OpenAI's voice model, released last year, focuses on giving its AI different personality traits, while Grok 3, launched by Musk's xAI, recently introduced voice functionality as well.

This competition has prompted Meta to hold in-depth discussions about the safety and usage restrictions of its new models. On one hand, Meta needs to ensure that the outputs of its AI models adhere to ethical standards and avoid generating harmful or inappropriate content. On the other hand, Meta also hopes to make the models less "noble," that is, less inclined to refuse, so that they answer users' questions more freely. Previously, Meta faced criticism for the third version of its Llama model, which was deemed too "noble" and refused to answer some innocent questions.

Beyond improving voice capabilities, Meta is also investing significantly in AR/VR and smart glasses. The recently launched Ray-Ban smart glasses interact with AI assistants through voice commands, and the company is accelerating the development of lightweight head-mounted devices that aim to replace smartphones and become users' mainstream computing device.