LeCun heavily criticized: "You messed up Meta!" Burned through hundreds of billions in computing power, and admits complete failure after 20 years of effort

Wallstreetcn
2025.04.19 06:46

Meta's Chief AI Scientist Yann LeCun has faced widespread criticism over Meta's performance in the field of LLMs (large language models). Despite commanding hundreds of billions of dollars' worth of resources, Meta is lagging behind companies like OpenAI and Google in the AI race. LeCun has admitted that his attempts at autoregressive prediction failed and has expressed skepticism about the direction in which LLMs are developing, which many consider one of the reasons for Meta's setbacks. Industry insiders point out that LeCun's dogmatism and rejection of new approaches may put Meta at a disadvantage in the competition.

With the powerful capabilities of the GPT/o series, Gemini 2.5, and Claude, OpenAI, Google, and Anthropic are all making waves in the AI battle.

Only Meta has not been at the forefront.

The underwhelming performance of Llama 4 since its release has, to some extent, turned Meta into an industry "laughingstock." Some former researchers have even noted explicitly on their resumes that they did not participate in training Llama 4.

Against this backdrop, criticism of Yann LeCun in the industry has been growing stronger.

Although this Turing Award laureate is a top scientist with the ability to mobilize hundreds of billions in capital for computing resources and internal research, Meta has still ended up failing.

So, where exactly does his problem lie?

Some say that Meta's lag in the LLM battle is precisely because LeCun has been actively expressing his aversion and rejection of the direction in which LLMs are advancing.

Looking back decades from now, LeCun's attitude today may prove to have been correct, but compared with hardline, aggressive figures like Sam Altman, this mindset will undoubtedly put Meta at a disadvantage in the current competition.

If a large company's chief AI scientist neither believes in the architecture his company is working on nor has sufficiently impressive results for the architecture he does believe in, then the current situation is the natural consequence.

Many netizens have described LeCun's skepticism toward the LLM route as "dogmatism."

"Many top scientists have this flaw: because they are too self-centered, they believe they know everything best, making it difficult for them to transform. Sometimes, the dogmatism of these top figures directly affects scientific progress."

In response, some have summarized several core reasons for Meta's failure.

  • LeCun's vocal opposition to LLMs

  • Meta is still a novice in MoE architecture

  • Early open-source release leading to failure

  • Lack of coordination between research and product teams, poor organization and management

Of course, the emergence of the first generation of Llama still has groundbreaking significance for the open-source community, but in the rapidly exploding AI circle, this seems like something that happened five hundred years ago.

Moving forward, unless LeCun can truly succeed in pioneering a new paradigm and achieve a world model like JEPA, Meta will only continue to lag behind in the AI competition.

Now, let's look at LeCun's recent statements declaring that LLMs have been "sentenced to death."

First, LLMs are already a thing of the past

At the NVIDIA 2025 GTC conference, LeCun expressed this viewpoint: "I am no longer interested in LLM!"

He believes that LLMs are already a thing of the past, and that the future lies in four more interesting areas: machines that understand the physical world, persistent memory, reasoning, and planning.

Interestingly, Llama's download count has now reached one billion, which indicates strong demand for LLMs and somewhat contradicts the view that "LLMs are outdated."

Back to the point, during the speech, LeCun stated that throughout the history of AI, almost every generation of researchers has claimed, when discovering new paradigms, "This is it! In five or ten years, we will be able to create machines that are smarter than humans in all fields."

For seventy years, this wave has appeared roughly every decade, and this one will also be proven wrong.

Therefore, the argument that "as long as we keep scaling LLMs up indefinitely, or let them generate thousands of token sequences and then pick the good ones, we can reach human-level intelligence; within two years a nation of geniuses will emerge in a data center" is, in his view, complete nonsense.

In particular, the tokens used by LLMs are not a good way to represent the physical world. The reason is simple: tokens are discrete.

In a typical LLM, the vocabulary contains only on the order of a hundred thousand tokens. So when you ask the model to predict the next token in a text sequence, it can only output a probability distribution; it can never identify the one correct token with 100% certainty.
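To make the "probability distribution over a discrete vocabulary" point concrete, here is a minimal PyTorch sketch (toy numbers, not any particular model's code): the output head assigns a probability to every token in the vocabulary, and even the top-ranked token only ever gets a probability, never a certainty.

```python
import torch
import torch.nn.functional as F

# Toy illustration of next-token prediction over a discrete vocabulary.
# The vocabulary size and logits are made up; a real LLM's output head works
# the same way: one logit per token, turned into probabilities by a softmax.
vocab_size = 100_000                     # roughly the "hundred-thousand level" LeCun mentions
logits = torch.randn(vocab_size)         # stand-in for the model's output at one position

probs = F.softmax(logits, dim=-1)        # a distribution over all possible next tokens
top_p, top_id = probs.max(dim=-1)

# Even the most likely token only gets a probability, never certainty.
print(f"most likely token id={top_id.item()}, p={top_p.item():.4f}")
next_token = torch.multinomial(probs, num_samples=1)  # sampling keeps the uncertainty
```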

This is not a big problem for text; however, for high-dimensional, continuous natural data like video, attempts to make systems understand the world and build world models through pixel-level prediction of video have almost all failed.

Similarly, even for the narrower goal of training neural networks to learn high-quality image representations, techniques that rely on reconstructing the original image from corrupted or transformed versions have also largely ended in failure.

Secondly, autoregressive LLMs are doomed

At the end of March this year, at the 2025 Joint Mathematics Meetings of the American Mathematical Society, LeCun delivered a speech titled "Mathematical Barriers to Human-Level Artificial Intelligence."

In LeCun's view, the current level of machine learning is still poor. Its learning efficiency is extremely low—models often need to go through thousands of samples or experiments to achieve target performance.

Although self-supervised learning has indeed changed AI, it is still very limited. Animals and humans, on the other hand, can quickly master new tasks, understand how the world works, reason, plan, and possess common sense—their behavior is goal-driven.

In contrast, autoregressive LLMs are trained by predicting the next word or symbol in a sequence; this sequence can be discrete symbols such as text, DNA, musical scores, or proteins.

However, autoregressive prediction has a fundamental problem.

Essentially, it is divergent: imagine the generated symbols are discrete, and for each output symbol, there can be up to 100,000 possibilities.

If we view all possible token sequences as a giant tree with a branching factor of 100,000, only a small subtree corresponds to the continuations that count as acceptable answers.

The problem is that this "correct subtree" is merely a tiny subset of the entire tree.

Assuming that each generated symbol has an independent error probability e, the probability of a sequence of length n being completely correct is (1 - e)^n.

Even if e is extremely small, this probability will still decay exponentially with n, and under the autoregressive framework, there is no remedy for this.
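A quick numerical illustration of this decay, under LeCun's simplifying assumption that per-token errors are independent (an assumption his critics dispute):

```python
# Worked example of the compounding-error argument (illustrative numbers only).
# If each token is wrong independently with probability e, a length-n sequence
# is entirely correct with probability (1 - e) ** n, which decays exponentially.
for e in (0.001, 0.01):
    for n in (100, 1_000, 10_000):
        p_correct = (1 - e) ** n
        print(f"e={e:<6} n={n:<6} P(all tokens correct) = {p_correct:.3e}")

# Even e = 1% leaves only about a 4e-5 chance that a 1,000-token answer is
# entirely correct under this (admittedly simplified) independence assumption.
```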

Therefore, LeCun's judgment is that autoregressive large language models are destined to be eliminated! In a few years, no rational person will use them anymore.

This is the source of the so-called hallucinations of LLMs: they will talk nonsense, which is essentially caused by autoregressive prediction.

In LeCun's view, we have overlooked something very important in constructing new concepts for AI systems.

Simply feeding LLMs ever-larger datasets will never get us to human-level AI. Right now, let alone replicating mathematicians or scientists, we cannot even imitate a cat.

Domestic cats can plan complex actions, possess causal models, and foresee the consequences of their own behavior, and humans are even more capable: a 10-year-old child can clear the dining table and load the dishwasher on the first try. That is zero-shot learning.

Now, AI can pass the bar exam, solve math problems, and prove theorems. So where are the L5 autonomous vehicles? Where are the household robots?

We still cannot create systems that can truly cope with the real world. It turns out that the physical world is far more complex than language.

This is Moravec's paradox.

Tasks that humans find troublesome—such as calculating integrals, solving differential equations, playing chess, and planning routes across multiple cities—are easy for computers.

This indicates that if someone refers to "human intelligence" as "general intelligence," it is purely nonsense—we do not possess general intelligence but rather highly specialized skills.

A typical modern LLM is trained on approximately 2×10¹³ (about 20 trillion) tokens. If we calculate 3 bytes per token, the total amount is 6×10¹³ bytes, rounded up to about 10¹⁴ bytes. It would take any individual hundreds of thousands of years to read all this text.

However, a 4-year-old child, despite being awake for only 16,000 hours, accumulates an equivalent amount of information from the physical world through sensory inputs such as vision, touch, and hearing, also reaching about 10¹⁴ bytes during that time.
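A back-of-the-envelope check of these two figures; the implied sensory bandwidth below is simply what the stated numbers work out to, not a physiological measurement:

```python
# Rough sanity check of the two estimates quoted in the text.

# Text side: ~2e13 tokens at ~3 bytes per token.
llm_corpus_bytes = 2e13 * 3                 # = 6e13, i.e. on the order of 1e14 bytes

# Sensory side: 16,000 waking hours for a 4-year-old reaching ~1e14 bytes.
waking_seconds = 16_000 * 3600              # about 5.8e7 seconds
implied_bandwidth = 1e14 / waking_seconds   # bytes per second needed to hit 1e14 bytes

print(f"LLM training corpus      ~ {llm_corpus_bytes:.1e} bytes")
print(f"Implied sensory bandwidth ~ {implied_bandwidth:.1e} bytes/s (roughly a couple of MB/s)")
```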

In other words, if AI cannot learn how the world works by observing it, we will never reach human-level intelligence, because there simply isn't that much information in text.

At Meta, they do not use the term AGI but rather Advanced Machine Intelligence (AMI):

• Able to learn world models and mental models through sensory input, thus mastering intuitive physics and common sense;

• Possessing persistent memory;

• Capable of planning complex sequences of actions;

• Equipped with reasoning abilities;

• Designed from the outset to ensure controllability and safety, rather than relying on fine-tuning afterward.

LeCun expects that within three to five years, Meta will be able to run this on a small scale; after that, it will depend on how to scale it up until it truly achieves human-level intelligence.

The cognitive architecture of AMI can be summarized as follows; a toy sketch of how these pieces might fit together appears after the list.

• World model;

• Several objective functions;

• An actor module, responsible for optimizing actions to minimize cost;

• Short-term memory, corresponding to the hippocampus in the brain;

• A perception module (almost the entire back part of the brain is devoted to this);

• And a configurator.
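To make the list above more concrete, here is a toy sketch of how such modules might be wired into a perceive, plan, act loop. Every class name, dimension, and interface is invented for this example; it does not describe any published Meta system.

```python
import torch
import torch.nn as nn

class WorldModel(nn.Module):
    """Predicts the next latent world state from the current state and an action (illustrative)."""
    def __init__(self, state_dim=64, action_dim=8):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim + action_dim, 128), nn.ReLU(),
                                 nn.Linear(128, state_dim))

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def plan(world_model, cost_fn, state, horizon=5, steps=50, action_dim=8):
    """Actor: optimize a short action sequence to minimize predicted cost (planning by gradient descent)."""
    state = state.detach()                       # plan against a fixed perceived state
    actions = torch.zeros(horizon, action_dim, requires_grad=True)
    opt = torch.optim.Adam([actions], lr=0.1)
    for _ in range(steps):
        s, total_cost = state, 0.0
        for a in actions:                        # roll the world model forward over the horizon
            s = world_model(s, a)
            total_cost = total_cost + cost_fn(s)
        opt.zero_grad()
        total_cost.backward()
        opt.step()
    return actions.detach()

# Perception stand-in: encode an observation into a latent state; toy cost: drive the state toward zero.
perceive = nn.Linear(32, 64)
cost_fn = lambda s: (s ** 2).mean()

state = perceive(torch.randn(32))
best_actions = plan(WorldModel(), cost_fn, state)
print(best_actions.shape)                        # (horizon, action_dim)
```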

So, how can the system learn the mental model of the world from sensory inputs like videos?

Can we borrow the idea of autoregressive prediction, training generative architectures like LLMs, to predict what will happen next in the video, such as the next few frames?

The answer is no.

LeCun stated that he has been struggling with this for 20 years and has completely failed.

Autoregressive prediction is suitable for discrete symbols, but we do not know how to meaningfully represent a probability density over video frames in a high-dimensional continuous space.

His solution is a technique called JEPA (Joint Embedding Predictive Architecture).
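The core idea of JEPA, predicting in representation space rather than pixel space, can be sketched in a few lines of PyTorch. This is an illustrative toy, not Meta's I-JEPA or V-JEPA code; in practice the target encoder is usually a slowly updated (EMA) copy of the context encoder, and extra machinery is needed to keep the representations from collapsing.

```python
import torch
import torch.nn as nn

class TinyJEPA(nn.Module):
    """Minimal joint-embedding predictive architecture sketch (illustrative only).

    Instead of reconstructing the future observation in pixel space, both inputs
    are encoded into a latent space, and the predictor only has to match the
    target's representation, so unpredictable pixel-level detail can be ignored.
    """
    def __init__(self, in_dim=1024, latent_dim=128):
        super().__init__()
        self.context_encoder = nn.Sequential(nn.Linear(in_dim, latent_dim), nn.ReLU(),
                                             nn.Linear(latent_dim, latent_dim))
        self.target_encoder = nn.Sequential(nn.Linear(in_dim, latent_dim), nn.ReLU(),
                                            nn.Linear(latent_dim, latent_dim))
        self.predictor = nn.Linear(latent_dim, latent_dim)

    def forward(self, x_context, x_future):
        z_context = self.context_encoder(x_context)
        with torch.no_grad():                        # target branch is typically not back-propagated
            z_target = self.target_encoder(x_future)
        z_pred = self.predictor(z_context)
        return ((z_pred - z_target) ** 2).mean()     # loss lives in latent space, not pixel space

# Usage: two flattened "frames" standing in for past and future observations.
x_past, x_next = torch.randn(8, 1024), torch.randn(8, 1024)
loss = TinyJEPA()(x_past, x_next)
loss.backward()
```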

LeCun stated that if his judgment is correct, and using JEPA is indeed more reliable than generative architectures, then everyone should completely abandon generative architectures.

He also admitted that in today's environment, where everyone is talking about GenAI, telling people to "give up GenAI" makes him something of an outlier.

In summary, LeCun closed his speech with a forceful appeal: if you are interested in human-level AI, do not focus on large language models.

If you are a PhD student in the field of AI, you absolutely should not engage in LLM work, as you are placing yourself in competition with large teams that have thousands of GPUs, and you will not be able to make any contributions.

He stated that if we can solve some real problems in the next five or ten years, we can embark on a path toward truly intelligent systems that can plan and reason.

And the only viable method is open source.

LeCun stated that if he succeeds, AI will become a tool that amplifies human intelligence, which will only benefit humanity.

A Long-Buried Memory

LeCun shared a long-buried memory that gives us some insight into his inner world.

In 2022, LeCun and several colleagues at Meta trained an LLM, stuffing in all the scientific literature they could find.

This model was named Galactica. They wrote a lengthy paper detailing the training process, open-sourced the code, and launched an online demo that everyone could try.

As a result, the project was mercilessly criticized on Twitter.

Many shouted, "This thing is terrifying; it will destroy the entire scientific communication system," because even fools could write a seemingly decent scientific paper titled "Eating Broken Glass is Beneficial to Health."

Negative comments came in like a tsunami, and the poor colleagues lost sleep at night, ultimately being forced to take down the demo, leaving only the paper and open-source code.

At that time, their conclusion was that the world was not ready to accept this technology, and no one was truly interested.

Three weeks later, they were hit with a shock: ChatGPT was launched, and the public's reaction was clearly that of a "savior's return."

LeCun and his colleagues looked at each other, puzzled by the sudden enthusiasm from the public.

Is Meta Really Failing? Not Necessarily

Despite the constant skepticism, LeCun also has some steadfast supporters.

As someone expressed emotionally after listening to his speech—

"I truly admire LeCun, a realist, an advocate for open source, and certainly not someone who follows trends for hype. Although he has faced a lot of hatred for opposing the LLM dogma, I still respect his honesty."

"I'm glad to hear someone discussing the limitations of LLMs in today's era, especially since he is still working for a shareholder company. Only when we ignore the hype and focus on limitations, the possibility of failure, and other engineering principles will AI be safe."

Even in the face of the currently underperforming Llama 4, these supporters firmly believe that we will see impressive progress in a few months.

In a post titled "Even if LLMs reach a plateau, it doesn't necessarily mean an AI winter," someone firmly supported LeCun's approach.

According to the poster, while today's large labs are focused on LLMs, there are still some smaller labs exploring alternative paths.

He stated that he always thought Meta's LeCun team was the only one researching self-supervised, non-generative, visual systems.

But just a few weeks ago, a group of researchers released a new architecture built on many ideas that LeCun has long advocated.

Paper link: https://arxiv.org/abs/2503.21796

In some cases, it even surpasses LeCun's own models.

Moreover, in recent years, there have been more and more systems similar to JEPA emerging, which LeCun also mentioned in the video.

Some of these come from smaller teams, while others come from Google.

If one day, the path of LLM really becomes unfeasible and stagnates, we might see a decline in funding, as much of the current investment is actually based on public and investor enthusiasm.

However, this does not mean an AI winter. The reason for past winters was that people had never truly been "shocked" by AI.

But since the birth of ChatGPT, people have seen such "intelligent" AI, which has attracted unprecedented attention to the AI field, and this enthusiasm shows no signs of waning.

Rather than saying we are entering an AI winter, it is more accurate to say we are witnessing a shift from a dominant paradigm to a more diversified pattern.

This is a good thing for all of humanity. When it comes to fields as difficult to replicate as intelligence, the smartest approach is not to put all your eggs in one basket.

References:
https://x.com/kimmonismus/status/1913265562853011863
https://www.youtube.com/watch?v=eyrDM3A_YFc&t=6s
https://www.youtube.com/watch?v=ETZfkkv6V7Y

Source: New Intelligence (ID: gh_108f2a2a27f4). Original title: "LeCun Criticized: You Ruined Meta! Burned Through Hundreds of Billions in Computing Power, Admits 20 Years of Complete Failure"
