Is Token Burning Losing Steam? This May Be the Most Important Chart in the Entire Market

The Silicon Data LLM Token Spending Index surged before May but has recently declined. The index measures the price paid per million LLM tokens across the entire market. Macro strategist Andreas Steno Larsen warns that if token pricing continues to weaken, the current cycle of trades spanning memory, broader hardware, and data centers may come to an end.

Token spending growth is showing signs of fatigue, and the market's core focus on AI is rapidly shifting from "technical feasibility" to "cost affordability."

On June 9, macro strategist Andreas Steno Larsen stated on social media that the trend of the Silicon Data LLM Token Spending Index is currently the most important chart for the entire market to watch.

The index has more than doubled since last December and surged significantly before May 2026, but it has recently pulled back. Larsen warns that if token pricing continues to weaken, the current cycle of trades spanning memory, broader hardware, and data centers may come to an end.

Meanwhile, tech giants are moving quickly to rein in runaway internal AI compute consumption.

As previously reported by Wallstreetcn, Amazon and Microsoft are cutting back on internal AI tools or halting projects that track usage to combat "Tokenmaxxing," a behavior in which employees consume computing power inefficiently to boost their internal rankings. On the service side, GitHub Copilot switched its billing model from per-request to per-token on June 1, causing some users' monthly bills to rise more than tenfold and sparking widespread skepticism about the sustainability of AI subsidy models.

These signals are reshaping investors' risk assessments regarding AI infrastructure trades. Marginal changes in token spending directly impact capital expenditure expectations for NVIDIA, memory chip manufacturers, and cloud service providers through the transmission chain of GPU computing power, DRAM memory, and data center demand.

Indicator Peaking: Hardware Trade Logic Faces Test

The Silicon Data LLM Token Spending Index is an expenditure-weighted metric that measures the price paid per million LLM tokens across the entire market, serving as a proxy for the market's marginal willingness to pay for AI. Because major suppliers such as OpenAI, Anthropic, and Google mostly bill customers by token consumption, token spending ties AI usage directly to demand for GPUs, DRAM, and data centers.
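To make "expenditure-weighted" concrete, here is a minimal sketch of one plausible reading: each model tier's price per million tokens is weighted by its share of total spend. Silicon Data's actual methodology is not described in this article, so the function name, the input layout, and every figure below are illustrative assumptions, not market data.

```python
def spend_weighted_price(records):
    """Expenditure-weighted average price per million tokens.

    records: iterable of (spend_usd, price_per_million_usd) pairs,
    one per hypothetical model tier.
    """
    records = list(records)
    total_spend = sum(spend for spend, _ in records)
    if total_spend <= 0:
        raise ValueError("total spend must be positive")
    # Each tier's price counts in proportion to the dollars spent on it.
    return sum(spend * price for spend, price in records) / total_spend


# Illustrative figures only: a frontier tier at $9.00 and a cheap tier at $0.10.
frontier_heavy = [(900.0, 9.00), (100.0, 0.10)]
balanced_mix = [(500.0, 9.00), (500.0, 0.10)]

print(round(spend_weighted_price(frontier_heavy), 2))
print(round(spend_weighted_price(balanced_mix), 2))
```

Under these made-up inputs the weighted price falls from about 8.11 to about 4.55 even though neither tier changed its list price. A metric built this way can decline purely because spending shifts toward cheaper models.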

The recent stagnation of this index has triggered concerns in capital markets about the hardware cycle. Comments from Silicon Data suggest that the recent pullback may indicate a slowing pace of migration to high-end closed-source models. If token spending remains weak, the marginal revenue funding incremental purchases of GPUs, DRAM, and data centers will diminish, altering the risk profile of companies that have structured their capital expenditure plans around token-driven growth.

A single decline does not establish a trend. But as a leading indicator for the hardware cycle, the data suggests that corporate reliance on high-cost frontier models may be facing a systemic decline.

Billing Crisis: Tech Giants Halt "Inefficient Consumption"

The corporate AI boom is encountering its first true billing crisis.

According to Axios, citing an AI consultant, one of their corporate clients recently spent $500 million on Claude in a single month, simply because there were no caps on employee usage.

Within enterprises, the practice of using AI usage as a performance metric has also backfired. Reports indicate that Amazon's developer platform, Kiro, once had an internal leaderboard called "Kirorank." Similar behavior emerged at Meta, where employees tried to inflate token consumption to climb the rankings.

Amazon Senior Vice President Dave Treadwell acknowledged that employees were assigning meaningless tasks to AI to climb the rankings, thereby driving up the company's operating costs. He explicitly instructed employees "not to use AI for the sake of using AI," and the beta dashboard was subsequently taken offline. Amazon has now shifted to tracking the actual value of AI-generated code using a "normalized deployment" metric instead of token consumption.

Pricing Rebound: The End of the Subsidy Era

On the supply side, the AI industry's long-standing business model of exchanging subsidies for growth is nearing its limit.

On June 1, GitHub Copilot officially switched to billing based on token usage. A user on Reddit stated that their monthly fee was expected to skyrocket from under $45 to over $847.
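The arithmetic behind such a jump can be sketched as follows. Under per-request billing the bill depends only on how many requests are made. Under per-token billing it scales with how many tokens each request consumes, which is where long agentic sessions become expensive. The function names and all rates below are hypothetical and are not GitHub's actual prices; only the $45 and $847 figures come from the Reddit post cited above.

```python
def per_request_bill(requests, price_per_request):
    """Monthly bill when every request costs the same, regardless of size."""
    return requests * price_per_request


def per_token_bill(requests, tokens_per_request, price_per_million):
    """Monthly bill when cost scales with total tokens consumed."""
    total_tokens = requests * tokens_per_request
    return total_tokens / 1_000_000 * price_per_million


# Hypothetical rates, chosen only for illustration.
requests = 1500
flat = per_request_bill(requests, 0.03)
light_user = per_token_bill(requests, 2_000, 10.0)    # short completions
agent_user = per_token_bill(requests, 60_000, 10.0)   # long agentic sessions

# Multiple implied by the Reddit figures cited above ($45 -> $847).
reported_multiple = 847 / 45

print(flat, light_user, agent_user, round(reported_multiple, 1))
```

With the same request count, the light user in this sketch pays less than the flat rate while the agent-heavy user pays many times more. The reported figures imply an increase of roughly 19 times.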

GitHub Chief Product Officer Mario Rodriguez previously stated that with the rise of agentic AI, the old pricing model was no longer sustainable. Gartner analyst Arun Chandrasekaran told Business Insider that as advanced reasoning models drive up computing consumption, more enterprises will shift to usage-based billing.

Investor Tommy Shaughnessy warned of the systemic risks of this subsidy model. He noted that the profit margins of major AI companies are deeply negative, and that once enterprises face the real prices of usage-based billing, actual consumption rates will far exceed expectations. Uber, for example, exhausted its full-year 2026 AI budget within four months. If investors lose confidence in expected returns, the capital flows supporting GPU purchases and model training could reverse.

Cost Restructuring: Cheap Models May Dominate the Market

Faced with high inference costs, the market is seeking low-cost alternatives.

Rich Privorotsky, head of Delta-One at Goldman Sachs, believes the easing of infrastructure bottlenecks is triggering a price war, with DeepSeek cutting its pricing by 75% and Xiaomi's MiMo cutting prices by nearly 99%.

As previously reported by Wallstreetcn, Coinbase CEO Brian Armstrong predicts that 80% of AI workloads will migrate to models that are 99% cheaper within 12 to 18 months, with only the 20% of tasks that require extreme intelligence remaining on frontier models. He pointed out that energy and computing power will become the true bottlenecks.
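A back-of-the-envelope sketch shows what Armstrong's numbers would mean for total spend. It assumes the 80/20 split applies to spend at today's prices and that workload volume stays constant. Both are simplifying assumptions added here, and the function names are hypothetical.

```python
def blended_cost_ratio(share_migrated, discount):
    """Spend after migration as a fraction of spend before it.

    share_migrated: fraction of workloads moving to the cheaper models.
    discount: price cut on those models (0.99 means 99% cheaper).
    Assumes the remaining workloads keep paying today's price.
    """
    stays = 1.0 - share_migrated
    moves = share_migrated * (1.0 - discount)
    return stays + moves


def volume_growth_to_hold_spend(ratio):
    """How many times usage must multiply to keep total spend unchanged."""
    return 1.0 / ratio


ratio = blended_cost_ratio(0.80, 0.99)
print(round(ratio, 3))                               # fraction of today's spend
print(round(volume_growth_to_hold_spend(ratio), 2))  # required usage multiple
```

Under these assumptions spend falls to about 21% of its current level, so usage would need to grow nearly fivefold just to keep total token spending flat. That gap is why the pricing trend matters for the hardware trade.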

Hugging Face CEO Clement Delangue cited Stanford University data to confirm this trend: the accuracy of local models in real-world queries has jumped to 71.3%, at an extremely low cost. Micro1 CEO Ali Ansari views this as a "healthy swing" from overuse to rational use.

On the true return on investment for AI, Wall Street remains sharply divided. According to Jim Schneider at Goldman Sachs, agentic AI will drive a 24-fold increase in token consumption by 2030, and cloud service providers' gross margins will turn positive in the short term. Economic research from JPMorgan also argues that the rapid growth in Python packages on PyPI demonstrates genuine productivity improvements.

However, the bearish camp remains equally firm. In a report, Goldman Sachs semiconductor analyst Jim Covello pointed out that the current prosperity of the industrial chain comes at the expense of upstream consumption, with almost all value flowing to semiconductor companies, a situation that is unsustainable.

Boosted.ai CEO Josh Pantony emphasized that corporate concerns about opening up data weaken the effectiveness of AI agents. With cost, return, and security all in play, how much real value the next AI bill delivers will be the market's final verdict on this technology investment.