Stratechery Debunks the AI Bubble Myth: What Should We Do with AI?

By: blockbeats|2026/03/17 13:00:06

Original Article Title: Agent Over Bubbles
Original Article Author: Ben Thompson, Stratechery
Translation: Peggy, BlockBeats

Editor's Note: Amid continued hype around AI investment and the industry's narrative, the question of "whether there is a bubble" has become a central topic of market debate. On one hand, extreme-risk narratives keep reinforcing concerns about runaway technology; on the other, rapid capital expenditure and high valuations have kept the "bubble theory" alive. Given this divergence, market judgment remains highly uncertain.

The author of this article, Ben Thompson, is the founder of the technology analysis platform Stratechery, and has long focused on the structural evolution of the technology industry and business models. On the occasion of NVIDIA GTC 2026, he revised his previous judgment on "whether AI is in a bubble": no longer seeing the current situation as a bubble, but understanding it as a round of structural growth driven by a technological paradigm shift.

This judgment rests on three key leaps in Large Language Models (LLMs). Since ChatGPT first demonstrated the capabilities of large language models to the market in 2022, LLMs have evolved from being "available but unreliable" to "having reasoning abilities," and then to "being able to independently perform tasks." In particular, by the end of 2025, with the release of Anthropic Opus 4.5 and OpenAI GPT-5.2-Codex, agentic workloads began to move from concept to reality.

The key lies not in the model itself, but in the emergence of the "agent harness." The agent decouples users from the model and takes responsibility for scheduling the model, calling tools, and validating results, transforming AI from a tool that requires continuous human intervention into an execution system that can be entrusted with tasks. This change not only improves reliability but also expands the application boundaries of AI.

Building on this paradigm shift, the author further points out that the expansion of AI demand is no longer determined by user scale, but more by the scheduling capacity of each user; at the same time, agentic workloads have a "winner-takes-all" feature, which will continue to drive up the demand for high-performance computing power and bring structural opportunities for chip manufacturers and cloud service providers.

In this framework, current large-scale capital expenditures are no longer just speculative bets on the future but are more likely a preemptive reflection of real demand. As AI transitions from an "assistance tool" to an "execution infrastructure," its economic impact may only just be beginning to show.

Original Text:

In the past, I was more inclined to the bubble view, and even thought that at some stages a bubble might not be a bad thing.

But at this moment, standing in March 2026, at the opening of NVIDIA GTC, my judgment has changed: This might not be a bubble. (Ironically, this judgment itself may indeed be a signal of a bubble.)

LLM's Three Paradigm Shifts

Over the past few weeks, while discussing NVIDIA's and Oracle's earnings reports, I have mentioned multiple times that LLMs have undergone three key paradigm shifts.

Phase One: ChatGPT

The first inflection point, which almost goes without saying, was the release of ChatGPT in November 2022. Although Transformer-based large language models had appeared as early as 2017 and their capabilities were continuously improving, they had been consistently underestimated. As late as October 2022, I still argued in a Stratechery interview that while the technology was impressive, it lacked productization and entrepreneurial momentum.

However, everything changed a few weeks later. ChatGPT made the world truly aware of what LLMs could do for the first time.

The early versions also left two lasting impressions, which the "bubble theorists" in particular keep repeating:

First, the model often made mistakes and would even "hallucinate" a response when it did not know the answer. This made it seem more like a "showy tool": amazing but unreliable.

Second, despite this, it was still very useful, but you had to know how to use it, constantly validate outputs, and correct errors.

Phase Two: o1

The second inflection point was OpenAI's release of the o1 model in September 2024. By then, LLMs had progressed significantly thanks to stronger base models and post-training techniques, resulting in more accurate outputs and fewer hallucinations.

But the key breakthrough of o1 was: it would "think" before answering.

Traditional LLMs are path-dependent: once they veer off course in the reasoning process, they keep going in the wrong direction. This is a fundamental weakness of "auto-regressive models." In contrast, a reasoning model assesses its own answers: it generates an answer first, then judges its accuracy, and if necessary tries other paths.

This means that the model starts actively managing errors, reducing the user intervention burden. The results are also very significant. If ChatGPT's breakthrough was in "making LLMs usable," then o1's breakthrough was in "making LLMs reliable."

Phase Three: Agent (Opus 4.5 / Codex)

By the end of 2025, the third leap emerged.

In November 2025, Anthropic released Opus 4.5, which initially met with a lukewarm reception. However, by December, Claude Code running on this model suddenly exhibited unprecedented capabilities; almost simultaneously, OpenAI released GPT-5.2-Codex, showcasing a similar level of performance.

People had been talking about "Agents" all along, but at this moment, they finally began to truly complete tasks, even complex ones that took hours, and do so correctly.

The key lies not in the model itself, but in the control layer (harness). Users no longer interact directly with the model; instead, they give the Agent an objective, and the harness schedules the model, calls tools, executes processes, and validates results.

Using programming as an example:

· Phase One: Model generates code

· Phase Two: Model reasons through the generation process

· Phase Three: Agent generates code → Automatically runs tests → Retries if incorrect, with minimal ongoing user intervention.
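
The Phase Three loop (generate, run tests, retry on failure) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how Claude Code or Codex actually works: `run_agent`, `fake_model`, and `run_tests` are hypothetical names, and the "model" is a stub whose first draft is deliberately buggy.

```python
from typing import Callable, Optional


def run_agent(
    generate: Callable[[str, Optional[str]], str],
    run_tests: Callable[[str], Optional[str]],
    goal: str,
    max_attempts: int = 3,
) -> Optional[str]:
    """Generate code, test it, and retry with feedback until the tests pass."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        candidate = generate(goal, feedback)  # ask the model for a draft
        feedback = run_tests(candidate)       # None means every test passed
        if feedback is None:
            return candidate
    return None  # give up and hand the task back to the user


def fake_model(goal: str, feedback: Optional[str]) -> str:
    # Hypothetical stand-in for a real model call: the first draft is buggy,
    # and the draft written after seeing test feedback is correct.
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def run_tests(source: str) -> Optional[str]:
    # Execute the candidate and check one behavior; return an error message
    # for the model to read, or None on success.
    namespace: dict = {}
    exec(source, namespace)
    result = namespace["add"](2, 3)
    if result != 5:
        return f"add(2, 3) returned {result}, expected 5"
    return None


if __name__ == "__main__":
    print(run_agent(fake_model, run_tests, "write add(a, b)"))
```

The user supplies only the goal; the error message from the failed test, not a human, is what steers the second attempt.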

This means that the core limitations of the ChatGPT era are being systematically addressed, leading to higher accuracy, stronger reasoning abilities, and automatic validation mechanisms.

The only remaining question is: What should it be used for?

The Lowering Threshold of "Agency"

The reason I emphasize these three inflection points repeatedly is to illustrate why the entire industry is facing a severe compute shortage and why massive-scale capital expenditure is justified.

The three paradigms have vastly different compute requirements:

· Phase One: Training intensive but low inference costs

· Phase Two: Soaring inference costs (more tokens + higher usage frequency)

· Phase Three (Agent): Multiple calls to inference models, Agent itself consuming compute (potentially CPU-heavy), further explosion in usage frequency

More important is the third phase: the shift it brings to the structure of demand is severely underestimated.

Currently, far more people use chatbots than Agents, and many actually underutilize AI. This is because using AI requires agency on the user's part. An LLM is a tool; it has no objectives and no will, and it does nothing until someone invokes it.

The Agent changes that by reducing the amount of human agency required. In the future, one person will be able to command multiple Agents simultaneously.

This means that even if only a few individuals possess "agency," it is enough to drive significant computing power demand and economic output.

AI still requires a "human driver," but no longer needs "many humans."
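
To make the arithmetic behind this argument concrete, the sketch below multiplies out daily token demand for two workloads. Every number in it is a made-up placeholder chosen for illustration, not a measurement or a figure from the article, and `Workload` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    active_users: int
    tasks_per_user_per_day: int
    model_calls_per_task: int
    tokens_per_call: int

    def daily_tokens(self) -> int:
        # Demand is the product of all four factors, so growth in per-user
        # factors can outweigh a much smaller user base.
        return (
            self.active_users
            * self.tasks_per_user_per_day
            * self.model_calls_per_task
            * self.tokens_per_call
        )


# Placeholder assumptions only: many chatbot users with light usage,
# versus few agent users who each delegate many long-running tasks.
chatbot = Workload(active_users=1_000_000, tasks_per_user_per_day=5,
                   model_calls_per_task=1, tokens_per_call=1_000)
agents = Workload(active_users=10_000, tasks_per_user_per_day=20,
                  model_calls_per_task=200, tokens_per_call=5_000)

if __name__ == "__main__":
    print(chatbot.daily_tokens())
    print(agents.daily_tokens())
    print(agents.daily_tokens() / chatbot.daily_tokens())
```

Under these placeholder inputs, one hundredth as many users generate forty times the token demand, which is the shape of the claim above; real ratios depend entirely on real usage data.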

Enterprise Payment Driver

The consumer side's willingness to pay for AI is limited, and this has become increasingly clear. The true payers for productivity are enterprises.

What excites enterprises the most is not just AI improving efficiency, but AI's ability to replace labor and do so more efficiently.

The current reality is that within large corporations, those truly driving the business forward are often only a few people, yet the organizations are large, which creates significant coordination costs. The role of the Agent is to amplify the influence of these "value-driving individuals" while reducing organizational friction.

The result is "fewer people → higher output → lower costs." This is also why future layoffs may not only be "cyclical adjustments" but rather structural changes.

Companies will rethink not only whether they "hired too many people during the pandemic" but also whether, in the AI era, they simply do not need as many people.

Why Is This Not a Bubble?

From this perspective, the logic of "not being a bubble" becomes clearer:

1. The core flaws of LLMs are being continuously addressed by computing power and architecture

2. The number of people required to drive demand is decreasing

3. The benefits brought by the Agent are not just cost reduction but also revenue increase

Therefore, it is not difficult to understand why all cloud providers are saying that computing power is in short supply and are consistently increasing capital expenditures.

Agent and Value Chain Restructuring

Another key question is, if the model eventually becomes a commodity, can OpenAI and Anthropic still make money?

The traditional view is that they cannot, but the Agent changed that. The key is that the real value is not in the model itself but in the integration of the "model + control system."

Profits often flow to the "integration layer," rather than to the replaceable modules. Apple is an example: its hardware is not commoditized because of its deep integration with software. Similarly, the Agent requires deep synergy between the model and the harness, making OpenAI and Anthropic key integrators in the value chain rather than replaceable parts.

Microsoft's transition is a signal; it originally emphasized "replaceable models" but had to abandon that after launching a true Agent product.

This means that models may not necessarily be fully commoditized, as Agents require integrated capabilities.

The Final Paradox

I must return to the paradox at the beginning.

I have always believed that as long as people are still worried about a bubble, it is not a bubble; a true bubble is when no one questions it anymore.

And now, my conclusion is: this is not a bubble.

But if the very act of me saying "this is not a bubble" proves it is a bubble, then so be it.

[Original Article Link]
