Why is the subscription model for AI services destined to disappear?

Author: Zhang Yongyi, GeekPark

On June 9th, Anthropic released its most powerful publicly available model to date, the Claude Fable 5. As is customary, this should be a celebration for paying users—your monthly payments have finally earned you the privilege of being among the first to experience the flagship.

However, one line in the announcement immediately sparked huge controversy after its release: After June 22, Fable 5 will be removed from all subscription plans, and continued use will require the purchase of usage credits separately.

In other words, even if you buy a membership, you can only use the flagship model for 14 days.

This is the first time in the large model industry that a model has been released with an "eviction notice" on the very day it was released.

Many people see it as a mistake by Anthropic, or an act of arrogance. My view is quite the opposite: it wasn't a mistake, it was a harbinger.

AI subscription models are heading towards their inevitable demise—not because any company is greedy, but because the very premise upon which subscription models are built is being dismantled by AI itself.

Flagship model, 14 days to go

Let's clarify the facts first. According to Anthropic's official schedule (June 9, 2026), Fable 5 will be included for free in Pro, Max, Team, and per-seat Enterprise plans from the date of release until June 22. Starting June 23, it will be removed from these plans, and every token thereafter will be deducted from prepaid usage credits at the same rate as the API.

This fee isn't cheap: $10 per million input tokens and $50 per million output tokens, exactly twice that of the previous flagship Opus 4.8. Even more subtly, even during the free trial period, Fable 5 is weighted roughly twice in subscription credits—the same work burns through credits twice as fast as Opus.

The users' reactions were predictable. Some on Hacker News bluntly stated that this "give first, collect later" practice was unsettling, suspecting that Anthropic was trying to push subscribers towards pay-as-you-go billing; some developers even tested it and found that on the $100-per-month Max plan, a single agent programming session consumed nearly $100 worth of tokens .

 Users are complaining on social media that they don't have enough tokens to go around | Image source: Twitter

Moreover, this isn't just Anthropic's move. For the past eight weeks, the entire industry has been doing the same thing: On April 2nd, OpenAI changed Codex from message-based billing to token-based billing aligned with its API, and subsequently expanded it to all existing enterprise customers.

GitHub froze new registrations for the personal version of Copilot on April 20, and a week later announced a full switch to AI Credits billing, which was completed on June 1—the Pro tier costs $10 per month, which comes with $10 in credits.

Anthropic itself has been the most active: Starting April 4th, it prohibited third-party agent frameworks such as OpenClaw from consuming subscription credits, and such usage will now be subject to pay-as-you-go pricing; on April 21st, the Claude Code section for the Pro plan on the pricing page quietly turned into a red cross, causing an uproar in the community, and was subsequently withdrawn within 24 hours, with the official explanation being "a small test for approximately 2% of new registered users"; on May 14th, it was officially announced that starting June 15th, the Agent SDK and headless calls will be removed from the subscription pool and replaced with independent credits measured by API rates.

Three companies, eight weeks, the same direction—this is not a coincidence, but rather the entire industry offering the same answer to the same mathematical problem.

What does that math problem look like?

Pricing is never based on computing power.

The research firm SemiAnalysis recently put this math problem to the table. They bought one subscription to each of Anthropic and OpenAI, ran long-duration programming tasks until they exhausted their weekly limits, and then calculated the value of these usages based on the API rates.

Previously, the industry consensus was that a $200 monthly plan could at most generate around $2,000 in tokens. However, actual test results far exceeded this: the $20 Claude Pro reached a maximum of around $400; the $200 Max 20x reached around $8,000. OpenAI's results were even more impressive—the $20 ChatGPT Plus generated around $700, and the $200 Pro 20x reached around $14,000.

 The highest subsidy multiplier is 70 times | Image source: SemiAnalysis

Two things need to be said upfront: this is the upper limit of "full utilization," not the daily usage level for ordinary users; the API price includes gross profit, and the converted figure does not equal the actual computing power cost. However, the pricing must cover the upper limit—insurance companies cannot assume that no one will make a claim.

 SemiAnalysis Real-world comparison of usable data across different subscription tiers | Image source: X @kimmonismus / SemiAnalysis

Subsidies themselves aren't fatal. Streaming media companies have subsidized, ride-hailing apps have subsidized; burning money for growth is an age-old technique in the internet industry. What's truly fatal is the fundamental difference between AI subscription models and these other approaches.

Netflix dares to sell monthly subscriptions because of two things: the marginal cost of adding an extra show is close to zero, and a person only has a maximum of 24 hours a day to watch. Spotify operates on the same principle. The implicit premise of a monthly subscription model is that consumption is locked into human physiological limits—what it truly prices is never the content, but the person's time.

AI in the chatbot era barely meets this premise. No matter how much a person can chat, the amount of typing they do in a day is limited; the large amount of idle credit available to light users is enough to cover the excessive spending of heavy users.

Then, the Agent arrived.

What's a typical agent task like? It reads 20 files, makes plans, modifies code, runs tests, reads error messages, and iterates again— one round consumes 5 to 30 times more tokens than a normal conversation . Even worse, it doesn't require your presence. I've experienced this myself: recently, I had an agent compile flight data for two airports. I went to take a shower, and when I came back, the task was finished, and my credit limit was depleted. You're sleeping, but the electricity meter is running.

Agents don't eliminate price caps, they eliminate consumption caps. And the entire evolution of the AI industry—longer tasks, greater autonomy, multiple parallel instances—is racing towards the same endpoint:

Completely remove people from the consumption process .

GitHub stated it quite plainly in its announcement: the agent usage is "becoming the default." In other words, the scenarios where subscription models can still barely hold up—that is, people sitting in front of a screen chatting one sentence at a time—will only account for a smaller and smaller share of the value proposition of AI.

At this point, some people might ask: If the subsidies are too deep, why not just raise the price?

It was mentioned before, and then an even worse result was obtained. Looking back at the SemiAnalysis table, there was an unusual detail: the more expensive the tier, the higher the subsidy multiplier.

For Claude, the multiplier for the $20 plan is 20x, and for the $200 plan it's 40x; for OpenAI, it's increased from 35x to 70x. Half of this is due to pricing design—the higher tiers offer multiplied spending, essentially a discount for large clients; the other half is user behavior—people who spend $200 on the 20x plan are aiming for maximum usage, while casual users simply won't be in that tier.

This has a name in the insurance industry: adverse selection. When a policy's pricing attracts only the highest-risk policyholders, that policy has no actuarial viability. Any fixed price will precisely filter out the group of users whose usage exceeds its value—this isn't a management problem, it's a structural problem; adjusting prices will only make the filter even finer.

Throughout 2025, the industry essentially tried every patch imaginable. In January, Sam Altman admitted on X that the $200-per-month ChatGPT Pro was losing money because usage far exceeded expectations—a price increase tier had failed.

 OpenAI tried, but failed | Image source: X

In the middle of the year, Cursor changed its billing method from per-request to per-computation, which triggered a large number of unsubscribes, and the CEO publicly apologized—the rule change failed halfway through; in the summer, Anthropic added a weekly limit to Claude Code, citing that some users were running the agent around the clock, consuming tens of thousands of dollars of computing power per person—the rate limiting only provoked anger.

Only after all the patches failed did this collective showdown take place over the past eight weeks. Nick Turley, head of OpenAI's ChatGPT, put it bluntly on the BG2 podcast: " In this day and age, offering unlimited data plans is probably like offering unlimited electricity plans ."

The shell is still there, but the core is dead.

Of course, there's also a seemingly compelling counter-argument: subscription models are thriving. ChatGPT Plus is still $20 a month, Claude Pro is still being sold, and GitHub's code completion service even retains a monthly subscription. Is the notion of its demise just alarmist rhetoric?

This rebuttal deserves serious consideration because the phenomenon it describes is real. However, it misinterprets what is dead.

The essence of the subscription model has never been the "monthly deduction" format, but rather the promise of "fixed price, worry-free use"—you don't have to calculate the cost of each use, which was the whole reason it defeated pay-per-use back then.

What's happening now is that the deduction period remains, but the promise has been withdrawn .

GitHub Pro's $10 monthly fee includes $10 in credits, available only once used up—this isn't a subscription, it's a prepaid card disguised as one. Anthropic's credits are deducted based on API fees, and OpenAI's credits support automatic top-ups. The subscription model won't be abolished; it will be hollowed out: the shell remains, but the core is dead.

 GitHub Copilot's official announcement regarding its shift to AI Credits billing | Image source: GitHub

There's one true enclave left: pure chat. It can still be offered on a monthly basis because it's the last AI-driven consumption scenario where human time still plays a role. But even a stronghold can't protect this enclave—every penny invested in R&D in this industry is pushing AI from "you ask, it answers" to "it proactively helps you complete tasks." Chat subscriptions won't be killed off; they will be marginalized: remaining where they are, watching the real value and real revenue gradually move into the pay-as-you-go world.

Another coincidence in timing is hard to ignore: according to a TechCrunch report (June 2026), Anthropic was preparing for an IPO with OpenAI around the time Fable 5 was released. For the past three years, subsidies have been funded by venture capital; investors in the public market will not accept a profit and loss statement that shows "more losses for each additional heavy user." The timeline for capital exit dictates that the showdown will not be postponed indefinitely.

This means different things to different people. For businesses, AI spending will henceforth be managed like cloud spending—according to The Information, Uber's CTO stated in an internal memo that the company burned through its entire 2026 AI budget in just four months, and budgeting, installing monitoring systems, and implementing task routing models will become mandatory for every team. For individual users, in the past, light users subsidized heavy users; now, everyone pays for their own electricity meter .

 Uber's AI budget shift has also sparked considerable controversy | Image source: The information

To be honest, this isn't necessarily all bad. After price signals returned, the question of "Is this task worth letting AI run?" became a real question for the first time—and when an industry starts to seriously answer this question, it's often the beginning of it moving away from the narrative of burning money and towards normal business.

At this point, I'd like to add a comment: before the electricity meters are installed, the current subscription model may be the most generous the industry has ever been to its users—use it while you can and cherish it.

The logic lies hidden in the SemiAnalysis table. From the user's perspective, it's not a death sentence, but a list of benefits still in effect: you pay $200 per month, and the platform will cover up to $14,000 of your computing power. The last time we saw this level of subsidy was during the ride-hailing and food delivery wars—and we all remember the outcome of those two wars: after the subsidies ended, prices never recovered.

So, do the important tasks now. For example, the subscription window for Fable 5 only lasts until June 22nd. Instead of waiting for the points era to come and then carefully calculating your spending, you should schedule those long-term tasks you've always wanted to do but thought were too expensive. This isn't about taking advantage of loopholes—it's simply about being a savvy beneficiary of a pricing error that's destined to be corrected.

Turley's analogy may have conveyed a deeper meaning than he intended. The true sign that electricity has become infrastructure is not that it reaches every household, but that every household has an electricity meter installed—from that moment on, no one discusses whether electricity should be sold on a monthly basis; people only discuss electricity prices.

There will be no obituary for the subscription model. It will simply appear as a small line in your expense statement called "Admission Fee" on a quiet billing date.

Until then—use it while you can, and cherish it.