On June 9, 2026, Anthropic released Claude Fable 5, a new model focused on complex code planning and generation. The release itself wasn't surprising, but two accompanying pieces of information quickly ignited discussion in the development community: Fable 5's API pricing was twice that of its predecessor, Opus 4.8; and after June 22, Fable 5 would be removed from Pro, Max, and other subscription plans, accessible only through API calls or usage credits.
Developers on Reddit reported that running Fable 5 on a Max 20x plan consumed 2% of their credit limit per minute. A user on Hacker News recorded their daily usage, showing that they spent $82.92 on Fable 5 API tokens within their credit limit. It's powerful, but it burns through cash quickly.
This is about more than the price of one model. When top-tier AI is priced in tiers by capability, and the highest tier is pulled out of the all-in-one subscription, a more pressing question arises: who gets to use the best models?
Double the price, and a countdown
According to Anthropic's official announcement, Claude Fable 5 is the first publicly released Mythos-level model. Mythos is Anthropic's internal rating for its highest-capability models, previously available only to Project Glasswing partners. Fable 5 is essentially a "publicly released" version of Mythos 5, offering significant performance improvements over Claude Sonnet in scenarios such as building precise code structures and understanding what developers actually need.
This capability comes with a direct cost. According to Anthropic's official pricing page, the Fable 5 API input price is $10 per million tokens, and the output price is $50. For comparison, Opus 4.8, also under Anthropic, is priced at $5 and $25 respectively, while Sonnet 4.6 is $3 and $15. Fable 5's output price is 3.3 times that of Sonnet.
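To make the price gap concrete, the following minimal sketch applies the list prices quoted above to one request. The workload of 200,000 input tokens and 40,000 output tokens is a hypothetical figure chosen for illustration, and `request_cost` is an illustrative helper, not part of any Anthropic SDK.

```python
# List prices in USD per million tokens, as quoted in the article: (input, output).
PRICES_PER_MILLION = {
    "Fable 5": (10.0, 50.0),
    "Opus 4.8": (5.0, 25.0),
    "Sonnet 4.6": (3.0, 15.0),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the quoted list prices."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 200k tokens in, 40k tokens out.
for model in PRICES_PER_MILLION:
    print(f"{model}: ${request_cost(model, 200_000, 40_000):.2f}")
```

Under these assumptions the same request costs about $4.00 on Fable 5, $2.00 on Opus 4.8, and $1.20 on Sonnet 4.6.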
Access is also limited. Anthropic confirmed that Fable 5 will be included in all subscription plans until June 22nd; starting June 23rd, Fable 5 will be removed from subscription plans, and users will need to use usage credits to access it. The official statement is that "it will be restored to standard subscription functionality when capacity allows," but no timeline was given.
Ethan Mollick, a professor at the Wharton School of the University of Pennsylvania, published an in-depth review of Fable 5 on his blog oneusefulthing.org. He wrote, "Fable is twice the price of Opus, and the rate at which tokens are consumed indicates that production costs will be 'very high'." Mollick has long tracked the evolution of AI model capabilities, and his point here concerns not the pricing strategy itself but the infrastructure cost of running the model.
How fast are tokens being consumed? Reddit users report that using Fable 5 under the Claude Max 20x plan consumes approximately 2% of their credit limit per minute. This plan costs $200 per month; at this rate, the entire month's credit would be exhausted in less than an hour of continuous use. Hacker News developer Simon Willison recorded his daily API usage, consuming $82.92 of his credit limit.
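The "less than an hour" figure follows from simple arithmetic, sketched below. The only input is the 2% per minute rate reported by Reddit users; `minutes_until_exhausted` is an illustrative helper name, and the calculation assumes the rate stays constant.

```python
def minutes_until_exhausted(percent_per_minute: float) -> float:
    """Minutes of continuous use before 100% of a credit limit is consumed."""
    if percent_per_minute <= 0:
        raise ValueError("percent_per_minute must be positive")
    return 100.0 / percent_per_minute


# Rate reported by Reddit users on the Max 20x plan.
minutes = minutes_until_exhausted(2.0)
print(f"{minutes:.0f} minutes ({minutes / 60:.2f} hours)")
```

At a steady 2% per minute, the limit is reached after 50 minutes of continuous use.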
Fable 5's pricing and access strategy sends a clear signal: the added capability of the next-generation model shows up directly as a price multiplier. The "free trial" inside the subscription plans lasts only long enough for users to form a habit. When it expires, users face two choices: accept less predictable pay-as-you-go API costs to keep using the model, or fall back to a lower-tier model.
Tiered from $9 to $120
Fable 5's pricing is not an isolated case. Looking at the pricing of currently available mainstream model APIs, a steep price spectrum is clearly visible.
According to Google's official pricing page, the paid tier output price for Gemini 3.5 Flash is $9 per million tokens. OpenAI's official pricing page shows that GPT-5.4 outputs are $15, GPT-5.5 is $30, and GPT-5.5 Pro is $120. Anthropic's Sonnet 4.6 output is $15, Opus 4.8 is $25, and Fable 5 is $50.
From $9 for Gemini 3.5 Flash to $120 for GPT-5.5 Pro, the gap in output price is more than 13-fold. This is no longer a simple "high-end vs. low-end" dichotomy. Vendors are actively building a three-tier structure: at the bottom, a very cheap entry-level model that drives adoption and customer acquisition; in the middle, a mid-priced high-performance model for everyday development; and at the top, an expensive but most capable reasoning model aimed at high-frequency, high-value use.
Behind the tiered pricing is an explicit labeling of capability levels. Anthropic uses four levels (Sonnet, Opus, Fable, and Mythos) to categorize model capabilities; OpenAI uses Standard and Pro versions to differentiate within its GPT-5 series; and Google uses Flash and Pro to position models within its Gemini 3 series. These levels are no longer just internal designations; they are reflected directly in the price.
An even more notable change concerns access. In the model list on Anthropic's official pricing page, Fable 5 is marked as "included in subscription plans until June 22nd," while Opus 4.8 and Sonnet 4.6 carry no such time limit. Anthropic is experimenting with a new kind of tiering: instead of assigning different models to different subscription tiers by capability, it is separating the top-level model from the subscription system altogether and giving it a separate, pay-per-use API channel.
Anxiety about uncontrollable bills
At $50 per million output tokens, the price is not especially high by enterprise procurement standards. The concern is volume. Fable 5's context window holds up to 1 million tokens, with a maximum output of 128,000 tokens, so a single complex task can consume tens to hundreds of thousands of tokens. For a reasoning model, the ability to "think longer and generate more tokens" is not a design flaw but a strength: the model works through multiple reasoning steps before producing its final answer, and every step incurs token costs.
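A rough upper bound on a single request can be sketched from the limits and list prices quoted in this article. The sketch treats a full 1-million-token context as input plus the maximum 128,000 output tokens, and assumes reasoning tokens are billed at the output rate; if the output has to fit inside the same context window, the real ceiling would be slightly lower. The constant and function names are illustrative.

```python
# Figures quoted in the article for Fable 5.
FABLE_INPUT_USD_PER_MILLION = 10.0
FABLE_OUTPUT_USD_PER_MILLION = 50.0
MAX_CONTEXT_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 128_000


def worst_case_request_cost() -> float:
    """Upper-bound USD cost of one request that fills the context and the output cap."""
    input_cost = MAX_CONTEXT_TOKENS * FABLE_INPUT_USD_PER_MILLION / 1_000_000
    output_cost = MAX_OUTPUT_TOKENS * FABLE_OUTPUT_USD_PER_MILLION / 1_000_000
    return input_cost + output_cost


print(f"Upper bound per request: ${worst_case_request_cost():.2f}")
```

Under these assumptions a single maxed-out request costs roughly $16.40, so a handful of such requests is enough to reach a daily bill in the range developers have reported.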
The problem is that users have no control over this consumption. One developer on the Max 20x plan was burning through 2% of their credit limit per minute, and Simon Willison spent $82.92 in a single day. This usage occurred during the "trial period" and stayed within the plans' credit limits. After June 22, similar usage will translate directly into API bills or usage credits.
Tokens are the unit of account, but how many a query consumes is determined by the model and its designers, not by the user. If AI queries are compared to electricity consumption, the core contradiction is that users can neither choose a "power-saving mode" nor predict how much "power" the next query will draw. An industry discussion article published on LinkedIn summarized this as the core characteristic of the "AI tax": "The real AI tax is not just the price of the model, but its unpredictability."
This unpredictability hits individual developers far harder than businesses. Businesses can sign bulk agreements, set budget caps, and share costs across teams. For an individual on a pay-as-you-go API, a single serious debugging session could cost as much as an entire month's subscription fee. A Hacker News user commented, "Cost-conscious routing has gone from a nice-to-have to a mandatory requirement." "Cost-conscious routing" means sending tasks to a cheaper model by default and calling the more expensive model only when necessary. Before Fable 5, this was mostly an optimization; Fable 5's price and consumption rate have turned it into a hard requirement, because skipping it risks overspending.
Mainstream APIs let developers select a model on each call, so they can define their own routing logic. However, this requires users to be able to program, to understand the differences between models, and to accept the quality loss that can come with falling back to a weaker model. Each additional hurdle keeps more people out.
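The routing idea can be sketched in a few lines. The version below is a minimal illustration, not any vendor's implementation: the model labels, the `critical` flag, the budget, and the token estimate are all hypothetical inputs that a developer would have to supply, and no real API is called.

```python
from dataclasses import dataclass


@dataclass
class CostAwareRouter:
    """Pick a cheap model by default; escalate only for critical steps within budget."""

    default_model: str = "sonnet-4.6"   # hypothetical label for the cheaper model
    premium_model: str = "fable-5"      # hypothetical label for the expensive model
    premium_output_usd_per_million: float = 50.0  # output price quoted in the article
    budget_usd: float = 20.0            # hypothetical spending cap for premium calls
    reserved_usd: float = 0.0           # estimated premium spend committed so far

    def choose(self, critical: bool, expected_output_tokens: int) -> str:
        if not critical:
            return self.default_model
        # Estimate from output tokens only; input cost is ignored in this sketch.
        estimate = expected_output_tokens * self.premium_output_usd_per_million / 1_000_000
        if self.reserved_usd + estimate > self.budget_usd:
            return self.default_model  # over budget: fall back to the cheaper model
        self.reserved_usd += estimate
        return self.premium_model


router = CostAwareRouter(budget_usd=1.0)
print(router.choose(critical=False, expected_output_tokens=10_000))
print(router.choose(critical=True, expected_output_tokens=10_000))
```

The sketch reserves the estimated cost before the call is made, which is deliberately conservative. A real setup would reconcile the estimate against the token counts actually billed.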
A $200 monthly fee doesn't buy a full pass
The tiered access system is also changing. Claude Fable 5's strategy is this: before June 22, all paid users can use it; after June 22, even Max 20x users, who pay the top monthly fee of $200, can no longer use it under the subscription itself and must pay through usage credits or the API.
In contrast, OpenAI's access strategy takes a different approach. According to the official ChatGPT pricing page, ChatGPT Pro offers two tiers: $100 and $200, both providing access to GPT-5 Pro. Higher-tier model capabilities correspond to higher-tier subscription levels, but the subscription itself remains a complete access package.
The difference between the two strategies goes beyond price. OpenAI's model puts the barrier in the subscription fee: those who can afford $200 a month can use the best models. Anthropic's strategy for Fable 5 sets barriers in two places: cost (pay-as-you-go API billing) and technical skill (the ability to work with an API). In user discussions on Hacker News, some have called this strategy a "free sample drug strategy, then raises the price once you're addicted," while others believe it more likely reflects a genuine compute supply problem, with Anthropic currently unable to cover Fable 5's inference costs under a fixed-price subscription.
Regardless of the motivation, the effect is clear: a subscription has become an "admission ticket," not an "all-inclusive pass." Top-tier model capabilities are not included in the ticket price. TechCrunch, in its coverage of the Fable 5 release, noted that Fable 5 is "the first publicly released Mythos-level model." Before Fable 5, Mythos-level models were exclusively available to Project Glasswing partners. Now the barrier to entry has been lowered, but it hasn't disappeared.
Users taking detours
The tiered access system has already resulted in noticeable changes in user behavior. Some users have begun to look for ways to bypass official channels.
One approach is to call the model through third-party aggregation services. These "intermediaries" offer tokens below the official list price, typically sourced from idle quotas bought in bulk by enterprises, arbitrage on price differences between regions, or undisclosed channels. The price is lower, but privacy protection and stability are not guaranteed. Related discussions keep growing on platforms like Zhihu, where users' real concern is not whether "cheap tokens are usable" but "who will handle the data."
Another approach is to switch to open-source or lower-cost alternatives. Some developers have shared tutorials demonstrating how to integrate models like DeepSeek into various development tools, bypassing official pricing and verification processes. While this may result in some loss of capabilities and increased privacy risks, it significantly improves cost control.
In developer discussions on Hacker News and Reddit, the hybrid approach is frequently mentioned: "Use the cheaper model by default, only switching to Fable in critical steps." This sounds like a reasonable resource optimization. Looking back at the discussions about "AI democratization" two or three years ago, the mainstream narrative was that everyone should have equal access to the best models. Now, "using the best models" has become something that requires careful calculation.
Local deployment offers another perspective. A GPU that can run large models smoothly is expensive, and a complete system capable of running 120B-parameter models is beyond the reach of most individual developers. The barrier to local deployment is another form of paywall, one that takes the form of hardware purchases instead of pay-as-you-go billing.
These detours are not "smart user money-saving tricks." When a large number of users actively seek alternatives, it's because the original paths are narrowing. Privacy risks at transit points, gaps in the capabilities of open-source models, and hardware investments for local deployments—every detour comes at a cost.
The steps are already underfoot
If AI is compared to public utilities such as water and electricity, then the first principle of a public utility is universal service and equitable access. The pricing trend for AI models is moving in the opposite direction. The more powerful the model, the higher the price; top-tier models are leaving the all-in-one subscriptions; and token-based billing makes costs unpredictable. An electricity supplier does not deliver "stronger current" to customers who pay more, yet AI vendors are doing just that.
This is not a debate about whether a model is "expensive or cheap." When top-tier models like Claude Fable 5 are removed from mass-market subscriptions, when the output price of GPT-5.5 Pro reaches 13 times that of Flash models, and when an individual developer can burn through $82 in a day, "unaffordability" is shifting from a price issue to a structural one.
In the use of AI tools, a clear hierarchy is emerging. At the top are enterprise users who can afford unlimited API calls and dedicated hardware; in the middle are individual developers who use top-tier models on a careful budget; and at the bottom are ordinary users who can only use free or low-cost models. The levels of this hierarchy are defined not by technical skill but by purchasing power and access to technology.
Anthropic's June 22 cutoff is just the latest step on this ladder. It is not the first, nor will it be the last.


