Luka Mrkić
Head of BD
JUN 11, 2026
If you are evaluating who should build a cost-managed Fable 5 rollout for your team, this guide gives you both the technical blueprint and the standards to evaluate the work.
Anthropic priced Claude Fable 5 at $10 per million input tokens and $50 per million output tokens. The Mythos 5 SKU, which is the same underlying model with cyber safeguards lifted for Project Glasswing partners, ships at the same list price. Anthropic describes the launch price as less than half the price of Claude Mythos Preview, the April model that introduced the Mythos class to a small group of trusted partners.
Pricing applies uniformly across the Claude API, the Claude Platform on AWS, Amazon Bedrock, Vertex AI, and Microsoft Foundry. The model ID for developers is claude-fable-5. The context window is 1 million tokens with up to 128k output tokens per request, and the per-token price does not change across that window.
Two structural facts shape every bill on Fable 5. First, adaptive thinking is always on and cannot be disabled, and thinking tokens are billed as output tokens. The only lever for thinking spend and latency on a Fable call is the effort setting (low, medium, high). Second, Fable 5 and Mythos 5 are Covered Models with mandatory 30-day data retention. Workflows that previously ran with zero data retention need to stay on Opus 4.8 or earlier classes, which changes the cost profile for some regulated stacks.
Most teams running Claude in production today price their stack from a single SKU. A spreadsheet of input and output cost per workflow on Opus 4.8 or Sonnet 4.6 covers the whole bill. Fable 5 changes that on three axes at once: the list price doubles relative to Opus 4.8 standard mode, thinking becomes a cost line that cannot be switched off, and the fallback to Opus 4.8 on about 5 percent of sessions means a Fable workflow’s effective cost is a blend of two SKUs.
A real Fable 5 cost model has three terms: input cost, thinking cost, and output cost. On workflows that touch the classifier categories, about 5 percent of sessions fall back to Opus 4.8, so that share is priced at the Opus rate. Without that shape, the bill at the end of the month does not match the spreadsheet.
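The three-term model with a fallback blend can be sketched in a few lines. This is a minimal illustration, not a billing implementation: the function names are hypothetical, the token counts in the example are invented, and it assumes a fallback call consumes the same token counts as the Fable call it replaces.

```python
# List prices in USD per million tokens, as quoted in this article.
FABLE_5 = {"input": 10.0, "output": 50.0}
OPUS_4_8_STANDARD = {"input": 5.0, "output": 25.0}


def call_cost(price, input_tokens, thinking_tokens, output_tokens):
    """Input + thinking + output, with thinking billed at the output rate."""
    billed_output = thinking_tokens + output_tokens
    return (
        input_tokens * price["input"] + billed_output * price["output"]
    ) / 1_000_000


def blended_cost(input_tokens, thinking_tokens, output_tokens, fallback_share=0.05):
    """Expected cost per call when a share of traffic falls back to Opus 4.8.

    Assumption: the fallback call uses the same token counts as the Fable call.
    """
    fable = call_cost(FABLE_5, input_tokens, thinking_tokens, output_tokens)
    opus = call_cost(OPUS_4_8_STANDARD, input_tokens, thinking_tokens, output_tokens)
    return (1 - fallback_share) * fable + fallback_share * opus


# Hypothetical call: 10k input, 1k thinking, 3k visible output.
print(f"${blended_cost(10_000, 1_000, 3_000):.4f} per call")
```

Setting `fallback_share=0.0` recovers the plain Fable list-price figure, which makes it easy to see how much the blend moves a given workflow.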

A useful baseline for everyday agent work is 10k input tokens and 4k output tokens per call. On Fable 5 list price, that call costs 10 cents on input and 20 cents on output, for a per-call cost of 30 cents. The same shape on Opus 4.8 standard mode lists for 15 cents, half of Fable. The same shape on Sonnet 4.6 lists for 9 cents, less than a third of Fable.
Long-context calls move the math toward input. A workflow that loads a 200k-token codebase plus 5k tokens of instructions and emits 8k tokens of output costs $2.05 on input and 40 cents on output, for $2.45 per call. The output share drops below 20 percent. Routing decisions that target long-context work need to weight input pricing more heavily than headline output pricing suggests.
Long-horizon agent runs flip the math. A multi-hour autonomous run that emits 200k output tokens (including thinking) over the course of the session, with 30k input tokens of context loaded along the way, costs 30 cents on input and $10 on output, for $10.30 per session. This is where Fable’s price stings if the workflow does not need the model’s long-horizon strength, and where it pays off if it does.
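The three call shapes above are easy to reproduce and compare across SKUs. The sketch below uses the list prices quoted in this article; the dictionary keys and function names are illustrative labels, not official identifiers.

```python
# (input, output) list prices in USD per million tokens, from this article.
PRICES = {
    "fable-5": (10.0, 50.0),
    "opus-4.8-standard": (5.0, 25.0),
    "sonnet-4.6": (3.0, 15.0),
}

# (input_tokens, output_tokens) for the three shapes discussed above.
SHAPES = {
    "everyday agent call": (10_000, 4_000),
    "long-context call": (205_000, 8_000),
    "long-horizon run": (30_000, 200_000),
}


def call_cost(sku, input_tokens, output_tokens):
    """List-price cost of one call in USD."""
    input_price, output_price = PRICES[sku]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def output_share(sku, input_tokens, output_tokens):
    """Fraction of the call cost that comes from output tokens."""
    output_cost = output_tokens * PRICES[sku][1] / 1_000_000
    return output_cost / call_cost(sku, input_tokens, output_tokens)


for shape, (inp, out) in SHAPES.items():
    for sku in PRICES:
        print(f"{shape:20s} {sku:18s} ${call_cost(sku, inp, out):6.2f}")
```

Because all three SKUs here share the same 1:5 input-to-output price ratio, the output share of a given shape is identical across them; only the absolute cost changes.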
Claude Fable 5 is priced at $10 / $50 per million tokens. Claude Opus 4.8 is priced at $5 / $25 in standard mode and $10 / $50 in fast mode, which runs at 2.5 times the speed. Claude Sonnet 4.6 is priced at $3 / $15. Claude Haiku 4.5 is priced at $1 / $5. The four-tier pricing surface gives a team running all four SKUs a 10x cost range from Haiku to Fable on output.
Two comparisons matter for routing. Fable 5 versus Opus 4.8 fast mode is the same per-token price. The decision between them is about capability, and the right call depends on whether the workflow needs Fable’s long-horizon lead or whether Opus 4.8 fast mode covers it. Fable 5 versus Opus 4.8 standard mode is exactly twice the price, and the decision is whether the lift on long, complex work justifies that delta.
Fable 5 versus Sonnet 4.6 is a more than 3x price gap on both input and output. The right routing rule is to default most workloads to Sonnet 4.6 or Opus 4.8 and only promote a workflow to Fable after a side-by-side eval shows a clear capability lift that is worth the spend.
Three Anthropic features cut the effective per-call price below the headline. Prompt caching is the first lever. Cached read tokens are billed at 10 percent of the input price. A stable system prompt with tool definitions and few-shot examples that spans 5k tokens, reused across a session of 100 calls, costs 5 cents on the first call and half a cent on each of the next 99, about 55 cents in total, instead of $5 across all 100 calls without caching. Caching is the single most impactful lever for chat surfaces and agent loops.
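The caching arithmetic above can be checked with a short sketch. It is deliberately simplified: the first call is charged at the full input price and every later call at the cached-read rate, so it ignores any cache-write surcharge or cache expiry within the session. The function name and parameters are hypothetical.

```python
def prompt_cost(prompt_tokens, calls, cached,
                input_price_per_m=10.0, cached_read_rate=0.10):
    """Cost in USD of resending a stable prompt prefix across a session.

    Simplification: first call at full price, later calls at the cached-read
    rate; no cache-write surcharge, no expiry mid-session.
    """
    full = prompt_tokens * input_price_per_m / 1_000_000
    if not cached or calls == 0:
        return full * calls
    return full + (calls - 1) * full * cached_read_rate


uncached = prompt_cost(5_000, 100, cached=False)
cached = prompt_cost(5_000, 100, cached=True)
print(f"uncached ${uncached:.3f}, cached ${cached:.3f}")
```

This covers only the reused prefix; per-call user input and all output tokens are billed on top at the usual rates.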
The Batch API is the second lever. Eligible offline workloads run through the Message Batches API at 50 percent off both input and output. Overnight summarization runs, weekly reporting jobs, large-scale enrichment, and SEO content generation are typical batch wins. Anthropic ships Batch on Fable 5 from launch.
The third lever is workflow-keyed routing. The cheapest dollar saved on Fable 5 is the one that never hits Fable. A router that classifies incoming work and sends the long tail of complex-but-routine tasks to Opus 4.8 or Sonnet 4.6 keeps Fable’s spend concentrated on the workloads where the lift justifies it. The combination of caching, batching, and routing routinely reduces effective Fable spend by 40 to 70 percent versus naive list-price math.
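To get a feel for how routing and batching stack, the sketch below starts from a naive all-Fable list-price spend and applies the two levers in turn. Every input in the example is hypothetical, caching is left out to keep the arithmetic visible, and the result is an illustration of the mechanics rather than a forecast of savings.

```python
def effective_spend(naive_spend, routed_share, routed_price_ratio,
                    batch_share, batch_discount=0.50):
    """Estimate spend after routing and batching.

    naive_spend:        spend if all traffic ran on Fable at list price
    routed_share:       fraction of that spend moved to a cheaper SKU
    routed_price_ratio: cheaper SKU price as a fraction of Fable's
                        (0.5 for Opus 4.8 standard, per the list prices)
    batch_share:        fraction of the remaining Fable spend sent via Batch
    """
    routed = naive_spend * routed_share * routed_price_ratio
    on_fable = naive_spend * (1 - routed_share)
    batched = on_fable * batch_share * (1 - batch_discount)
    realtime = on_fable * (1 - batch_share)
    return routed + batched + realtime


# Hypothetical month: $1,000 naive spend, 40% routed to Opus 4.8 standard,
# 25% of what stays on Fable runs through the Batch API.
spend = effective_spend(1_000, routed_share=0.4, routed_price_ratio=0.5,
                        batch_share=0.25)
print(f"${spend:.2f} effective, {1 - spend / 1_000:.1%} below naive")
```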

Anthropic published a phased rollout for Fable 5 on subscription plans. From June 9 through June 22, 2026, Fable 5 is included at no extra cost on Pro, Max, Team, and seat-based Enterprise plans. The included window exists to spread demand under a load Anthropic explicitly calls difficult to predict.
On June 23, Fable 5 leaves the included subscription scope. Subscription users who want to keep calling Fable need to add usage credits. Anthropic states that the window may be extended if capacity allows, and that the long-term intent is to restore Fable as a standard subscription inclusion once capacity permits. The transitional period is the right time for teams to instrument cost per workflow on Fable so the move to credit-backed usage is a calm migration.
On the Claude API and consumption-based Enterprise plans, Fable 5 is fully available from June 9 at the published $10 / $50 list price. The rollout window only affects subscription plans.
Adaptive thinking is always on for Fable 5 and cannot be turned off. Thinking tokens are billed as output tokens at $50 per million. The amount of thinking a request produces scales with the effort setting (low, medium, high) and with the complexity of the task. A low-effort question can finish with negligible thinking. A high-effort planning step on a long-horizon agent run can emit 20k to 50k thinking tokens before the final answer, and those tokens land in the bill at the output rate.
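A quick way to see what that thinking range means in dollars is to price the output side of a single step. The helper name and the 2k-token answer length are assumptions for illustration.

```python
OUTPUT_PRICE_PER_M = 50.0  # USD per million output tokens (Fable 5 list price)


def output_side_cost(thinking_tokens, answer_tokens):
    """Thinking and answer tokens are both billed at the output rate."""
    return (thinking_tokens + answer_tokens) * OUTPUT_PRICE_PER_M / 1_000_000


# Hypothetical planning step with a 2k-token final answer.
for thinking in (0, 20_000, 50_000):
    print(f"{thinking:>6} thinking tokens -> ${output_side_cost(thinking, 2_000):.2f}")
```

At 20k to 50k thinking tokens, the thinking alone costs $1.00 to $2.50 per step, well above the cost of the short answer it produces.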
Two operational decisions follow. First, cap effort per workflow. A chat surface that defaults to high effort because nobody set the parameter ships a 3x to 5x bill compared to the same surface at medium. Second, set thinking display to summarized for human-facing surfaces. Fable 5 never returns raw chain of thought; summarized thinking is the readable form. Pass thinking blocks back unchanged in multi-turn conversations on the same model so the model can build on its prior reasoning.
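One way to make the effort cap explicit is a small lookup that every caller goes through before building a request. This is a sketch of the policy only: the workflow names, the default cap, and the `resolve_effort` helper are hypothetical, and how the resolved value is passed to the API is left to your client code.

```python
EFFORT_ORDER = ["low", "medium", "high"]

# Hypothetical per-workflow caps; tune these from your own evals.
EFFORT_CAPS = {
    "faq_bot": "low",
    "support_chat": "medium",
    "code_review": "high",
}
DEFAULT_CAP = "medium"  # assumption: unknown workflows never default to high


def resolve_effort(workflow, requested=None):
    """Return the effort to use, never exceeding the workflow's cap."""
    cap = EFFORT_CAPS.get(workflow, DEFAULT_CAP)
    if requested is None:
        return cap
    if requested not in EFFORT_ORDER:
        raise ValueError(f"unknown effort level: {requested!r}")
    # Pick whichever of (requested, cap) sits lower in EFFORT_ORDER.
    return min(requested, cap, key=EFFORT_ORDER.index)


print(resolve_effort("support_chat", "high"))  # capped down to the workflow limit
```

Routing every request through one function means a surface cannot silently run at high effort just because nobody set the parameter.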

Four metrics belong on a dashboard the day Fable 5 joins your stack. Per-workflow blended cost split by SKU tells you where Fable is paying for itself and where it is overspending. Cached read ratio tells you whether your high-frequency surfaces are caching system prompts as expected. Effort distribution per workflow tells you whether any surface is defaulting to high effort and burning thinking tokens. Fallback rate to Opus 4.8 tells you whether the classifier categories are firing where you expect or whether a prompt change has nudged them.
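The four metrics can be derived from per-call log records. The sketch below assumes a hypothetical `CallRecord` shape in which `input_tokens` includes cached reads and the log stores both the requested and the serving SKU. It computes fallback rate per call rather than per session, which is a simplification.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class CallRecord:  # hypothetical log schema
    workflow: str
    requested_sku: str
    served_sku: str       # differs from requested_sku on fallback
    effort: str
    input_tokens: int     # assumed to include cached reads
    cached_read_tokens: int
    cost_usd: float


def summarize(records):
    """Return the four dashboard metrics per workflow."""
    by_workflow = defaultdict(list)
    for record in records:
        by_workflow[record.workflow].append(record)

    summary = {}
    for workflow, calls in by_workflow.items():
        cost_by_sku = defaultdict(float)
        for call in calls:
            cost_by_sku[call.served_sku] += call.cost_usd
        efforts = Counter(call.effort for call in calls)
        total_input = sum(call.input_tokens for call in calls)
        cached = sum(call.cached_read_tokens for call in calls)
        fable_calls = [c for c in calls if c.requested_sku == "fable-5"]
        fallbacks = [c for c in fable_calls if c.served_sku != "fable-5"]
        summary[workflow] = {
            "cost_by_sku": dict(cost_by_sku),
            "cached_read_ratio": cached / total_input if total_input else 0.0,
            "effort_distribution": {e: n / len(calls) for e, n in efforts.items()},
            "fallback_rate": len(fallbacks) / len(fable_calls) if fable_calls else 0.0,
        }
    return summary
```

Keying cost on the serving SKU rather than the requested one is what lets fallback traffic show up under Opus 4.8 instead of being misattributed to Fable.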
Pair these with a monthly cost review. Compare the blended per-workflow cost on Fable 5 against the alternative SKUs (Opus 4.8 standard, Opus 4.8 fast, Sonnet 4.6) on the same eval set. If a workflow on Fable is producing eval scores within five percent of Opus 4.8 on the same task, demote it back to Opus and pocket the savings. If a workflow on Opus is borderline on quality, run the side-by-side and promote to Fable if the lift is worth the spend.
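The demotion rule can be written as a one-line check. The sketch reads "within five percent" as a relative gap measured against the Fable score, which is one reasonable interpretation; the function name and the sample scores are hypothetical.

```python
def should_demote(fable_score, opus_score, tolerance=0.05):
    """True when Opus 4.8 lands within `tolerance` (relative) of Fable 5.

    An Opus score above the Fable score also returns True.
    """
    if fable_score <= 0:
        raise ValueError("fable_score must be positive")
    relative_gap = (fable_score - opus_score) / fable_score
    return relative_gap <= tolerance


print(should_demote(0.90, 0.87))  # gap of about 3.3%
print(should_demote(0.90, 0.80))  # gap of about 11.1%
```

Run the check on the same eval set for both SKUs, and treat results near the threshold as a prompt for a larger eval rather than an automatic switch.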
A useful rubric for whether a workflow earns the Fable 5 list price: route to Fable 5 when the task would be assigned to a senior engineer, a senior analyst, or a research scientist on your team. Route to Opus 4.8 standard mode when the task is the kind of work an intermediate teammate could complete in under thirty minutes. Route to Opus 4.8 fast mode when latency is the deciding constraint, bearing in mind that its per-token cost matches Fable. Route to Sonnet 4.6 when volume and latency dominate. Route to Haiku 4.5 for high-volume product features that fit in a tight context.
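The rubric translates into a small rule-based router. The rubric itself does not say which rule wins when several apply, so the precedence below is an assumption, as are the `Task` fields, the SKU labels, and the Sonnet default for anything unclassified.

```python
from dataclasses import dataclass


@dataclass
class Task:  # hypothetical task descriptor
    level: str = "routine"        # "senior", "intermediate", or "routine"
    latency_critical: bool = False
    high_volume: bool = False
    tight_context: bool = False


def route(task: Task) -> str:
    """Map a task to a SKU label; rule order is an assumed precedence."""
    if task.high_volume and task.tight_context:
        return "haiku-4.5"
    if task.high_volume and task.latency_critical:
        return "sonnet-4.6"
    if task.level == "senior":
        return "fable-5"
    if task.latency_critical:
        return "opus-4.8-fast"
    if task.level == "intermediate":
        return "opus-4.8-standard"
    return "sonnet-4.6"  # assumed default for unclassified work


print(route(Task(level="senior")))
print(route(Task(level="intermediate", latency_critical=True)))
```

In practice the `level` label would come from a classifier or a per-workflow setting, and each route should be validated against your own eval set before it carries production traffic.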
The token cost of getting the routing wrong on the cheap side is small. The time cost of an under-powered model on a long-horizon task is large. Default the router to the cheapest SKU that passes your eval, and only promote a workflow to Fable after a side-by-side run shows the lift is worth twice the per-token cost.
Claude Fable 5 is priced at $10 per million input tokens and $50 per million output tokens. The same list price applies to Mythos 5 on first-party and third-party surfaces.
Fable 5 is exactly twice the list price of Opus 4.8 standard mode ($5 / $25) and matches Opus 4.8 fast mode ($10 / $50), which runs 2.5 times faster than standard. The right comparator depends on whether the workflow runs on standard or fast Opus.
Sonnet 4.6 is $3 / $15 per million tokens and Haiku 4.5 is $1 / $5 per million tokens. Fable 5 is roughly 3.3x the price of Sonnet 4.6 and 10x the price of Haiku 4.5 on output, with similar multiples on input.
Thinking tokens are billed as output tokens at $50 per million. Adaptive thinking is always on for Fable 5 and cannot be disabled, so effort is the only lever for thinking spend.
Yes, prompt caching is available on Fable 5. Cached read tokens are billed at 10 percent of the input price, with the same five-minute lifetime as previous Claude models. Caching is the highest-impact cost lever for chat surfaces and agent loops with a stable system prompt.
Yes, the Batch API is available on Fable 5. Eligible offline workloads run through the Message Batches API at 50 percent off both input and output. Overnight summarization, weekly reporting, large-scale enrichment, and SEO content generation are typical batch wins.
From June 9 through June 22, Fable 5 is included at no extra cost on Pro, Max, Team, and seat-based Enterprise plans. From June 23, subscription usage requires credits until Anthropic restores Fable as a standard plan inclusion. The Claude API and consumption-based Enterprise plans are not affected by the window.
No, zero data retention is not available on Fable 5. Fable 5 and Mythos 5 are Covered Models with mandatory 30-day data retention. Workflows that require zero data retention need to stay on Opus 4.8 or earlier classes.
When Fable 5’s classifiers fall back to Opus 4.8 on cyber, biology and chemistry, or distillation queries, the response is generated by Opus 4.8 and billed at the Opus 4.8 rate. Around 5 percent of Fable sessions fall back today. Instrument the path so per-workflow cost is attributed correctly.
If you want a cost-managed Fable 5 rollout designed and shipped cleanly inside your AI engineering org with routing, caching, batching, effort caps, fallback handling, and observability built in, let’s talk.