back to articles
Luka Mrkić

Luka Mrkić

Head of BD

Claude Fable 5 Pricing Explained: $10/$50 per Million Tokens in Practice

Claude Fable 5 Pricing Explained: $10/$50 per Million Tokens in Practice

TL;DR

  • Claude Fable 5 is priced at $10 per million input tokens and $50 per million output tokens. The same prices apply to Mythos 5 on first-party and third-party surfaces.
  • Fable 5 is exactly twice the list price of Opus 4.8 standard mode ($5 / $25) and matches the list price of Opus 4.8 fast mode ($10 / $50). For latency-sensitive workloads on Opus 4.8 fast mode, the per-token cost is identical.
  • Adaptive thinking is always on for Fable 5 and is billed as output tokens at $50 per million. Effort is the only lever for cost and latency, so capping effort per workflow is the highest-leverage cost decision.
  • Three cost levers cut effective price below the headline: prompt caching at 10 percent of input price on cached reads, the Batch API at 50 percent off both sides for offline workloads, and a workflow-keyed router that keeps cheap calls on Opus 4.8 or Sonnet 4.6.
  • Subscription plans include Fable 5 at no extra cost from June 9 through June 22, 2026. From June 23, subscription usage requires credits until Anthropic restores the model as a standard plan inclusion.

If you are evaluating who should build a cost-managed Fable 5 rollout for your team, this guide gives you both the technical blueprint and the standards to evaluate the work.

What Claude Fable 5 actually costs

Anthropic priced Claude Fable 5 at $10 per million input tokens and $50 per million output tokens. The Mythos 5 SKU, which is the same underlying model with cyber safeguards lifted for Project Glasswing partners, ships at the same list price. Anthropic describes the launch price as less than half the price of Claude Mythos Preview, the April model that introduced the Mythos class to a small group of trusted partners.

Pricing applies uniformly across the Claude API, the Claude Platform on AWS, Amazon Bedrock, Vertex AI, and Microsoft Foundry. The model ID for developers is claude-fable-5. The context window is 1 million tokens with up to 128k output tokens per request, and the per-token price does not change across that window.

Two structural facts shape every bill on Fable 5. First, adaptive thinking is always on and cannot be disabled, and thinking tokens are billed as output tokens. The only lever for cost and latency on a Fable call is the effort setting (low, medium, high). Second, Fable 5 and Mythos 5 are Covered Models with mandatory 30-day data retention. Workflows that previously ran with zero data retention need to stay on Opus 4.8 or earlier classes, which changes the cost profile for some regulated stacks.

Why Fable 5 pricing needs its own playbook

Most teams running Claude in production today price their stack from a single SKU. A spreadsheet of input and output cost per workflow on Opus 4.8 or Sonnet 4.6 covers the whole bill. Fable 5 changes that on three axes at once: the price doubles, thinking is now an unstoppable cost line, and the fallback to Opus 4.8 on about 5 percent of sessions means a Fable workflow’s effective cost is a blend of two SKUs.

A real Fable 5 cost model has three terms. Input cost plus thinking cost plus output cost, with a 5 percent share allocated to Opus 4.8 fallback on workflows that touch the classifier categories. Without that shape, the bill at the end of the month does not match the spreadsheet.

If you are interested in building AI agents and automation like this for your team, book a call here.

Espressio diagram

How $10 input and $50 output translate to per-call cost

A useful baseline for everyday agent work is 10k input tokens and 4k output tokens per call. On Fable 5 list price, that call costs 10 cents on input and 20 cents on output, for a per-call cost of 30 cents. The same shape on Opus 4.8 standard mode lists for 15 cents, half of Fable. The same shape on Sonnet 4.6 lists for 9 cents, less than a third of Fable.

Long-context calls move the math toward input. A workflow that loads a 200k-token codebase plus 5k tokens of instructions and emits 8k tokens of output costs $2.05 on input and 40 cents on output, for $2.45 per call. The output share drops below 20 percent. Routing decisions that target long-context work need to weight input pricing more heavily than headline output pricing suggests.

Long-horizon agent runs flip the math. A multi-hour autonomous run that emits 200k output tokens (including thinking) over the course of the session, with 30k input tokens of context loaded along the way, costs 30 cents on input and $10 on output, for $10.30 per session. This is where Fable’s price stings if the workflow does not need the model’s long-horizon strength, and where it pays off if it does.

Pricing compared: Fable 5, Opus 4.8, Sonnet 4.6, Haiku 4.5

Claude Fable 5 is priced at $10 / $50 per million tokens. Claude Opus 4.8 is priced at $5 / $25 in standard mode and $10 / $50 in fast mode, which runs at 2.5 times the speed. Claude Sonnet 4.6 is priced at $3 / $15. Claude Haiku 4.5 is priced at $1 / $5. The four-tier pricing surface gives a team running all four SKUs a 10x cost range from Haiku to Fable on output.

Two comparisons matter for routing. Fable 5 versus Opus 4.8 fast mode is the same per-token price. The decision between them is about capability, and the right call depends on whether the workflow needs Fable’s long-horizon lead or whether Opus 4.8 fast mode covers it. Fable 5 versus Opus 4.8 standard mode is exactly twice the price, and the decision is whether the lift on long, complex work justifies that delta.

Fable 5 versus Sonnet 4.6 is a more than 3x price gap on both input and output. The right routing rule is to default most workloads to Sonnet 4.6 or Opus 4.8 and only promote a workflow to Fable after a side-by-side eval shows a clear capability lift that is worth the spend.

The three cost levers that change effective price

Three Anthropic features cut the effective per-call price below the headline. Prompt caching is the first lever. Cached read tokens are billed at 10 percent of the input price. A stable system prompt with tool definitions and few-shot examples that spans 5k tokens, reused across a session of 100 calls, costs 5 cents on the first call and half a cent on each of the next 99 instead of $5 across all 100. Caching is the single most impactful lever for chat surfaces and agent loops.

The Batch API is the second lever. Eligible offline workloads run through the Message Batches API at 50 percent off both input and output. Overnight summarization runs, weekly reporting jobs, large-scale enrichment, and SEO content generation are typical batch wins. Anthropic ships Batch on Fable 5 from launch.

The third lever is workflow-keyed routing. The cheapest dollar saved on Fable 5 is the one that never hits Fable. A router that classifies incoming work and sends the long tail of complex-but-routine tasks to Opus 4.8 or Sonnet 4.6 keeps Fable’s spend concentrated on the workloads where the lift justifies it. The combination of caching, batching, and routing routinely reduces effective Fable spend by 40 to 70 percent versus naive list-price math.

Espressio diagram

Subscription pricing and the June 9 to June 22 window

Anthropic published a phased rollout for Fable 5 on subscription plans. From June 9 through June 22, 2026, Fable 5 is included at no extra cost on Pro, Max, Team, and seat-based Enterprise plans. The included window exists to spread demand under a load Anthropic explicitly calls difficult to predict.

On June 23, Fable 5 leaves the included subscription scope. Subscription users who want to keep calling Fable need to add usage credits. Anthropic states that the window may be extended if capacity allows, and that the long-term intent is to restore Fable as a standard subscription inclusion once capacity permits. The transitional period is the right time for teams to instrument cost per workflow on Fable so the move to credit-backed usage is a calm migration.

On the Claude API and consumption-based Enterprise plans, Fable 5 is fully available from June 9 at the published $10 / $50 list price. The rollout window only affects subscription plans.

How adaptive thinking changes the cost equation

Adaptive thinking is always on for Fable 5 and cannot be turned off. Thinking tokens are billed as output tokens at $50 per million. The amount of thinking a request produces scales with the effort setting (low, medium, high) and with the complexity of the task. A low-effort question can finish with negligible thinking. A high-effort planning step on a long-horizon agent run can emit 20k to 50k thinking tokens before the final answer, and those tokens land in the bill at the output rate.

Two operational decisions follow. First, cap effort per workflow. A chat surface that defaults to high effort because nobody set the parameter ships a 3x to 5x bill compared to the same surface at medium. Second, set thinking display to summarized for human-facing surfaces. Fable 5 never returns raw chain of thought; summarized thinking is the readable form. Pass thinking blocks back unchanged in multi-turn conversations on the same model so the model can build on its prior reasoning.

How adaptive thinking changes the cost equation

Common mistakes when modeling Fable 5 cost

  • Comparing Fable 5 list price to Opus 4.8 standard mode list price for a latency-sensitive workflow. The right comparator is Opus 4.8 fast mode, which lists for $10 / $50 and runs 2.5 times faster than standard.
  • Ignoring thinking cost. Adaptive thinking is always on and billed as output. A workflow with maximum effort can spend 3x to 5x more than the same workflow at low effort.
  • Modeling 100 percent of Fable calls as Fable cost. Around 5 percent of sessions fall back to Opus 4.8 on the classifier categories, so the effective cost is a blend of the two SKUs on those workflows.
  • Skipping prompt caching for high-frequency surfaces. Cached reads at 10 percent of input price are the single most impactful lever for chat and agent loops with a stable system prompt.
  • Skipping the Batch API for offline workloads. A 50 percent discount on both input and output is available out of the box for eligible workloads on the Message Batches API.
  • Routing everything to Fable 5 because it is the strongest model. The dominant failure mode in early rollouts is overspending on cheap calls that Opus 4.8 or Sonnet 4.6 would have handled. Default to the cheaper SKU and promote on eval evidence.
  • Forgetting the June 23 subscription cutover. Pro, Max, Team, and seat-based Enterprise plans lose included Fable 5 access on June 23 and switch to credit-backed usage.

How to know your Fable 5 spend is working

Four metrics belong on a dashboard the day Fable 5 joins your stack. Per-workflow blended cost split by SKU tells you where Fable is paying for itself and where it is overspending. Cached read ratio tells you whether your high-frequency surfaces are caching system prompts as expected. Effort distribution per workflow tells you whether any surface is defaulting to high effort and burning thinking tokens. Fallback rate to Opus 4.8 tells you whether the classifier categories are firing where you expect or whether a prompt change has nudged them.

Pair these with a monthly cost review. Compare the blended per-workflow cost on Fable 5 against the alternative SKUs (Opus 4.8 standard, Opus 4.8 fast, Sonnet 4.6) on the same eval set. If a workflow on Fable is producing eval scores within five percent of Opus 4.8 on the same task, demote it back to Opus and pocket the savings. If a workflow on Opus is borderline on quality, run the side-by-side and promote to Fable if the lift is worth the spend.

A decision rubric for the $10 / $50 price

A useful rubric for whether a workflow earns the Fable 5 list price: route to Fable 5 when the task would be assigned to a senior engineer, a senior analyst, or a research scientist on your team. Route to Opus 4.8 standard mode when the task is the kind of work an intermediate teammate could complete in under thirty minutes. Route to Opus 4.8 fast mode when latency is the deciding constraint and the cost matches Fable. Route to Sonnet 4.6 when volume and latency dominate. Route to Haiku 4.5 for high-volume product features that fit in a tight context.

The token cost of getting the routing wrong on the cheap side is small. The time cost of an under-powered model on a long-horizon task is large. Default the router to the cheapest SKU that passes your eval, and only promote a workflow to Fable after a side-by-side run shows the lift is worth twice the per-token cost.

FAQ

How much does Claude Fable 5 cost per million tokens?

Claude Fable 5 is priced at $10 per million input tokens and $50 per million output tokens. The same list price applies to Mythos 5 on first-party and third-party surfaces.

Is Fable 5 more expensive than Claude Opus 4.8?

Fable 5 is exactly twice the list price of Opus 4.8 standard mode ($5 / $25) and matches Opus 4.8 fast mode ($10 / $50), which runs 2.5 times faster than standard. The right comparator depends on whether the workflow runs on standard or fast Opus.

How does Fable 5 compare to Sonnet 4.6 and Haiku 4.5 on price?

Sonnet 4.6 is $3 / $15 per million tokens and Haiku 4.5 is $1 / $5 per million tokens. Fable 5 is roughly 3.3x the price of Sonnet 4.6 and 10x the price of Haiku 4.5 on output, with similar multiples on input.

Are thinking tokens billed as input or output on Fable 5?

Thinking tokens are billed as output tokens at $50 per million. Adaptive thinking is always on for Fable 5 and cannot be disabled, so effort is the only lever for thinking spend.

Does Fable 5 support prompt caching?

Yes. Cached read tokens are billed at 10 percent of the input price, with the same five-minute lifetime as previous Claude models. Caching is the highest-impact cost lever for chat surfaces and agent loops with a stable system prompt.

Is the Batch API available for Fable 5?

Yes. Eligible offline workloads run through the Message Batches API at 50 percent off both input and output. Overnight summarization, weekly reporting, large-scale enrichment, and SEO content generation are typical batch wins.

What happens to Fable 5 on subscription plans after June 22, 2026?

From June 9 through June 22, Fable 5 is included at no extra cost on Pro, Max, Team, and seat-based Enterprise plans. From June 23, subscription usage requires credits until Anthropic restores Fable as a standard plan inclusion. The Claude API and consumption-based Enterprise plans are not affected by the window.

Can I use Fable 5 with zero data retention?

No. Fable 5 and Mythos 5 are Covered Models with mandatory 30-day data retention. Workflows that require zero data retention need to stay on Opus 4.8 or earlier classes.

Does fallback to Opus 4.8 affect billing?

When Fable 5’s classifiers fall back to Opus 4.8 on cyber, biology and chemistry, or distillation queries, the response is generated by Opus 4.8 and billed at the Opus 4.8 rate. Around 5 percent of Fable sessions fall back today. Instrument the path so per-workflow cost is attributed correctly.

What to do next

  • Build a per-workflow cost model with three lines: input at $10 per 1M, thinking and output at $50 per 1M, and a 5 percent fallback share at Opus 4.8 rates on classifier-touching workflows.
  • Cap effort per workflow before sending the first production traffic to Fable 5. Defaulting to high effort burns thinking tokens on workloads that do not need it.
  • Enable prompt caching on every surface with a stable system prompt longer than 1k tokens. The 90 percent discount on cached reads pays for itself in the first week.
  • Route eligible offline workloads through the Message Batches API for the 50 percent discount on both sides.
  • Set a calendar reminder for June 22, 2026 to review subscription usage before Fable 5 moves to credit-backed access on June 23.
  • Stand up the four-metric dashboard (blended cost by SKU, cached read ratio, effort distribution, fallback rate) before broader rollout.

If you want a cost-managed Fable 5 rollout designed and shipped cleanly inside your AI engineering org with routing, caching, batching, effort caps, fallback handling, and observability built in, let’s talk.