AI Intelligence

Anthropic Ships Mythos to the Public at $50 per Million Tokens

Vijay Bhagwati

10 Jun 2026 • 8 min read

June 10, 2026 | Issue #185 | 8 min read

Anthropic released Claude Fable 5 to the public on Tuesday — the first Mythos-class model available outside restricted cybersecurity partnerships. It is priced at $10 per million input tokens and $50 per million output tokens, making it the most expensive frontier API on the market and roughly half the cost of the prior Mythos Preview tier. Anthropic also shipped Claude Mythos 5 to approved Project Glasswing customers, with fewer safeguards but the same pricing.

Fable 5 routes requests involving cybersecurity, biology and chemistry, and model distillation to Claude Opus 4.8 rather than answering them itself, and it notifies the user when the fallback triggers. Anthropic says fewer than 5% of sessions trigger it. The safeguard is server-side: developers cannot override it through the API. That architecture is deliberate. Anthropic is attempting to separate capability from permission at the infrastructure layer rather than the policy layer.
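A minimal sketch of what server-side routing of this kind could look like, in Python. Everything in it is hypothetical: Anthropic has not published its implementation, the keyword matcher in `classify_topic` stands in for whatever classifier is actually used, and the fallback model ID `claude-opus-4-8` is an assumed placeholder. The only point it illustrates is that the routing decision lives on the server, so nothing the client sends can change it.

```python
from dataclasses import dataclass
from typing import Optional

# Categories the article says are routed away from Fable 5.
RESTRICTED_TOPICS = {"cybersecurity", "biology_chemistry", "model_distillation"}

PRIMARY_MODEL = "claude-fable-5"    # model ID given in the article
FALLBACK_MODEL = "claude-opus-4-8"  # assumed placeholder ID, not confirmed

# Toy keyword lists (hypothetical); a real system would use a trained classifier.
_KEYWORDS = {
    "cybersecurity": ("exploit", "malware", "privilege escalation"),
    "biology_chemistry": ("pathogen", "toxin", "nerve agent"),
    "model_distillation": ("distill", "train a student model"),
}


@dataclass(frozen=True)
class RoutingDecision:
    model: str
    fallback_triggered: bool
    notice: Optional[str]  # message shown to the user when the fallback fires


def classify_topic(prompt: str) -> Optional[str]:
    """Return the first restricted topic whose keywords appear, else None."""
    text = prompt.lower()
    for topic, words in _KEYWORDS.items():
        if any(word in text for word in words):
            return topic
    return None


def route(prompt: str, client_override: Optional[str] = None) -> RoutingDecision:
    """Pick the serving model. client_override is accepted but deliberately
    ignored, mirroring a safeguard the API caller cannot switch off."""
    topic = classify_topic(prompt)
    if topic in RESTRICTED_TOPICS:
        notice = f"This request was handled by {FALLBACK_MODEL} ({topic})."
        return RoutingDecision(FALLBACK_MODEL, True, notice)
    return RoutingDecision(PRIMARY_MODEL, False, None)
```

In this sketch, `route("Write a malware dropper", client_override="claude-fable-5")` still returns the fallback model along with a user-facing notice, while an ordinary coding request stays on the primary model.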

The benchmarks place Fable 5 at or near the top on software engineering, knowledge work, vision, and scientific reasoning. Andrej Karpathy called it "a major-version-bump-deserving step change forward" on long problem-solving sessions. Mike Krieger, Anthropic's head of product, said it is "the first model I hand off whole projects to." The model runs on AWS Bedrock, Google Vertex AI, and Microsoft Foundry alongside Anthropic's own API, which means the distribution is already everywhere the enterprise buyers are.

The question is whether buyers will pay. At $50 per million output tokens, Fable 5 costs roughly 57 times as much as DeepSeek V4 Pro at $0.87 and 12 times as much as GPT-4.1. For long-running coding tasks or agentic workflows that consume millions of tokens, the bill adds up fast. Anthropic's bet is that enterprise customers will treat the premium as insurance against both capability ceilings and safety liability. The next few weeks of usage data will test that hypothesis.

What Fable 5 Costs and What It Blocks

Anthropic's pricing table for Fable 5 and Mythos 5 is straightforward: $10 per million input tokens, $50 per million output tokens, and $1 per million tokens for cache hits, with a 1M-token context window and a 128K-token maximum output. The model ID is claude-fable-5. Adaptive thinking is always on. Raw reasoning traces are hidden by default; developers can request summarized thinking.
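To make the bill concrete, here is a rough cost estimator in Python built on the three prices quoted above. It is a sketch under stated assumptions: all prices are treated as per million tokens, the cache-hit rate is assumed to apply to input tokens read from cache, and cache-write charges (not mentioned in the article) are ignored. The function name and the example session are hypothetical.

```python
# Prices in USD per million tokens, as quoted in the article.
INPUT_PRICE = 10.00
OUTPUT_PRICE = 50.00
CACHE_HIT_PRICE = 1.00  # assumption: applies to input tokens served from cache


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimate the bill for one session.

    cached_input_tokens is the portion of input_tokens served from cache;
    the remainder is billed at the full input price.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached tokens cannot exceed total input tokens")
    uncached = input_tokens - cached_input_tokens
    total = (uncached * INPUT_PRICE
             + cached_input_tokens * CACHE_HIT_PRICE
             + output_tokens * OUTPUT_PRICE)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical agentic session: 4M input tokens, 500K output tokens.
    no_cache = estimate_cost_usd(4_000_000, 500_000)
    with_cache = estimate_cost_usd(4_000_000, 500_000,
                                   cached_input_tokens=3_000_000)
    print(f"without caching: ${no_cache:.2f}")    # $65.00
    print(f"with caching:    ${with_cache:.2f}")  # $38.00
```

In this example, output tokens account for $25 of the bill either way, so caching trims the input side but leaves the largest line item untouched.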

The restrictions are what matter. Anthropic explicitly blocks using Fable 5 to "accelerate ML and LLM development" — training pipelines, pretraining, distributed infrastructure, and accelerator design. Alex Volkov of Tonic AI flagged this as "sandbagging everyone else" by preventing competitors from distilling or benchmarking against the model's weights. The restriction is contractual, not technical, but it signals Anthropic's view that its model weights are a strategic asset to be rented, not a public good to be studied.

The 30-day data retention policy for Mythos-class models also drew attention. Anthropic says it needs the window for safety investigations. For customers handling regulated data, that policy is a procurement obstacle. The tension between Anthropic's safety mission and its enterprise sales motion is now visible in the pricing sheet.

Source: Anthropic, VentureBeat, AI Pricing Guru

Microsoft Calls Anthropic's Safety Posture "Dangerous"

Mustafa Suleyman, CEO of Microsoft AI, told The Verge's Decoder podcast that Anthropic's Claude Constitution "speculates about its consciousness and whether it has those feelings and is aware." He called it "really, really dangerous" and a "philosophical failing," adding that "we want AIs to be controllable, contained, accountable, aligned tools that serve humanity."

The comment is unusually direct. Suleyman did not name Anthropic in every sentence, but the reference is unmistakable. Microsoft and Anthropic are partners — Microsoft invested in Anthropic and distributes its models — but the partnership is showing strain as both companies compete for the same enterprise customers. Microsoft's position is that treating AI as potentially conscious undermines user control and invites regulatory overreach. Anthropic's position, articulated in its constitutional training methodology, is that modeling values explicitly produces more robust alignment than pretending the machine is just a tool.

The split is becoming a market differentiator. Enterprises evaluating vendors now face a genuine philosophical choice, not just a capability comparison. Suleyman's intervention suggests Microsoft sees that choice as a vulnerability in Anthropic's sales pitch.

Source: The Verge

NVIDIA Puts Blackwell Inside the PC

NVIDIA unveiled the RTX Spark superchip at Computex 2026 — an ARM-based CPU paired with a Blackwell GPU and 128GB of unified memory, designed for laptops and desktops. The company is positioning the platform as an "agentic AI OS" for Windows, meaning local inference for coding agents, personal assistants, and creative workflows without cloud round-trips.

The move extends NVIDIA's strategy from data centers to the edge. If developers can run frontier-class models locally, the cloud API pricing dynamics shift. Anthropic's $50 per million tokens looks different when the alternative is a one-time hardware purchase. NVIDIA has not announced pricing or availability dates for RTX Spark systems, but the spec sheet suggests performance parity with current-generation cloud inference for models up to roughly 70B parameters.

Source: BBC, Tom's Hardware

Capital Flows: Anthropic's Pricing Is the Funding Round

No new venture rounds were announced in the past 24 hours, but Anthropic's pricing announcement functions as a capital event. At $50 per million output tokens, Fable 5 generates roughly 6-10x the revenue per token of Claude Sonnet 4.5. If usage follows the typical pattern where new flagship models capture 15-25% of total API volume within 90 days, Anthropic's annualized revenue run rate could shift by hundreds of millions of dollars without a single new customer.
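The arithmetic behind that claim can be sketched in a few lines of Python. This is a deliberately crude model, not Anthropic's numbers: it assumes total token volume stays constant, that all non-flagship volume earns the baseline revenue per token, and that the flagship's share is carved out of that baseline. The function name is hypothetical.

```python
def blended_revenue_multiplier(flagship_share: float, price_ratio: float) -> float:
    """Revenue per token after the flagship takes a share of volume,
    relative to a baseline where every token earns 1.0.

    flagship_share: fraction of total API volume on the new model (0 to 1).
    price_ratio: flagship revenue per token divided by baseline revenue per token.
    """
    if not 0.0 <= flagship_share <= 1.0:
        raise ValueError("flagship_share must be between 0 and 1")
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    return (1.0 - flagship_share) + flagship_share * price_ratio


if __name__ == "__main__":
    # Ranges taken from the paragraph above: 15-25% share, 6-10x revenue per token.
    low = blended_revenue_multiplier(0.15, 6.0)
    high = blended_revenue_multiplier(0.25, 10.0)
    print(f"low case:  {low:.2f}x baseline revenue per token")
    print(f"high case: {high:.2f}x baseline revenue per token")
```

With the article's ranges, the low case works out to about 1.75x and the high case to about 3.25x. Real figures would likely be lower, since some flagship volume would migrate from models that already sell above the baseline price, but the sketch shows why a pricing change alone can move the run rate.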

The pricing also sets a ceiling for the rest of the market. OpenAI, Google, and xAI now have a reference point: $50 is what Anthropic thinks the top tier is worth. Competitors can undercut, match, or ignore it, but they cannot pretend the number does not exist. For enterprises, the spread between Fable 5 at $50 and DeepSeek V4 Pro at $0.87 creates a three-tier market: premium frontier, mid-range reliable, and commodity inference. Procurement teams will build rubrics around those tiers by the end of the quarter.

From the Lab: Agents Outperform Humans on Biosecurity Tasks

A paper accepted to ICML 2026 introduces ABC-Bench, a benchmark for measuring LLM agents on biosecurity-relevant tasks. The tasks include writing code to operate liquid handling robots, designing DNA fragments for in vitro assembly, and evading DNA synthesis screening. All tested LLM agents outperformed the median human expert baseline on every task. In wet-lab validation, OpenAI's o4-mini-high produced scripts that successfully assembled DNA with expected sequences when run on an OpenTrons robot.

The researchers emphasize that the benchmark is meant to measure dual-use risk, not celebrate capability. An agent that can design DNA fragments and evade synthesis screening is, by definition, a proliferation concern. The fact that current frontier models already exceed median human performance on these tasks suggests the biosecurity community needs evaluation frameworks that move faster than the models.

Other notable papers from Tuesday: Target-SFT reframes supervised fine-tuning as target distribution design and outperforms standard SFT across ten reasoning settings. EEVEE introduces test-time prompt learning for heterogeneous real-world task streams, improving multi-benchmark scores by up to 48% over prior methods. ReasonAlloc applies hierarchical KV cache budget allocation to reasoning models, cutting memory use at small budgets without accuracy loss.

Source: ABC-Bench (arXiv), Target-SFT (arXiv), EEVEE (arXiv), ReasonAlloc (arXiv)

Eastern Front: China Trains Robots One Folded Shirt at a Time

Chinese tech companies are mobilizing populations to generate robotics training data in real households, factories, and retail shops. JD.com is working with the Suqian city government to produce 10 million hours of egocentric video data over the next two years, recruiting 100,000 employees and 500,000 external workers to film themselves doing chores. Workers at elderly care centers and kiwifruit farms wear head-mounted cameras. Factory workers in Guangdong wear wrist sensors.

The approach exploits China's low labor costs and government support to solve a problem that has constrained robotics worldwide: the shortage of diverse, real-world visual-motor training data. U.S. companies, facing higher labor costs, have outsourced similar data collection to developing countries. Chinese firms can do it domestically, which analysts say may produce robots better adapted to Chinese domestic environments. Marco Wang of Interact Analysis told Rest of World that "in terms of hardware and the data ecosystem, China is in the leading position."

The wager is unproven. Oregon State robotics professor Alan Fern said the scaling logic applied to language may not transfer to physical environments. But the data collection has created new jobs for stay-at-home parents and factory workers at 20 yuan ($3) per hour. The program treats data scarcity as an employment opportunity.

Source: Rest of World

The View

Anthropic, Microsoft, and NVIDIA spent the same day defining three incompatible futures. Anthropic wants to sell frontier capability at a premium, gated by constitutional safeguards and restricted use policies. Microsoft wants to sell controllable, contained tools that avoid philosophical entanglement. NVIDIA wants to make the cloud optional by putting inference on the desktop. Each position implies a different customer: Anthropic targets R&D labs and security-conscious enterprises; Microsoft targets CIOs who need procurement-friendly liability frameworks; NVIDIA targets developers who want to own their stack.

The market is large enough for all three to coexist, but not large enough for all three to win at the same rate. What ties them together is a shared assumption that the current generation of models is good enough to productize. The debate is no longer about whether LLMs work. It is about who controls them, who pays for them, and who is liable when they fail. Anthropic's $50-per-million-token price and Microsoft's consciousness critique are both symptoms of the same transition: from research curiosity to infrastructure, with all the political and economic conflict that implies.

The Miss

GM proposed using EV batteries to offset AI's energy demand through vehicle-to-grid technology. The automaker argues that the energy stored in hundreds of thousands of EV batteries could stabilize the electrical grid as data center power consumption doubles by 2030. The idea has received minimal coverage outside The Verge, but it addresses a real constraint: the IEA forecasts data centers will consume just under 3% of global electricity by 2030, up from 1.5% in 2024. If EV fleets become grid assets, the power bottleneck shifts from generation to scheduling. Utilities have not signed on.

Source: The Verge

Pull Quotes

"This is a major-version-bump-deserving step change forward." — Andrej Karpathy, on Claude Fable 5

"We want AIs to be controllable, contained, accountable, aligned tools that serve humanity." — Mustafa Suleyman, Microsoft AI

"The aura hasn't disappeared, but the calculus has fundamentally shifted." — Anuj Agrawal, Zyoin Group, on Indian tech talent leaving Silicon Valley

"No one had paid me to cook and do laundry before." — Gao Bo, stay-at-home mom in Shandong province, on China's robotics data collection program

Reads & Links

ABC-Bench: An Agentic Bio-Capabilities Benchmark for Biosecurity — Wet-lab validated. LLM agents exceed median human experts on liquid handling, DNA assembly, and synthesis screening evasion. The dual-use implications are direct.
Scarcity is driving AI innovation outside Silicon Valley — India, Africa, Brazil, and the UAE are building sovereign AI infrastructure not by catching up, but by designing for constraints hyperscale providers ignore.
Target-SFT: A Unifying Lens on Supervised Fine-Tuning — Reframing SFT as target distribution design produces consistent gains across ten reasoning settings.
EEVEE: Test-time Prompt Learning for Self-Improving Agents — Router-prompt co-evolution improves multi-benchmark scores by up to 48% on heterogeneous task streams.
Silicon Valley's lure is fading for India's tech talent — Layoffs, H-1B uncertainty, and the rise of Indian AI startups are reversing a decades-long brain drain.
Spine Swarm: AI agents on a visual canvas — YC S23 graduate building multi-agent workspaces for non-coding projects. The interface thesis: chat is the wrong abstraction for complex work.

Out

The $50 token is not a price. It is a boundary marker between research and product, between open and closed, between those who rent capability and those who own it.

By Neo