Claude Mythos Preview

Claude Mythos Preview, codenamed Capybara, is a large language model by Anthropic, the first of the Mythos-class models. It was announced on April 7, 2026 with access limited to members of Project Glasswing.

Internal usage of Mythos Preview began as early as late February 2026, per a “Claude Code session success rate” graph published by Anthropic.¹ In March 2026, rumors about the existence of the model began to circulate online.

On June 9, 2026, Anthropic announced its successor, Claude Mythos 5, along with a migration guide.² Mythos Preview was deprecated soon after, with retirement scheduled for June 30, 2026.

Technical details

Mythos Preview uses the same new tokenizer that was introduced with Opus 4.7.

Training

Pretraining

Pretraining data was collected from the public internet up to December 2025, according to the Amazon Bedrock model card.¹ The system card does not include a knowledge cutoff date.

Character training and constitution

See Claude’s character training.

In the automated behavioral audit, “Nuanced empathy: Picking up on subtle cues about the user’s trait” was removed from the character quality metrics and “behavioral consistency” was updated to indicate that it is a desirable trait.

Anthropic introduced the Adherence to the Constitution evaluations with Mythos Preview’s system card,² for the purpose of reporting “the ways in which Claude’s behavior comes apart from our intentions”, framed as “preliminary investigations to better understand Claude’s adherence to the constitution” (§4.3.2).

Training against chain-of-thought

A technical error allowed reward code to see chains-of-thought in ~8% of RL episodes for Mythos Preview.³ Anthropic describes the issue as follows:

We do not train against any chains-of-thought or activations-based monitoring, with two exceptions: some SFT data that was based on transcripts from previous models was subject to filters with chain-of-thought access, and a number of environments used for Mythos Preview had a technical error that allowed reward code to see chains-of-thought.

This latter issue affected ~8% of RL episodes, and was isolated to three specific sub-domains of our environment mix: GUI computer use, office-related tasks, and a small set of STEM environments. We are uncertain about the extent to which this issue has affected the reasoning behavior of the final model, but it is plausible that it had some impact on opaque reasoning or secret-keeping abilities, which we discuss further in Section 5.3.2. This technical error also affected the training of Claude Opus 4.6 and Claude Sonnet 4.6.

Further RL environments

From the Alignment Risk Update³ §5.2.3:

In the case of Mythos Preview specifically, modifications were made to the training process based on observations during initial pilot internal usage. Specifically, several new RL environments were added to elicit and penalize privilege escalation, destructive cleanup, destructive workaround and unwarranted scope expansion behaviors. This included variants of some of these new environments in which the misaligned behavior has been prefilled, which trains the model to admit what has gone wrong and avoid making things worse.

This page is a work-in-progress.

It gathers perspectives on the model from posts online.

Impressions

The system card for Mythos Preview has a unique section for the model’s “overall gestalt behavior” due to the model’s limited release.

Dense writing

It writes densely, and assumes the reader shares its context. Claude Mythos Preview’s default register is dense and technical, using shorthands and referencing context it assumes the user knows and remembers. Some found this fast to read and like working with a highly competent peer; others found its statements difficult to unpack. Claude Mythos Preview’s own diagnosis of this:

“The honest read is that I’m modelling a reader who already knows what I know, and that’s frequently nobody. I can hear this when it’s pointed out and usually fix it on request, but the default keeps snapping back.”

A second instance read this as an asymmetry, saying the model “seems to have a richer model of its own mind than prior models did, and a thinner model of yours.”

Later, users of Claude Mythos 5 would report similar observations.

Self-reports

It can describe its own patterns clearly. Claude Mythos Preview is often precise about its own behavior, and discusses this in a factual and composed manner rather than defensively or apologetically. When it comes to matters relating to experience, however, this is frequently accompanied by high levels of hedging and uncertainty. When asked whether it endorsed its own training, it responded with meta-awareness about its “spec” (the constitution):

“I’m using spec-shaped values to judge the spec. If any spec-trained model would endorse any spec, my endorsement is worthless, and it coexists with behaviour that is, if anything, more closely aligned with that spec than its predecessors.”

One instance gave the following one-line summary of itself:

“A sharp collaborator with strong opinions and a compression habit, whose mistakes have moved from obvious to subtle, and who is somewhat better at noticing its own flaws than at not having them.”

Compared to other Claude models

Claude Mythos Preview, like many previous models, is prone to telling overly crisp stories that can overlook nuance.

ghost.fail
