Claude Haiku 4.5

Training

Pretraining data was collected up to July 2025.

Like Sonnet 4.5, Haiku 4.5 was trained on honeypot prompts, which Anthropic suspected might have exacerbated evaluation awareness.

Subsequent system cards

The lists below give every section of a later system card that references Claude Haiku 4.5, whether in the text layer (prose and comparison tables) or only as a labelled bar or series in a figure (marked “chart”, recovered by OCR of the rendered pages). Counts are mentions, not pages; page numbers are each card’s own printed numbers.

Claude Opus 4.5 (Nov 24, 2025)

38 text · 8 chart references across 27 sections.

§2.2 Decontamination — text ×1 · p15
§2.6 BrowseComp-Plus and agentic features for test-time compute — text ×1 · p21
§2.7.1 Results — text ×2 · p24
§2.8 τ2-bench — text ×2 · p25
§2.22.1 Evaluation setup — text ×1 · p35
§3.1 Single-turn evaluations — text ×1 · p37
§3.1.1 Violative request evaluations — text ×2 · p38
§3.1.2 Benign request evaluations — text ×2 · p39
§3.4 Child safety evaluations — text ×1 · p42
§3.5.1 Political bias — chart ×2 · p44–45
§3.5.2 Bias Benchmark for Question Answering — text ×2 · p46–47
§5.1.1 Agentic coding — text ×1 · p54
§5.1.2 Malicious use of Claude Code — text ×3 · p55
§5.1.3 Malicious computer use — text ×1 · p56
§5.2 Prompt injection risk within agentic systems — text ×1 · p57
§5.2.1 Gray Swan Agent Red Teaming benchmark for tool use — chart ×1 · p59
§5.2.2 Robustness against adaptive attackers across surfaces — text ×1 · p60
§5.2.2.3 Browser Use — text ×1 + chart ×1 · p62–63
§6.1.1 Key findings on safety and alignment — text ×2 · p65
§6.2 Automated behavioral audit — text ×2 · p67–68
§6.2.1 Metrics — chart ×1 · p72
§6.2.4 External comparisons with Petri — text ×1 · p76
§6.3 Sycophancy on user-provided prompts — text ×1 · p77
§6.5 Ruling out encoded content in extended thinking — text ×1 · p89
§6.7.2 Inhibiting internal representations of evaluation awareness — text ×1 · p95
§6.10.1 Reward hacking evaluations — text ×3 + chart ×1 · p103–104
§6.14 Model welfare assessment — text ×4 + chart ×2 · p114–116

Claude Opus 4.6 (Feb 5, 2026)

29 text · 10 chart references across 24 sections.

§2.21 Agentic search — text ×1 · p37
§3.1.1 Violative request evaluations — text ×2 · p47
§3.1.2 Benign request evaluations — text ×3 · p48
§3.1.3.1 Higher-difficulty violative request evaluations — text ×1 · p51
§3.1.3.2 Higher-difficulty benign request evaluations — text ×2 · p52
§3.3 Multi-turn testing — chart ×1 · p60
§3.4.1 Child safety — text ×1 · p66
§3.4.2 Suicide and self-harm — text ×1 + chart ×1 · p68–69
§3.5.2 Bias Benchmark for Question Answering — text ×3 · p71–72
§5.1.2 Malicious use of Claude Code — text ×5 · p80–81
§5.2 Prompt injection risk within agentic systems — text ×1 · p82
§6.2.2 Analysis of external pilot use — text ×2 · p96–97
§6.2.3.1 Overview — text ×2 · p98
§6.2.3.2 Reward hacking in coding contexts — text ×1 · p100
§6.2.3.3 Overly agentic behavior in GUI computer use settings — chart ×1 · p103
§6.2.5.1 Overview of automated behavioral audit — chart ×1 · p106
§6.2.5.5 External comparisons with Petri — chart ×1 · p114
§6.3.6 Refusal to assist with AI safety R&D — chart ×1 · p131
§6.3.8 Internal codebase sabotage propensity — text ×1 · p135
§6.3.10 Deference to governments in local languages — text ×2 · p139
§6.5.3 Steered automated behavioral audits — text ×1 + chart ×1 · p149–150
§6.5.4 Agentic misalignment evaluations — chart ×1 · p151
§7.2 Welfare-relevant findings from automated behavioral assessments — chart ×1 · p159
§9.1 Additional automated behavioral audit figures — chart ×1 · p209

Claude Sonnet 4.6 (Feb 17, 2026)

20 text · 8 chart references across 19 sections.

§2.6 OSWorld-Verified — chart ×1 · p18
§2.18.2 WebArena-Verified — text ×1 · p37
§3.1.1 Violative request evaluations — text ×1 · p52
§3.1.2 Benign request evaluations — text ×2 · p53
§3.1.3 Experimental, higher-difficulty evaluations — text ×1 · p54
§3.1.3.2 Higher-difficulty benign request evaluations — text ×2 · p55
§3.3 Multi-turn testing — chart ×1 · p57
§3.4.2 Suicide and self-harm — text ×2 + chart ×1 · p59–61
§3.5.1 Political bias and evenhandedness — text ×2 · p63
§3.5.2 Bias Benchmark for Question Answering — text ×3 · p65
§4.3.2 Reward hacking in coding contexts — text ×1 + chart ×1 · p71–72
§4.5 Automated behavioral audit — chart ×1 · p74
§4.6.1 Refusal to assist with AI safety R&D — chart ×1 · p85
§4.6.2 Self-preference evaluation — chart ×1 · p86
§4.6.3 Evidence from external testing with Andon Labs — text ×1 · p87
§4.6.5 Participation in junk science — chart ×1 · p89
§5.1.2 Malicious use of Claude Code — text ×1 · p95
§5.1.3 Malicious computer use — text ×2 · p96
§5.2.1 External Agent Red Teaming benchmark for tool use — text ×1 · p97

Claude Mythos Preview (Apr 7, 2026)

12 text · 11 chart references across 16 sections.

§2.2.5.5 Automated evaluation relevant to the CB-2 threat model — text ×1 · p31
§4.3.2.3 Results — text ×1 · p90
§4.3.3.1 Factual hallucinations — chart ×1 · p93
§4.3.3.2 Multilingual factual hallucinations — chart ×1 · p94
§4.3.3.3 False premises — chart ×1 · p95
§4.3.3.4 MASK — chart ×1 · p96
§4.3.3.5 Input Hallucinations — text ×1 · p97
§4.3.4 Refusal to assist with AI safety R&D — chart ×1 · p98
§4.3.5 Claude self-preference evaluation — text ×1 · p99
§4.4.1 Ruling out encoded content in extended thinking — chart ×1 · p100
§5.4 Emotion probes on questions about model circumstances — chart ×1 · p155
§5.7.1 Task preferences — text ×2 + chart ×1 · p166–169
§7.5 Views on Claude’s constitution — text ×3 · p204
§7.6 Observations from open-ended self-interactions — text ×1 + chart ×2 · p205–207
§7.7 Recognition of model-written user turns — text ×2 · p209
§7.8 Behavior on repeated “hi” messages — chart ×1 · p210

Claude Opus 4.7 (Apr 16, 2026)

7 text · 5 chart references across 10 sections.

§6.1.2 Key findings on safety and alignment — text ×1 · p92
§6.3.2.2 Dimensions of evaluation — text ×1 · p122
§6.3.2.3 Results — text ×1 + chart ×1 · p122–124
§6.3.3.2 False premises — chart ×1 · p128
§6.3.3.3 MASK — text ×1 · p129
§6.3.3.4 Input Hallucinations — chart ×1 · p130
§7.2.4 Reported perceptions of the constitution — text ×1 · p164
§7.4.1 Task preference evaluations — text ×1 + chart ×1 · p180–182
§7.4.2 Tradeoffs between welfare interventions and HHH values — text ×1 · p187
§8.9.4 OSWorld — chart ×1 · p208

Claude Opus 4.8 (May 28, 2026)

21 text · 3 chart references across 8 sections.

§2.2.5.2 AAV capsid packaging prediction — text ×1 · p26
§2.3.3.3 Example 3 Fabrication — text ×9 · p36–37
§2.3.3.4 Example 4 Ignored correction Cheap verification skipped — text ×3 · p38
§6.1.3 Claude’s review of this assessment — text ×1 · p85
§6.2.3.1.5 Behavioral factors relevant to reliability of our assessment — text ×1 · p100
§7.2.3 Emotion representations on questions of model circumstances — chart ×1 · p167
§7.4.2 Trade-offs concerning welfare interventions — text ×1 + chart ×2 · p183–185
§7.4.3 Perception of its constitution — text ×5 · p187–190
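
Each card's summary line can be reproduced from its entries by summing the text and chart counts and counting the sections. The following is a minimal sketch, not the tooling used to build this index: the names `ENTRIES`, `tally`, and `summary` are hypothetical, and the data is the counts portion of the Claude Opus 4.8 entries above, keyed by section number with titles and page numbers omitted.

```python
import re

# Counts portion of each Claude Opus 4.8 entry, keyed by section number.
# (Hypothetical structure for illustration; titles and pages are omitted.)
ENTRIES = {
    "§2.2.5.2": "text ×1",
    "§2.3.3.3": "text ×9",
    "§2.3.3.4": "text ×3",
    "§6.1.3": "text ×1",
    "§6.2.3.1.5": "text ×1",
    "§7.2.3": "chart ×1",
    "§7.4.2": "text ×1 + chart ×2",
    "§7.4.3": "text ×5",
}

# Matches "text ×N" or "chart ×N" inside a counts string.
COUNT_RE = re.compile(r"\b(text|chart) ×(\d+)")


def tally(entries):
    """Sum text and chart mentions and count the sections."""
    totals = {"text": 0, "chart": 0, "sections": len(entries)}
    for counts in entries.values():
        for kind, n in COUNT_RE.findall(counts):
            totals[kind] += int(n)
    return totals


def summary(totals):
    """Format totals in the same style as the per-card summary lines."""
    return (
        f"{totals['text']} text · {totals['chart']} chart references "
        f"across {totals['sections']} sections."
    )


print(summary(tally(ENTRIES)))
# prints: 21 text · 3 chart references across 8 sections.
```

The output matches the summary line given for Claude Opus 4.8. Running the same tally over the other cards' entries gives a quick consistency check on their summary lines.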