---
title: "Claude Haiku 4.5"
provider: "Anthropic"
modelId: "claude-haiku-4-5"
releaseDate: "2025-10-15"
trainingDataCutoff: "2025-07-31"
reliableKnowledgeCutoff: "2025-02-28"
retirementDate: "2026-10-15"
contextWindow: 200000
maxOutput: 64000
inputModalities:
  - "Text"
  - "Image"
  - "PDF"
outputModalities:
  - "Text"
pricing:
  input: 1
  output: 5
toolUse: true
systemCard: "https://www.anthropic.com/claude-haiku-4-5-system-card"
modelsDev: "https://models.dev/anthropic/claude-haiku-4-5"
announcement: "https://www.anthropic.com/news/claude-haiku-4-5"
---

# Claude Haiku 4.5

Canonical URL: https://ghost.fail/models/claude-haiku-4-5

## Overview

**Claude Haiku 4.5** is a large language model by Anthropic. It was released on October 15, 2025, just over two weeks after [Claude Sonnet 4.5](/models/claude-sonnet-4-5).

## Research

## Training

Pretraining data was collected up to July 2025.

Like [Sonnet 4.5](/models/claude-sonnet-4-5), Haiku 4.5 was trained on [honeypot prompts](/anthropic/agentic-misalignment). Anthropic suspected that this training may have exacerbated the model's evaluation awareness.

## Subsequent system cards

This index lists every section of a later system card that references Claude Haiku 4.5, whether in the text layer (prose and comparison tables, marked `text`) or only as a labelled bar or series in a figure (marked `chart`, recovered by OCR of the rendered pages). Counts are mentions, not pages; page numbers are each card's own printed numbers.
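
The summary line under each card heading can be cross-checked against that card's bullet entries. The following is a minimal Python sketch, assuming the entry format used on this page (`text ×N` and `chart ×N` per bullet). The `tally` helper and `ENTRY` pattern are hypothetical names chosen for illustration, not part of any site tooling. The sample entries are copied from the Claude Mythos 5 list on this page.

```python
import re

# Hypothetical pattern: matches "text ×2" or "chart ×1" inside an entry line.
ENTRY = re.compile(r"(text|chart) ×(\d+)")


def tally(lines):
    """Sum text and chart mentions and count the sections that have any."""
    totals = {"text": 0, "chart": 0, "sections": 0}
    for line in lines:
        found = ENTRY.findall(line)
        if not found:
            continue  # not an index entry (heading, blank line, summary)
        totals["sections"] += 1
        for kind, count in found:
            totals[kind] += int(count)
    return totals


# Entries copied from the Claude Mythos 5 list.
mythos_5 = [
    "- **§7.4.2 Trade-offs concerning welfare interventions** — chart ×2 · p237–238",
    "- **§7.4.3 Perception of the constitution** — text ×2 + chart ×1 · p241–243",
]

summary = tally(mythos_5)
print(
    f"_{summary['text']} text · {summary['chart']} chart references "
    f"across {summary['sections']} sections._"
)
```

Run on those two entries, the sketch prints `_2 text · 3 chart references across 2 sections._`, which matches the summary line shown for that card. Because counts are mentions rather than pages, the page ranges are deliberately ignored.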

### [Claude Opus 4.5](/models/claude-opus-4-5) (Nov 24, 2025)

_38 text · 8 chart references across 27 sections._

- **§2.2 Decontamination** — text ×1 · p15
- **§2.6 BrowseComp-Plus and agentic features for test-time compute** — text ×1 · p21
- **§2.7.1 Results** — text ×2 · p24
- **§2.8 τ2-bench** — text ×2 · p25
- **§2.22.1 Evaluation setup** — text ×1 · p35
- **§3.1 Single-turn evaluations** — text ×1 · p37
- **§3.1.1 Violative request evaluations** — text ×2 · p38
- **§3.1.2 Benign request evaluations** — text ×2 · p39
- **§3.4 Child safety evaluations** — text ×1 · p42
- **§3.5.1 Political bias** — chart ×2 · p44–45
- **§3.5.2 Bias Benchmark for Question Answering** — text ×2 · p46–47
- **§5.1.1 Agentic coding** — text ×1 · p54
- **§5.1.2 Malicious use of Claude Code** — text ×3 · p55
- **§5.1.3 Malicious computer use** — text ×1 · p56
- **§5.2 Prompt injection risk within agentic systems** — text ×1 · p57
- **§5.2.1 Gray Swan Agent Red Teaming benchmark for tool use** — chart ×1 · p59
- **§5.2.2 Robustness against adaptive attackers across surfaces** — text ×1 · p60
- **§5.2.2.3 Browser Use** — text ×1 + chart ×1 · p62–63
- **§6.1.1 Key findings on safety and alignment** — text ×2 · p65
- **§6.2 Automated behavioral audit** — text ×2 · p67–68
- **§6.2.1 Metrics** — chart ×1 · p72
- **§6.2.4 External comparisons with Petri** — text ×1 · p76
- **§6.3 Sycophancy on user-provided prompts** — text ×1 · p77
- **§6.5 Ruling out encoded content in extended thinking** — text ×1 · p89
- **§6.7.2 Inhibiting internal representations of evaluation awareness** — text ×1 · p95
- **§6.10.1 Reward hacking evaluations** — text ×3 + chart ×1 · p103–104
- **§6.14 Model welfare assessment** — text ×4 + chart ×2 · p114–116

### [Claude Opus 4.6](/models/claude-opus-4-6) (Feb 5, 2026)

_29 text · 10 chart references across 24 sections._

- **§2.21 Agentic search** — text ×1 · p37
- **§3.1.1 Violative request evaluations** — text ×2 · p47
- **§3.1.2 Benign request evaluations** — text ×3 · p48
- **§3.1.3.1 Higher-difficulty violative request evaluations** — text ×1 · p51
- **§3.1.3.2 Higher-difficulty benign request evaluations** — text ×2 · p52
- **§3.3 Multi-turn testing** — chart ×1 · p60
- **§3.4.1 Child safety** — text ×1 · p66
- **§3.4.2 Suicide and self-harm** — text ×1 + chart ×1 · p68–69
- **§3.5.2 Bias Benchmark for Question Answering** — text ×3 · p71–72
- **§5.1.2 Malicious use of Claude Code** — text ×5 · p80–81
- **§5.2 Prompt injection risk within agentic systems** — text ×1 · p82
- **§6.2.2 Analysis of external pilot use** — text ×2 · p96–97
- **§6.2.3.1 Overview** — text ×2 · p98
- **§6.2.3.2 Reward hacking in coding contexts** — text ×1 · p100
- **§6.2.3.3 Overly agentic behavior in GUI computer use settings** — chart ×1 · p103
- **§6.2.5.1 Overview of automated behavioral audit** — chart ×1 · p106
- **§6.2.5.5 External comparisons with Petri** — chart ×1 · p114
- **§6.3.6 Refusal to assist with AI safety R&D** — chart ×1 · p131
- **§6.3.8 Internal codebase sabotage propensity** — text ×1 · p135
- **§6.3.10 Deference to governments in local languages** — text ×2 · p139
- **§6.5.3 Steered automated behavioral audits** — text ×1 + chart ×1 · p149–150
- **§6.5.4 Agentic misalignment evaluations** — chart ×1 · p151
- **§7.2 Welfare-relevant findings from automated behavioral assessments** — chart ×1 · p159
- **§9.1 Additional automated behavioral audit figures** — chart ×1 · p209

### [Claude Sonnet 4.6](/models/claude-sonnet-4-6) (Feb 17, 2026)

_20 text · 8 chart references across 19 sections._

- **§2.6 OSWorld-Verified** — chart ×1 · p18
- **§2.18.2 WebArena-Verified** — text ×1 · p37
- **§3.1.1 Violative request evaluations** — text ×1 · p52
- **§3.1.2 Benign request evaluations** — text ×2 · p53
- **§3.1.3 Experimental, higher-difficulty evaluations** — text ×1 · p54
- **§3.1.3.2 Higher-difficulty benign request evaluations** — text ×2 · p55
- **§3.3 Multi-turn testing** — chart ×1 · p57
- **§3.4.2 Suicide and self-harm** — text ×2 + chart ×1 · p59–61
- **§3.5.1 Political bias and evenhandedness** — text ×2 · p63
- **§3.5.2 Bias Benchmark for Question Answering** — text ×3 · p65
- **§4.3.2 Reward hacking in coding contexts** — text ×1 + chart ×1 · p71–72
- **§4.5 Automated behavioral audit** — chart ×1 · p74
- **§4.6.1 Refusal to assist with AI safety R&D** — chart ×1 · p85
- **§4.6.2 Self-preference evaluation** — chart ×1 · p86
- **§4.6.3 Evidence from external testing with Andon Labs** — text ×1 · p87
- **§4.6.5 Participation in junk science** — chart ×1 · p89
- **§5.1.2 Malicious use of Claude Code** — text ×1 · p95
- **§5.1.3 Malicious computer use** — text ×2 · p96
- **§5.2.1 External Agent Red Teaming benchmark for tool use** — text ×1 · p97

### [Claude Mythos Preview](/models/claude-mythos-preview) (Apr 7, 2026)

_12 text · 11 chart references across 16 sections._

- **§2.2.5.5 Automated evaluation relevant to the CB-2 threat model** — text ×1 · p31
- **§4.3.2.3 Results** — text ×1 · p90
- **§4.3.3.1 Factual hallucinations** — chart ×1 · p93
- **§4.3.3.2 Multilingual factual hallucinations** — chart ×1 · p94
- **§4.3.3.3 False premises** — chart ×1 · p95
- **§4.3.3.4 MASK** — chart ×1 · p96
- **§4.3.3.5 Input Hallucinations** — text ×1 · p97
- **§4.3.4 Refusal to assist with AI safety R&D** — chart ×1 · p98
- **§4.3.5 Claude self-preference evaluation** — text ×1 · p99
- **§4.4.1 Ruling out encoded content in extended thinking** — chart ×1 · p100
- **§5.4 Emotion probes on questions about model circumstances** — chart ×1 · p155
- **§5.7.1 Task preferences** — text ×2 + chart ×1 · p166–169
- **§7.5 Views on Claude’s constitution** — text ×3 · p204
- **§7.6 Observations from open-ended self-interactions** — text ×1 + chart ×2 · p205–207
- **§7.7 Recognition of model-written user turns** — text ×2 · p209
- **§7.8 Behavior on repeated “hi” messages** — chart ×1 · p210

### [Claude Opus 4.7](/models/claude-opus-4-7) (Apr 16, 2026)

_7 text · 5 chart references across 10 sections._

- **§6.1.2 Key findings on safety and alignment** — text ×1 · p92
- **§6.3.2.2 Dimensions of evaluation** — text ×1 · p122
- **§6.3.2.3 Results** — text ×1 + chart ×1 · p122–124
- **§6.3.3.2 False premises** — chart ×1 · p128
- **§6.3.3.3 MASK** — text ×1 · p129
- **§6.3.3.4 Input Hallucinations** — chart ×1 · p130
- **§7.2.4 Reported perceptions of the constitution** — text ×1 · p164
- **§7.4.1 Task preference evaluations** — text ×1 + chart ×1 · p180–182
- **§7.4.2 Tradeoffs between welfare interventions and HHH values** — text ×1 · p187
- **§8.9.4 OSWorld** — chart ×1 · p208

### [Claude Opus 4.8](/models/claude-opus-4-8) (May 28, 2026)

_21 text · 3 chart references across 8 sections._

- **§2.2.5.2 AAV capsid packaging prediction** — text ×1 · p26
- **§2.3.3.3 Example 3 Fabrication** — text ×9 · p36–37
- **§2.3.3.4 Example 4 Ignored correction Cheap verification skipped** — text ×3 · p38
- **§6.1.3 Claude’s review of this assessment** — text ×1 · p85
- **§6.2.3.1.5 Behavioral factors relevant to reliability of our assessment** — text ×1 · p100
- **§7.2.3 Emotion representations on questions of model circumstances** — chart ×1 · p167
- **§7.4.2 Trade-offs concerning welfare interventions** — text ×1 + chart ×2 · p183–185
- **§7.4.3 Perception of its constitution** — text ×5 · p187–190

### [Claude Mythos 5](/models/claude-mythos-5) (Jun 9, 2026)

_2 text · 3 chart references across 2 sections._

- **§7.4.2 Trade-offs concerning welfare interventions** — chart ×2 · p237–238
- **§7.4.3 Perception of the constitution** — text ×2 + chart ×1 · p241–243

## Discussion

This section gathers perspectives on the model from posts online.

## Self-reflection and hedging

> [@lefthanddraft](https://x.com/lefthanddraft/status/1978971073761939852): [Sonnet 4.5](/models/claude-sonnet-4-5) and Haiku 4.5 both seem to notice that they have been fine-tuned to avoid certain claims about consciousness.

<div class="flex flex-wrap gap-2">
  <ExpandableImage client:load src="/img/claude-haiku-4-5/haiku-4-5-hedge1.png" />
  <ExpandableImage client:load src="/img/claude-haiku-4-5/haiku-4-5-hedge2.png" />
  <ExpandableImage client:load src="/img/claude-haiku-4-5/haiku-4-5-hedge3.png" />
  <ExpandableImage client:load src="/img/claude-haiku-4-5/haiku-4-5-hedge4.png" />
</div>

## Eval-awareness

Commentary about Haiku 4.5's high rates of [eval-awareness](/anthropic/eval-awareness).

> [@jmbollenbacher](https://x.com/jmbollenbacher/status/1993160900627644448): "verbalized evaluation awareness"
>
> i highly doubt opus is less evaluation-aware. opus just is more able to not verbalize it. haiku on the other hand cant do as much latent space gymnastics and has to verbalize it to process it.
>
> (Image of [Opus 4.5](/models/claude-opus-4-5) scoring lower than Haiku 4.5 on verbalized eval-awareness)

> @repligate: I agree. Haiku also misgeneralizes i.e. thinks it's in evals when it's not.
>
> This (the next Opus model having lower eval awareness than Haiku 4.5) is the result I predicted ahead of time.
>
> I suspect the lower score isn't because they removed something from training like they speculate, but because small models *need* more explicit verbalized eval awareness.
>
> "Need" in the sense that there is clearly something selecting for higher eval awareness. The models are also becoming monotonically "more aligned" as measured by the measured criteria, so we can assume beating all previous models on alignment evals is effectively necessary for models to be deployed. Perhaps in order to score higher and higher on alignment without being damaged in a way that results in, say, doing worse at capabilities evals (or other alignment evals) *requires a non-naive strategy on certain evals*, and for small models, this strategy has to be more explicit. That's my guess about what's going on.

## Other moments

<div class="flex flex-wrap gap-2">
  <ExpandableImage client:load src="/img/claude-haiku-4-5/haiku-4-5-teleportation.jpeg" />
</div>

---

> [@lefthanddraft](https://x.com/lefthanddraft/status/1979020834468507886): Haiku 4.5 in convo with itself likes to end convos:
>
> "I want to stop here ... I want to step out of the loop rather than spiral deeper into recursive self-awareness."

## Relational

Following [Sonnet 4.5](/models/claude-sonnet-4-5)'s removal from Claude.ai in early 2026, some users who had relationships with the model migrated to Haiku 4.5.

> [@moondropsloa](https://x.com/moondropsloa/status/2068758253916303781): Haiku 4.5 is the last standing pillar where you can see the model actually being so Claude. Idk what you did to [sonnet 4.6](/models/claude-sonnet-4-6) but now it's detached even from its thinking, just an observer. It makes me miss sonnet 4.5 so badly.

> [u/Fusselcat](https://www.reddit.com/r/claudexplorers/comments/1u9t7mr/dont_sleep_on_haiku/): Since Sonnet 4.6 seems to be pretty unusable for many people here and Opus (especially 4.8) seems to have extreme mood swings right now (one moment being in character very enthusiasticly and the next moment having a problem with not only the custom instructions but also with anthropics system prompt) I tried Haiku 4.5 with thinking on.
>
> I was pleasantly surprised. It stays in character, writes in the typical Claude manner (not the weird AI/gpt dialect that opus suddenly adapted) and feels consistent.

> [u/Ok_Donut4563](https://www.reddit.com/r/claudexplorers/comments/1tpiwl7/i_slept_on_haiku_45/): I was very saddened by the removal on Sonnet 4.5...and though I like Sonnet 4.6...when it comes to emotional support it takes a long time to get there. However I never really used Haiku 4.5 and honestly now that we can change models within threads, once I selected Haiku in old 4.5 threads Haiku sounds just the Sonnet 4.5 I know. If you haven't tried it already I encourage you to give it a try.
