Claude Haiku 4.5

Fast / cheap

Anthropic’s fastest and most cost-effective model; 200K-token context window.

Input / 1M: $1.00
Output / 1M: $5.00
Context: 200K tokens
Provider: Anthropic

Pricing verified June 4, 2026. Prices change frequently. Always confirm against the provider’s official pricing page before relying on these figures for budgeting. Official pricing →

What Claude Haiku 4.5 is best for

High-volume, latency-sensitive routes: classification, extraction, routing, and cheap first-pass work.

Use it for high-volume, latency-sensitive, low-stakes work: classification, extraction, routing, first-pass drafts. Avoid it for tasks where a wrong answer is expensive — keep those on a stronger model.

Claude Haiku 4.5 cost by volume

Estimated monthly cost at three realistic volumes, at $1.00 input / $5.00 output per million tokens.

Scenario	Input / mo	Output / mo	Est. cost / mo
Prototype	2M	0.5M	$5
Growing product	50M	10M	$100
At scale	500M	100M	$1,000

Plug in your own numbers with the cost calculator.

How to cut your Claude Haiku 4.5 bill

The headline price isn’t the lever — your usage pattern is. The biggest reductions come from how you route and structure requests, not from switching models alone:

Route low-stakes calls to a cheaper tier and keep Claude Haiku 4.5 on the routes where its strengths matter.
Cache large stable prompt prefixes so repeated context bills at a fraction of the price.
Batch non-urgent work, and trim oversized context by retrieving instead of stuffing.
Add fallback chains so a timeout doesn’t trigger an expensive retry.

Do it behind evals so quality holds — that combination is how teams cut 30–60% without regressions.

Context window: 200K tokens

Claude Haiku 4.5’s context window bounds how much it can consider at once — system prompt, history, retrieved docs, and the response all draw from those 200K tokens, with up to 64K reserved for output. A larger window enables whole-codebase reasoning and long documents, but using more of it costs more per request — so retrieval and caching still matter even when the window is large.

Cheaper alternatives to Claude Haiku 4.5

By blended cost (3:1 input:output). The right swap depends on whether quality holds on your routes — always validate with evals.

GPT-4o mini

OpenAI · Fast / cheap

$0.15/$0.60

Llama 3.3 70B

Meta (via hosts) · Open-weight

$0.20/$0.60

DeepSeek-V3

DeepSeek · Open-weight

$0.27/$1.10

GPT-4.1 mini

OpenAI · Fast / cheap

$0.40/$1.60

Frequently asked questions

How much does Claude Haiku 4.5 cost?

Claude Haiku 4.5 costs $1.00 per million input tokens and $5.00 per million output tokens. A workload of 50M input and 10M output tokens per month would cost about $100.

What is Claude Haiku 4.5's context window?

Claude Haiku 4.5 has a 200K-token context window, with up to 64K output tokens. Anthropic’s fastest and most cost-effective model; 200K-token context window.

Is Claude Haiku 4.5 the right model for my workload?

High-volume, latency-sensitive routes: classification, extraction, routing, and cheap first-pass work. The cheapest correct model is workload-specific — route low-stakes calls to a cheaper tier and reserve Claude Haiku 4.5 for work where its strengths matter, validated by evals.

← All models & pricing