Claude API Pricing Calculator

[Interactive calculator — inputs: Claude model; input tokens per request (system prompt + user message; ~4 characters ≈ 1 token); output tokens per request; total API calls in a typical month; display currency (Anthropic bills in USD; other currencies are indicative); optional prompt-caching toggle (cached input tokens billed at ~10% of the standard rate, first cache write at ~125%). Outputs: estimated monthly and annual cost, cost per request, input and output token costs, total monthly tokens, and the monthly saving from prompt caching.]
Note: Estimates use Anthropic's published list prices. Actual bills may differ due to prompt caching, batch API discounts, rounding, or future price changes. Verify at anthropic.com/pricing.

What Is Claude API Pricing?

Claude API pricing is a pay-per-token model where you pay separately for text you send (input tokens) and text Claude generates (output tokens). There is no monthly subscription — you pay only for what you use, billed in USD per million tokens. Prices range from $0.25 to $30.00 per million input tokens depending on the model chosen.

Anthropic charges for API access based on token consumption. Because input and output tokens carry different rates — and output tokens typically cost more per million — understanding your prompt-to-response ratio is the most important variable when projecting costs. The model family you choose (Haiku, Sonnet, or Opus) and the generation (3, 3.5, 4, 4.5, 4.6) both significantly affect your bill.

Newer model generations like Claude Sonnet 4.6 and Opus 4.6 offer larger 1M-token context windows alongside competitive pricing, making them suitable for long-document analysis, agentic workflows, and complex multi-step tasks that older generations could not handle cost-effectively.

Source: Anthropic, "Claude API Pricing," anthropic.com/pricing, accessed July 2025.

All Claude Models & Prices

The table below lists every Claude model currently documented by Anthropic. Models marked "N/A" for price are deprecated or legacy versions no longer available via the standard API; they are included for reference. Use the search and filter controls to find specific models. Click any column header to sort the table.

[Interactive table — columns: Model | Tier | Input $/M | Output $/M | Context | Released | Weekly Usage]

Sources: Anthropic, "Claude API Pricing," anthropic.com/pricing; Anthropic, "Models Overview," docs.anthropic.com. Weekly token usage figures sourced from OpenRouter usage data. Prices and availability subject to change.

Understanding Tokens

A token is the smallest unit of text that Claude processes. Tokens do not map one-to-one with words — they are vocabulary fragments defined by the model's tokenizer. Understanding approximate token counts helps you predict costs accurately before running any API calls.

  • ≈ 4 English characters per token, on average
  • ≈ 0.75 English words per token
  • ≈ 1,333 tokens in a typical 1,000-word document
  • 100K tokens ≈ a 75,000-word novel
  • Code, math, and non-English text use more tokens per word
  • The usage{} field in every API response reports exact token counts per call

The most reliable approach is to instrument your code to log the usage object returned in every API response. It contains input_tokens and output_tokens for each call. Average 20–30 representative requests, then use those real numbers in the calculator above.
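As a sketch of that averaging step, a small helper can reduce logged usage records to the averages you feed into the calculator (the dict shape below simply mirrors the input_tokens/output_tokens fields the API returns; the sample values are made up):

```python
def average_usage(records):
    """Average input/output token counts from logged `usage` objects.

    Each record is a dict like {"input_tokens": 812, "output_tokens": 304},
    mirroring the usage field returned with every API response.
    """
    n = len(records)
    avg_in = sum(r["input_tokens"] for r in records) / n
    avg_out = sum(r["output_tokens"] for r in records) / n
    return avg_in, avg_out

logs = [
    {"input_tokens": 900, "output_tokens": 250},
    {"input_tokens": 1100, "output_tokens": 350},
    {"input_tokens": 1000, "output_tokens": 300},
]
print(average_usage(logs))  # (1000.0, 300.0)
```

Run this over 20–30 logged calls and the resulting averages are far more trustworthy than character-count guesses.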

Source: Anthropic, "Token counting," docs.anthropic.com, accessed July 2025.

How to Estimate Your Claude API Costs

The formula the calculator above uses is straightforward. You can verify any result manually:

Monthly Cost =
  (Input Tokens/Req  × Requests/Month × Input Price  ÷ 1,000,000)
+ (Output Tokens/Req × Requests/Month × Output Price ÷ 1,000,000)

Cost per Request = Monthly Cost ÷ Requests per Month
Annual Cost      = Monthly Cost × 12
Source: Anthropic per-million-token pricing model — anthropic.com/pricing

A practical approach: run 20–30 representative API calls in development and log the usage field from each response. Average the input and output token counts, then multiply by your expected monthly request volume. If you are in the planning phase with no usage data, use these benchmarks: a typical chatbot turn uses 500–1,500 input tokens and 200–500 output tokens; a document summarisation task might use 3,000–8,000 input tokens and 300–800 output tokens.
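The formula above translates directly to code. This sketch uses the chatbot benchmark figures and illustrative Sonnet-class rates of $3/$15 per million tokens; substitute the current prices from the table:

```python
def monthly_cost(in_tok, out_tok, reqs, in_price, out_price):
    """Estimated monthly cost in USD from per-million-token prices."""
    return (in_tok * reqs * in_price + out_tok * reqs * out_price) / 1_000_000

# 1,000 input + 300 output tokens per request, 100,000 requests/month
cost = monthly_cost(1_000, 300, 100_000, in_price=3.00, out_price=15.00)
print(f"monthly=${cost:.2f} per_request=${cost / 100_000:.4f} annual=${cost * 12:.2f}")
# monthly=$750.00 per_request=$0.0075 annual=$9000.00
```

Note how output tokens dominate here despite being fewer: $450 of the $750 comes from the output side, because the output rate is 5x the input rate.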

Prompt Caching Explained

Prompt caching lets you mark a repeated portion of your prompt — such as a long system prompt or reference document — so Anthropic does not reprocess it on every request. Cached tokens are billed at approximately 10% of the standard input token rate, which can substantially cut costs for workflows that reuse large fixed contexts.

To use prompt caching, add a cache_control parameter to the content blocks you want cached. The first request that creates the cache is billed at approximately 125% of the standard input rate (the write cost); all subsequent requests hitting that cache are billed at approximately 10% (the read cost). Anthropic stores the cache for up to 5 minutes, extended by additional cache hits.

Prompt caching is most valuable when: your system prompt exceeds 1,024 tokens; you pass the same large document or knowledge base with every request; or you run multi-turn conversations where earlier turns are prepended to each new call. Use the caching toggle in the calculator above to model the potential saving for your specific usage pattern.
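A rough model of the input-side cost under those rates (a sketch assuming a single cache write at ~125% followed by reads at ~10%, and ignoring the 5-minute cache expiry, which would add extra writes in practice):

```python
def input_cost_with_caching(cached_tok, fresh_tok, reqs, in_price):
    """Approximate monthly input-token cost (USD) with prompt caching.

    cached_tok: tokens in the fixed prefix marked with cache_control
    fresh_tok:  per-request tokens that are never cached
    Assumes one cache write (~125% of the input rate) and cache reads
    (~10%) on all remaining requests; ignores cache expiry.
    """
    write = cached_tok * 1.25 * in_price / 1_000_000
    reads = cached_tok * 0.10 * in_price * (reqs - 1) / 1_000_000
    fresh = fresh_tok * reqs * in_price / 1_000_000
    return write + reads + fresh

# 5,000-token cached system prompt + 500 fresh tokens, 10,000 requests at $3/M
with_cache = input_cost_with_caching(5_000, 500, 10_000, 3.00)
without = (5_000 + 500) * 10_000 * 3.00 / 1_000_000
print(f"with=${with_cache:.2f} without=${without:.2f}")  # with=$30.02 without=$165.00
```

The larger the fixed prefix relative to the fresh portion, the closer the saving gets to the ~90% ceiling on the cached part.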

Source: Anthropic, "Prompt Caching," docs.anthropic.com, accessed July 2025.

Cost Reduction Tips

  • Route tasks to the right model Build a routing layer that sends simple tasks to Claude Haiku 4.5 ($1/$5 per M tokens) and reserves Sonnet or Opus variants for tasks requiring higher capability. Haiku handles most classification, extraction, and short-form generation tasks at a fraction of the cost of flagship models.
  • Compress your system prompt Every token in your system prompt is billed on every request. Audit yours regularly: remove redundant instructions, rewrite verbose directives concisely, and delete examples Claude does not need. A reduction from 800 to 250 tokens saves 550 input tokens per call. At 50,000 calls/month on Sonnet 4 ($3/M), that saves $82.50 per month.
  • Set explicit max_tokens limits Pass max_tokens in every request. Audit a sample of real responses, find the 95th percentile output length for your use case, and set max_tokens roughly 20% above that value. This prevents occasional runaway responses from inflating your output token costs without cutting off legitimate answers.
  • Use the Batch API for non-real-time work Anthropic's Message Batches API processes requests asynchronously at a 50% discount on both input and output token prices. If your workload includes tasks that do not require an immediate response — bulk analysis, overnight processing, report generation — the Batch API is the single largest cost lever available.
  • Trim conversation history in multi-turn apps In chatbot or agent applications, every previous message is included as input on subsequent calls. Implement a sliding window keeping only the N most recent turns, or summarise older turns into a compact block. Log token counts per session to identify conversations with unusually high input costs.
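As a sketch of the max_tokens tip above: given a sample of logged output lengths, take roughly the 95th percentile and pad it by ~20% (the sample data and the nearest-index percentile are illustrative; a production version might use a proper percentile routine):

```python
def suggest_max_tokens(output_lengths, pad=1.20, pct=0.95):
    """Suggest a max_tokens value ~20% above the 95th-percentile output length."""
    ordered = sorted(output_lengths)
    # simple nearest-index percentile on the sorted sample
    idx = int(pct * (len(ordered) - 1))
    return int(ordered[idx] * pad)

# logged output token counts; the 900-token outlier is a runaway response
sample = [120, 180, 210, 240, 260, 280, 300, 330, 400, 900]
print(suggest_max_tokens(sample))  # 480
```

Here the cap lands at 480 tokens: comfortably above every normal response, while the 900-token runaway would be cut off rather than billed in full.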

Price Source & Calculator Accuracy

All model prices are sourced from Anthropic's official pricing page and model documentation, last verified July 2025. Anthropic may update prices, introduce new models, or deprecate older ones at any time without prior notice. Models listed as N/A for pricing are either deprecated or available only through enterprise arrangements.

The calculator's arithmetic is exact: it applies published per-million-token rates to the values you enter. Accuracy of your estimate depends on how closely your entered token counts and request volumes match real usage. The best way to improve accuracy is to instrument production code to log the usage field from each API response, then use real averages as inputs.

Non-USD currency amounts are indicative conversions using approximate exchange rates. Anthropic invoices exclusively in US Dollars.

Primary source: Anthropic, "Claude API Pricing," anthropic.com/pricing; Anthropic, "Models Overview," docs.anthropic.com/en/docs/about-claude/models. Weekly usage data: OpenRouter model stats. Accessed July 2025.