Learn · Foundations

Which Claude costs do you actually have?

Every dollar your organization pays Anthropic falls into one of three billing streams. Knowing which streams apply to you is step one. From there, each stream has its own forecasting method, monitoring tool, and optimization strategy.

Fixed

Seat Subscriptions

Monthly per-user fees: Pro, Max, Team, or Enterprise. Predictable, billed regardless of usage.

Variable

API Token Usage

Pay-per-token for every API call. Varies with prompt length, output, model, and volume.

Hybrid

Claude Code

Included in some plans. Overages billed at standard API token rates.

A Typical Bill Breakdown

Illustrative monthly spend$5,000/mo example
35%
48%
17%
Seats (fixed)
API (variable)
Claude Code (hybrid)

Illustrative The exact split varies. API-heavy orgs see 70%+ variable; chat-heavy teams lean toward seats.

📊

Visibility

Know where the money goes: by team, product, model.

🏷

Attribution

Every dollar traceable to a workspace, key, team, or product.

🎯

Optimization

Identify waste: wrong models, redundant prompts, unused seats.

🔒

Governance

Limits, policies, rotation schedules prevent drift.

Learn · Foundations

Three Spending Streams

Anthropic bills through three mechanisms. Each has its own pricing model, monitoring tools, and optimization strategies.

How They Interact

A developer on a Team seat gets Claude Code included up to a plan allowance. Premium seats get a larger allowance (6.25x Pro vs 1.25x on Standard). But if that developer also uses API keys in a product, those variable charges stack on top of their seat cost.

Rule of thumb: Attribute seat costs to people (headcount budget) and API costs to products (product budget). Claude Code overage goes to the project budget.

Forecasting Difficulty Opinionated

Seats
Easy
API Tokens
Hard
Claude Code
Medium
Learn · Foundations

Token Economics

Tokens are the fundamental unit of API cost.

What Is a Token?

A chunk of text, roughly 3/4 of a word. A 200-word email ≈ 270 tokens. A 10-page doc ≈ 4,000-5,000 tokens.

Cost = (Input Tokens × Input Price) + (Output Tokens × Output Price)

Token Cost by Model Official

Last verified: Apr 10, 2026 against platform.claude.com/docs/en/about-claude/pricing

Input per 1M tokens

Haiku 4.5
$1
$1.00
Sonnet 4.6
$3
$3.00
Opus 4.6
$5
$5.00

Output per 1M tokens

Haiku 4.5
$5
$5.00
Sonnet 4.6
$15
$15.00
Opus 4.6
$25
$25.00
Learn · Deep Dive

Plans & Pricing

A complete view of Anthropic's current plan lineup, including billing frequency differences and Claude Code availability.

Individual Plans Official

Last verified: Apr 10, 2026 against claude.com/pricing

Free
$0
forever
  • Sonnet 4.5 only
  • ~20 msgs/day
  • No Claude Code
  • No Cowork
Pro
$20
$17/mo if annual ($200/yr)
  • All models
  • Claude Code ✓
  • Cowork ✓
  • ~5x Free usage
Max
$100-$200
5x ($100) or 20x ($200)
  • 5x or 20x Pro usage
  • Claude Code ✓
  • Higher output limits
  • Priority access

Team & Enterprise Plans Official

Last verified: Apr 10, 2026 against claude.com/pricing and support.claude.com

PlanAnnualMonthlyMin SeatsClaude Code
Team Standard$20/seat$25/seat5Included
Team Premium$100/seat$125/seat5Included
Enterprise (self-serve)$20/seat + API usage20Included
Enterprise (sales)CustomContact salesIncluded

Annual vs Monthly matters. Team Standard is $20/seat annual vs $25 monthly, a 20% difference. Team Premium is $100 vs $125, also 20%. For a 20-person team on Standard, that's $1,200/year saved by committing annually.

Claude Code Availability Official

Last verified: Apr 10, 2026 against claude.com/product/claude-code

Claude Code is available with:

  • Pro plan ($20/mo)
  • Max plan ($100-200/mo)
  • Team Standard seats ($20-25/seat)
  • Team Premium seats ($100-125/seat)
  • Enterprise plans
  • Anthropic Console / API account (pay-per-token)
  • Free plan (not available)

Enterprise: Not Just "Custom" Opinionated

Enterprise is often assumed to be a single black-box "contact sales" tier. In practice, Anthropic offers two Enterprise paths:

Self-serve Enterprise ($20/seat + API usage, min 20 seats, annual commitment): organizations can start today without contacting sales. It includes SSO, SCIM, audit logs, 500K context window, and compliance features.

Sales-assisted Enterprise (custom pricing, contact sales): for organizations needing tailored terms, usage commitments, invoicing, product bundling, and HIPAA-ready configurations. Minimum seat counts are negotiated.

The self-serve path means Enterprise isn't necessarily more expensive per-seat than Team, it's $20/seat (same as Team Standard annual) plus usage-based API charges. The value is in the governance and compliance features, not a price premium on the seat itself.

Learn · Deep Dive

API Cost Anatomy

Every API call has multiple cost components.

Input Tokens · ~60-75%

System prompt, user message, documents, history, tool definitions. Cheaper per-token but high volume.

Output Tokens · ~20-35%

Response text, thinking, tool calls, code. 3-5x more expensive per token than input.

Feature Charges · ~5-15%

Web search, code execution, fast mode (6x rates), extended caching. Not visible in token counts.

Discounts

Batch API (50% off), prompt cache reads (90% off input), cache writes (1.25x input). Check the Cost page.

Usage ≠ Cost. If tokens are flat but cost rises, someone switched models. If tokens spike but cost doesn't, caching is working. Always compare both Console pages.

Learn · Deep Dive

Claude Code Costs

CLI coding assistant with hybrid billing: included allowances plus variable overages.

Billing Model

On subscription plans (Pro, Max, Team Standard, Team Premium, Enterprise), Claude Code draws from included usage. Once exceeded, overages are billed at standard API rates for the model used. On a Console/API account, all usage is pay-per-token from the start.

Typical spend: Estimate Active developers average $6-12/day. Heavy agentic workflows reach $20-30/day. That's $100-600+/month per developer.

/cost Command

Real-time session spend in the terminal.

📈

Analytics API

Daily per-user metrics via Admin API.

🔑

Key Hygiene

Disable keys from departed team members.

Spend Limits

Set per-user caps. No limit = unbounded risk.

Diagnose

Which Plan Do I Need? Opinionated

Answer five questions to get a recommended configuration. Recommendations reflect the author's judgment, not official Anthropic guidance.

1 How many people need Claude access?

Diagnose

Console Walkthrough

The Anthropic Console at console.anthropic.com is the primary interface.

🏢 Organization Recommended Structure
Product_API

Customer-facing keys

Internal_Tools

Automation & research

Claude_Code

1 key per developer

📊 Usage Page

Token volume by workspace and key. Filter by model. Click bars for hourly detail. Export CSV monthly as your baseline.

💰 Cost Page

Dollar amounts with model pricing, feature charges, discounts. Compare with Usage to catch model switches or caching effects.

⚙ Workspaces

One per product/team. Keys in "Default" = unattributable spend. Create named workspaces and migrate keys.

🔑 API Keys

Name descriptively. Rotate quarterly. Disable when someone leaves. Never share one key across products.

👥 Spend Limits

In Settings. Set per-user caps at 2x median spend. "Unlimited" = unbounded risk.

📡 Admin API

/v1/organizations/usage_report/messages for tokens. /v1/organizations/usage_report/claude_code for Code. Group by workspace, model, time. Worth automating above $1K/mo.

Diagnose

Am I Normal? Estimates

Reference ranges for typical Claude usage. These are approximate ranges based on observed patterns, not official Anthropic data. Use them to identify whether your spend warrants investigation, not as targets.

Small consulting team (5-10 people, mostly chat)

Monthly input tokens
20-100M
0200M500M
Monthly total spend
$300-$800
$0$2,500$5,000

Mid-size product team (15-30, API + chat + Code)

Monthly input tokens
100M-500M
0500M1B
Monthly total spend
$1.5K-$5K
$0$5K$10K

Single developer using Claude Code

Monthly spend
$100-$600
$0$500$1,000
Diagnose

Common Mistakes Opinionated

Six costly anti-patterns with typical dollar impact and fixes.

1. Sending full documents when sections suffice

3-10x input token waste

A 50-page contract (60K tokens) to answer one clause question. Only ~3K tokens needed. At Sonnet rates: $0.17 vs $0.009 per request.

→ Pre-filter with embeddings or keyword search before sending.

2. Opus for tasks Haiku handles identically

5x cost, 0x quality gain

Classification, extraction, formatting rarely benefit from Opus. Same output, 5x the price.

→ A/B test model tiers on actual tasks. Document which genuinely need Opus.

3. No max_tokens cap

2-5x output spend

4,000-token response when 200 suffice. On Sonnet: $0.06 vs $0.003.

→ Set max_tokens on every production call. Match to expected response length.

4. Orphaned API keys

$50-500/mo unattributed spend

A departed developer's test integration keeps running. Nobody's monitoring.

→ Monthly key audit. Disable unused keys. Name with owner + purpose.

5. Full conversation history every request

Quadratic cost growth per conversation

Turn 20 sends all 19 prior turns. A $0.50 conversation costs $5+.

→ Conversation windowing: last N turns, or summarize older context.

6. Unlimited spend limits, no alerts

Unbounded, discovered on invoice

A looping bug runs 100x requests. First sign: $10,000 bill.

→ Set dollar caps today. Set up daily Admin API alerts. Same-day implementations.
Optimize

Optimization Levers

Ordered by typical impact, highest first.

1. Prompt Caching Official

Pricing verified Apr 10, 2026

Standard: $3.00/MTok → Cached read: $0.30/MTok (90% savings)

Cache writes cost 1.25x. Default TTL: 5 minutes. Best for high-frequency stable prompts.

2. Model Routing Opinionated

Haiku — $1/$5

Classification, extraction, formatting, validation

Sonnet — $3/$15

Code gen, doc analysis, reasoning, most production

Opus — $5/$25

Complex reasoning, nuanced writing

Opus Fast — $30/$150

Latency-critical production only

3. Batch API Official

Sonnet: $3/$15 → Batch: $1.50/$7.50 (50% off)

Non-urgent workloads. Results within 24h.

4. Prompt Engineering Opinionated

  • Trim redundant system prompt instructions
  • Set max_tokens on every call
  • JSON output when machine-parsed
  • Truncate history to relevant turns
  • Send needed sections, not full documents
Optimize

Cost Calculator

Compare before/after optimization scenarios.

Multi-Model API Estimator

Total input/mo
M tokens
Total output/mo
M tokens

Model mix (must total 100%):

Haiku ($1/$5)
%
Sonnet ($3/$15)
%
Opus ($5/$25)
%

Before

Batch %
Cache %
$1,200

After

Batch %
Cache %
$540

Monthly Savings

$660 (55%)

Seat Cost Estimator Official

Using annual pricing. Last verified Apr 10, 2026.

Team Standard
× $20/seat (annual)
Team Premium
× $100/seat (annual)
Pro (individual)
× $17/user (annual)

Monthly Seat Cost (annual pricing)

$800

Optimize

FinOps Maturity Model Opinionated

Stages of AI cost management capability.

Stage 1 · Awareness

You know you're spending

Can see total spend but can't attribute to teams/products.

Stage 2 · Visibility

Every dollar attributed

Keys mapped, workspaces aligned, monthly reviews, limits set. Target: 2-4 weeks.

Stage 3 · Optimization

Actively reducing waste

Caching, routing, batch, seat right-sizing. Target: 2-3 months.

Stage 4 · Governance

Self-managing through process

Automated alerts, budget owners, routine reviews, playbook. Target: 4-6 months.

Getting Started

  • Day 1: Set spend limits (not Unlimited)
  • Week 1: Audit keys, disable unused, rename vague
  • Week 1: Export Usage CSVs as baseline
  • Week 2: Investigate spiky days, create named workspaces
  • Week 3: Review seat utilization, evaluate caching
  • Month 2: Document routing policy, set up alerts
  • Month 3: Write the FinOps playbook
Learn · Reference

Glossary

MTok
Million tokens. Standard API billing unit. 1 MTok ≈ 750K words.
Input Tokens
Everything sent: prompt, documents, history, tool defs.
Output Tokens
Everything generated: response, thinking, tool calls. 3-5x input price.
Context Window
Max tokens per request. 200K (Team), 500K (Enterprise), 1M (Opus 4.6).
ITPM / OTPM
Input/Output Tokens Per Minute. Rate limit metrics in Console.
TTL
Time To Live. Cache expiry. Default 5 min.
Prompt Caching
Stores repeated content. Reads cost ~10% of standard input.
Batch API
50% cheaper. Results within 24h. Non-urgent workloads.
Workspace
API key container in Console. Group/filter for cost attribution.
Admin API Key
sk-ant-admin... format. Org-level usage/cost data. Admin-only.
Spend Limit
Per-user dollar cap beyond subscription. In Console Settings.
Fast Mode
Opus only. 6x rates ($30/$150). Lower latency.
max_tokens
API param capping response length. Essential for output cost control.
Model Routing
Directing tasks to cheapest adequate model. Haiku→Sonnet→Opus.
Cache Hit Rate
% of input tokens read from cache. Higher = cheaper.
Extra Usage
Pay-per-token charges when subscription limits are exceeded.

Pricing data verified against claude.com/pricing and platform.claude.com/docs on April 10, 2026. Anthropic may change pricing at any time. Always verify before making purchasing decisions.