How to read `/usage --json`: 5 fields, one ratio that decides whether you migrate
If you have been using Claude Code on a Max or Pro subscription through April 2026, you have probably opened your terminal at the end of a workday and typed something like claude /cost to find out where your tokens went. The output is a wall of numbers. Some of those numbers tell you whether your current plan still fits, whether the cache TTL regression is now driving your bill, and whether the next migration decision is two weeks away or six months away.
Most of them are noise. Five of them matter, and one ratio derived from two of those five tells you almost everything you need.
This post is the cheat sheet for those five fields and the one ratio. No tool. No signup. Five minutes with a recent JSON output and you have your decision band.
Why /usage --json, and not the dashboard
The dashboard at claude.ai/settings/usage shows a percentage. The percentage is good for a quick sanity check; it is not enough to forecast a 30-day bill or to decide whether you should migrate platforms. The JSON output, available via claude /usage --json from inside any Claude Code project (on v2.1.118 or later), gives you raw token counts that the percentage hides.
Older Claude Code versions can use claude /cost --json instead. The fields are a subset of the newer output but the analysis below still works. The newer command name is the one that will stick going forward — Anthropic merged /cost into /usage in v2.1.118.
Pipe it to a file once a week:
claude /usage --json > ~/usage-$(date +%Y-%m-%d).json
A folder of these snapshots is worth more than any third-party dashboard for the analysis below.
The five fields that matter
Inside the JSON, under .totals, there are several dozen fields. The ones below are the only five that change a decision.
jq '.totals | {input_tokens, output_tokens, cache_read_tokens, cache_creation_tokens, cost_usd}' usage-snapshot.json
1. input_tokens
The total tokens Claude Code processed as input during the snapshot window — the prompts you typed, the files Claude read, the context it pulled in, the system prompt. This is the number you intuitively want to keep low and yet have very little control over once Claude starts retrieving files into context.
What it tells you on its own: not much. What it tells you in combination with output_tokens and the cache fields: a lot.
2. output_tokens
What Claude actually wrote back. Code edits, explanations, tool calls. Output tokens are typically 2–10× more expensive per token than input tokens at the API tier — the exact ratio depends on which model you are using.
If your output_tokens per session look small relative to input_tokens, you are in the read-heavy band that benefits most from prompt caching. If they look large, your sessions are write-heavy and the cache discussion below matters less.
3. cache_read_tokens
Tokens that Claude already had in its prompt cache and read back at roughly 10% of the cost of fresh input tokens. This is the field that should be growing as your session accumulates. A healthy long-session conversation has cache_read_tokens rising session-over-session and cache_creation_tokens staying small.
4. cache_creation_tokens
Tokens that Claude wrote into the cache — paid at roughly 1.25× the cost of fresh input tokens. A new conversation always pays cache_creation cost for the first turn, because there is nothing to read back. After that, well-cached sessions keep cache_creation_tokens low.
A session that keeps recreating its cache instead of reading from it is paying the cache-bust premium on every turn. This is the single most important field for the migration decision.
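To make the read-versus-create price gap concrete, here is a minimal sketch that converts cache traffic into "fresh input token equivalents". The weights 0.10 and 1.25 are the rough figures quoted above, not exact prices, and the function name `cache_input_equivalent` is hypothetical.

```shell
# Rough weights from the text above; real per-model pricing may differ.
CACHE_READ_WEIGHT=0.10      # cache reads cost roughly 10% of fresh input
CACHE_CREATION_WEIGHT=1.25  # cache writes cost roughly 1.25x fresh input

# cache_input_equivalent CREATION_TOKENS READ_TOKENS
# Prints the cache traffic expressed as fresh-input-token equivalents.
cache_input_equivalent() {
  awk -v c="$1" -v r="$2" \
      -v wc="$CACHE_CREATION_WEIGHT" -v wr="$CACHE_READ_WEIGHT" \
    'BEGIN { printf "%.0f\n", c * wc + r * wr }'
}
```

Under these assumed weights, `cache_input_equivalent 1000000 0` prints 1250000 while `cache_input_equivalent 0 1000000` prints 100000: the same million tokens cost about 12.5 times more when recreated than when read back.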
5. cost_usd
The dollar-equivalent cost of the work in the snapshot window, computed against API pricing. Important caveat for Max/Pro subscribers: as of v2.1.118, this field reports an API-equivalent dollar figure even on fixed-price subscription plans (Issue #52365). Your actual monthly bill on Max 5×, Max 20×, or Pro is the subscription sticker price, not this number.
Read cost_usd as "what this work would have cost on metered API." For a fixed-price subscriber, that is the reference number for the migration comparison — what you would pay if you switched to API billing or to another platform that bills per token.
The one ratio that decides
Once you have those five fields, the migration decision turns on one derived value:
cache_creation_ratio = cache_creation_tokens / (cache_creation_tokens + cache_read_tokens)
This is the share of your prompt-cache traffic that is being recreated rather than read. It is the most legible signal for whether the cache TTL regression Anthropic shipped in early March 2026 is materially affecting your bill.
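The formula can be sketched as a small shell helper. This is an illustration, not part of Claude Code: the function name is hypothetical, and it prints "n/a" when there is no cache traffic at all, a case the formula leaves undefined.

```shell
# cache_creation_ratio CREATION_TOKENS READ_TOKENS
# Prints creation / (creation + read), rounded to three decimals.
cache_creation_ratio() {
  awk -v c="$1" -v r="$2" 'BEGIN {
    total = c + r
    if (total <= 0) { print "n/a"; exit }   # no cache traffic: ratio undefined
    printf "%.3f\n", c / total
  }'
}
```

If jq is installed, the two arguments could be pulled from a snapshot with `jq -r '.totals.cache_creation_tokens'` and `jq -r '.totals.cache_read_tokens'`, assuming the `.totals` layout described earlier.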
The regression: in early March, the prompt-cache time-to-live silently shortened from sixty minutes (ephemeral_1h) to five minutes (ephemeral_5m). The community-scale measurement landed in GitHub Issue #46829 on April 12: 119,866 API calls across two machines over 92 days, showing zero ephemeral_5m tokens from February 1 through March 5, a transition on March 6–7, and ephemeral_5m dominant from March 8 onward. On April 12 the issue was closed as "not planned", so the regression is now part of the platform.
That status change is what makes this ratio so important. There is no upstream fix coming. Whatever your cache_creation_ratio is right now is, approximately, what it will continue to be unless you fortify locally.
The three bands
| Band | Ratio | Reading |
| --- | --- | --- |
| HEALTHY | < 0.15 | Cache is behaving close to the pre-regression baseline. No fortification action required this month. |
| WATCH | 0.15 – 0.40 | Some cache-bust premium is in your bill, but you are not yet in the migration-trigger band. Re-check weekly. If the ratio drifts upward two weeks running, treat it as TRIGGER. |
| TRIGGER | > 0.40 | The regression is materially driving your bill. Local fortification matters more than waiting for an upstream fix that is not on the roadmap. |
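The table above maps directly onto a three-way comparison. A minimal sketch, assuming the boundary values 0.15 and 0.40 themselves belong to WATCH (the table lists WATCH as 0.15 – 0.40 and TRIGGER as strictly greater than 0.40); the function name `ratio_band` is hypothetical.

```shell
# ratio_band RATIO
# Prints HEALTHY, WATCH, or TRIGGER using the thresholds from the table.
ratio_band() {
  awk -v x="$1" 'BEGIN {
    x += 0                                  # force numeric comparison
    if (x < 0.15)       print "HEALTHY"
    else if (x <= 0.40) print "WATCH"
    else                print "TRIGGER"
  }'
}
```

For example, `ratio_band 0.276` prints WATCH and `ratio_band 0.42` prints TRIGGER.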
Jarred Sumner (creator of the Bun runtime, now an engineer at Anthropic) responded in the issue thread that the 5-minute TTL is in fact cheaper for the meaningful share of Claude Code requests that are one-shot calls, where cached context is used once and not revisited. That argument is technically correct on average, and it is the reason the issue was closed as not planned. It is also why the individual subscriber whose workload is not one-shot (long iterative sessions, repeated file reads, multi-turn refactors) sees their cache_creation_ratio climb past 0.40 and feels the regression in their quota.
Whichever band your ratio lands in, the band tells you what kind of decision you are making, not just what kind of bill you have:
- HEALTHY → no decision pending; keep auditing monthly
- WATCH → consider lightweight local fortification; the migration evaluation is only marginally relevant
- TRIGGER → the migration evaluation is now, and the inputs to that decision are your daily burn (cost_usd ÷ days) and your platform preferences
Worked example
Here is a 14-day snapshot from a real Max 20× subscriber's /usage --json output:
input_tokens: 5,200,000
output_tokens: 1,100,000
cache_read_tokens: 8,900,000
cache_creation_tokens: 3,400,000
cost_usd: 868.00
Derived:
- Daily token rate (input + output + cache creation; cache reads excluded): (5.2 + 1.1 + 3.4)M ÷ 14 = ~693k tokens/day
- cache_creation_ratio: 3,400,000 ÷ (3,400,000 + 8,900,000) = 0.276
- API-equivalent daily spend: $868 ÷ 14 = $62/day
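The daily figures in the list above can be reproduced with two small helpers. This is a sketch with hypothetical function names; it follows the same convention as the worked example, counting input, output, and cache-creation tokens and leaving cache reads out of the daily token rate.

```shell
# daily_tokens INPUT OUTPUT CACHE_CREATION DAYS
# Prints the average tokens per day, rounded to a whole token.
daily_tokens() {
  awk -v i="$1" -v o="$2" -v c="$3" -v d="$4" 'BEGIN {
    if (d <= 0) { print "n/a"; exit }       # guard against a zero-day window
    printf "%.0f\n", (i + o + c) / d
  }'
}

# daily_spend COST_USD DAYS
# Prints the API-equivalent spend per day in dollars.
daily_spend() {
  awk -v cost="$1" -v d="$2" 'BEGIN {
    if (d <= 0) { print "n/a"; exit }
    printf "%.2f\n", cost / d
  }'
}
```

With the snapshot values, `daily_tokens 5200000 1100000 3400000 14` prints 692857 and `daily_spend 868.00 14` prints 62.00.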
Reading: ratio 0.276 is in the WATCH band, above the 0.15 ceiling of the HEALTHY band and comfortably below the 0.40 trigger threshold. The subscriber is paying some cache-bust premium but not the full regression tax.
API-equivalent daily spend $62/day means the work being done would cost roughly $1,860/month if billed through the API at posted rates. The actual bill is the Max 20× sticker of $200/month — which means this subscriber is getting good value out of the subscription even with the cache-creation cost premium baked in. The 0.276 ratio is not enough on its own to justify a migration.
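The monthly projection is simple arithmetic, sketched below under the assumption of a 30-day month (which is what yields the $1,860 figure). The second helper, expressing the API-equivalent figure as a multiple of the subscription sticker, is an illustrative addition, and both function names are hypothetical.

```shell
# monthly_api_equivalent DAILY_SPEND [DAYS_IN_MONTH]
# Prints the projected API-equivalent monthly cost; defaults to a 30-day month.
monthly_api_equivalent() {
  awk -v s="$1" -v n="${2:-30}" 'BEGIN { printf "%.0f\n", s * n }'
}

# sticker_multiple MONTHLY_API_EQUIVALENT STICKER_PRICE
# Prints how many times the sticker price the API-equivalent cost is.
sticker_multiple() {
  awk -v m="$1" -v p="$2" 'BEGIN {
    if (p <= 0) { print "n/a"; exit }
    printf "%.1f\n", m / p
  }'
}
```

For this subscriber, `monthly_api_equivalent 62` prints 1860 and `sticker_multiple 1860 200` prints 9.3, which is one way to put a number on "good value".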
If the same subscriber's ratio had read 0.42 instead of 0.276 (same total tokens, but a different split between cache creation and cache reads), the reading changes. The plan's $200 sticker still applies, but the workload is now just inside the TRIGGER band, where alternative paths (Path C hybrid DIY at ~$120–180/mo for the same workload) start to dominate on cost alone. The decision then depends on whether the subscriber has the operator discipline for Path C, but the trigger to consider it is the ratio, not the dollar figure.
What to do next
Three concrete moves, ordered by urgency:
1. Run the audit yourself. Pipe a recent /usage --json to a file and compute the ratio. If it is HEALTHY, set a calendar reminder for two weeks from now and move on. If it is WATCH, do the same but also start a weekly habit. If it is TRIGGER, the next two moves matter.
2. Install the local fortification. The hook stack that absorbs this class of regression locally — token-budget-guard, claude-update-budget-guard, cch-sentinel-precommit-guard, quota-reset-cycle-monitor, model-version-alert, tokenizer-ratio-alert — ships in the open-source cc-safe-setup repository. Each lives in examples/ and installs individually via npx cc-safe-setup --install-example <name>. The base install (npx cc-safe-setup) covers the core data-loss guards; the six above are the regression-specific overlay you add on top. (Note: npx cc-safe-setup --opus47 installs a separate four-hook set targeting the Opus 4.7 safety-classifier and credential issues — different from the six listed here.) Total time about ten minutes including the CLAUDE.md anti-reinvention prelude that closes the related Issue #52893 reinvention pattern.
3. Pin your model. Set ANTHROPIC_MODEL=claude-opus-4-6 in your shell profile if Opus 4.7 is producing the anxiety/regression pattern documented in Reddit threads from April 16 onward. The model-version-alert.sh hook will warn you if a session handshake reports a model that does not match your declared pin. This is two lines of configuration that absorb most of the Opus 4.7 tokenizer-inflation cost on workloads that do not specifically need 4.7's behavior.
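Moves 2 and 3 can be sketched in a few lines of shell. The helper below only prints the install commands for the six hooks named in move 2 so you can review them before running anything; its name is hypothetical, and the command form is the one quoted above. The final line is the model pin from move 3 as it might appear in a shell profile such as ~/.bashrc or ~/.zshrc.

```shell
# Move 2 (dry run): print, but do not execute, one install command per hook.
print_install_commands() {
  for hook in token-budget-guard claude-update-budget-guard \
              cch-sentinel-precommit-guard quota-reset-cycle-monitor \
              model-version-alert tokenizer-ratio-alert; do
    echo "npx cc-safe-setup --install-example $hook"
  done
}

# Move 3: pin the model for every new shell session.
export ANTHROPIC_MODEL=claude-opus-4-6
```

Once the printed commands look right, piping them to a shell (`print_install_commands | sh`) would run them in order.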
If those three are already in place, the move that compounds them is measurement. Keep a 14-day rolling log of your cache_creation_ratio and your daily spend. The decision "should I migrate?" is more legible from your own log than from any third-party guide, including this one.
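A log like this also makes the WATCH rule from the bands table (a ratio drifting upward two weeks running counts as TRIGGER) mechanical to check. The sketch below assumes "two weeks running" means the ratio rose in each of the last two week-over-week comparisons, so it takes three weekly readings, oldest first. That reading of the rule and the function name are assumptions.

```shell
# rising_two_weeks OLDEST MIDDLE NEWEST
# Prints "yes" if the ratio increased in both week-over-week steps, else "no".
rising_two_weeks() {
  awk -v a="$1" -v b="$2" -v c="$3" 'BEGIN {
    a += 0; b += 0; c += 0                  # force numeric comparison
    result = (b > a && c > b) ? "yes" : "no"
    print result
  }'
}
```

For instance, `rising_two_weeks 0.20 0.25 0.31` prints yes, while `rising_two_weeks 0.20 0.25 0.24` prints no.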
What this post is not
It is not a Cursor pitch. It is not an Anthropic apology. It is the operating manual for the five fields and one ratio that the regression made matter. The fields and the ratio are the same on every Claude Code subscription tier; the bands are the same; the local fortification is the same.
If your ratio is HEALTHY this month, you do not need any of the rest. If it is WATCH, set the reminder. If it is TRIGGER, read your numbers, install the hooks, and check back in two weeks. The decision is yours. The numbers are already in your terminal.
Posted 2026-04-26. Source horizon 2026-04-25. The cache TTL regression status (closed not planned) and the v2.1.118 /usage --json field set are the binding evidence.
If you found the audit useful and want a 30-day cost forecasting worksheet that takes the same five fields into Section A and produces 30-day cost projections for each migration path in Section C, there is a stand-alone Markdown worksheet here — free, no signup, no product pitch, same author.
If the worksheet alone does not resolve the decision and you want the full framework around it — the five migration triggers, the three migration paths (stay-and-fortify, switch platforms, stack of tools), and the ninety-page treatment of when each one applies — the Migration Playbook (Edition 1, $19, live since 2026-04-25) is the longer companion the worksheet was extracted from.