How to Cut Your Claude Code Bill in Half
> This covers the 6 token sinks and how to plug them. Claude Code for Beginners goes deeper with a full 48-hour build system, including the screenshot loop, security audit, and daily workflow checklist.

Summary:
- Six specific token sinks drain your Claude Code budget. This teaches you to plug each one.
- Includes the 3-line prompt formula that cuts token waste by 40%.
- Real before/after token counts: 147,000 vs 62,000 for the same feature.
- Copy-paste effort level function and restart-vs-continue decision tree.
My first month with Claude Code cost me $1,400. Not a typo. I ran max effort on everything, let conversations hit 100+ messages, and asked Claude to “read my whole project” three times a day at 80,000 tokens each read. The code I shipped was about 2,000 lines. The context Claude processed to get there? Over 218,000 lines of waste.
After learning what actually burns tokens, my cost per line dropped from $0.69 to $0.08. Those are my numbers on my project. Your mileage varies with codebase size and task complexity, but the fixes are universal. Here’s every one.
How do tokens actually work in Claude Code?
A token is roughly 4 characters of English text. A line of code runs 10-20 tokens. A full page of text hits 250-300.
Every interaction has two costs: input tokens (everything Claude reads: your prompt, files, command output, conversation history) and output tokens (everything Claude generates). Input tokens are cheaper per unit but there are way more of them.
The biggest cost isn’t your prompts. It’s file reading. When Claude reads a 500-line file, that’s 5,000-7,000 input tokens. Read 20 files to understand your project and you’ve burned 100,000+ tokens before a single line of code gets written.
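That back-of-envelope math is worth automating. A minimal sketch using the rough 4-characters-per-token rule above (the ratio and the 50-characters-per-line average are approximations, not tokenizer-exact values):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters of English per token."""
    return max(1, len(text) // 4)

def file_read_cost(line_count: int, avg_chars_per_line: int = 50) -> int:
    """Approximate input tokens Claude spends reading a file once."""
    return estimate_tokens("x" * (line_count * avg_chars_per_line))

# A 500-line file lands in the 5,000-7,000 token range cited above:
print(file_read_cost(500))       # 6250
# Reading 20 such files burns six figures of input before any code is written:
print(20 * file_read_cost(500))  # 125000
```

Run it against your own average file length before handing Claude a "read everything" request.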
From the Claude Code docs:
| Metric | Value |
|---|---|
| Average daily cost (API users) | $6/developer/day |
| 90th percentile daily cost | Under $12/day |
| Team monthly average (Sonnet) | $100-200/developer |
| Background process overhead | Under $0.04/session |
| Agent teams multiplier | ~7x standard sessions |
What are the 6 token sinks and how do you fix them?
Sink 1: Conversations that run too long
Every message includes ALL previous messages in the context window. Message 1 costs X tokens; message 50 costs its own tokens plus the tokens of all 49 messages before it. After 20-30 messages, you’re spending more on history than on work.
Fix: Start a new conversation every 15-20 messages. Open with a one-sentence summary:
```
I'm working on the Task Tracker app. Just finished the status toggle feature.
Next: add filtering by status. Files: src/app/page.tsx and
src/components/TaskList.tsx.
```
Thirty seconds of typing saves hundreds of thousands of tokens.
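The compounding cost is easy to see with arithmetic. A sketch, assuming a flat 500 tokens per message (an illustrative number, not a measured one) and that every message resends all prior history:

```python
def conversation_cost(messages: int, tokens_per_message: int = 500) -> int:
    """Total input tokens across a conversation where each message
    resends itself plus all history before it."""
    # Message k carries k * tokens_per_message of context.
    return sum(k * tokens_per_message for k in range(1, messages + 1))

print(conversation_cost(15))  # 60000  -- a short, restarted session
print(conversation_cost(50))  # 637500 -- history dwarfs the actual work
```

The gap between those two totals is why restarting at message 15-20 beats pushing to 50.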
Sink 2: Wrong effort level
The difference between low and max effort can be 3-5x for the same task. A CSS class change on max effort costs the same as building an entire component on medium.
Fix: The plain-English version: if Claude could do it with copy-paste, use low. If Claude needs to think, use medium. If Claude needs to plan across multiple files, use high. In code:
```python
def pick_effort(task_description: str) -> str:
    """
    Effort level decision tree for Claude Code.
    Copy this logic. Check before every prompt.
    Returns: 'low', 'medium', or 'high'
    """
    # Mechanical: 1 file, simple change
    if task_description in [
        "rename variable", "add a line", "update config",
        "fix typo", "add import", "change CSS class",
    ]:
        return "low"  # /effort low
    # Architectural: 5+ files, system-wide impact
    if any(keyword in task_description for keyword in [
        "design system", "refactor", "debug across",
        "major restructure", "new data model",
    ]):
        return "high"  # /effort high
    # Everything else: features, bugs, tests
    return "medium"  # default, leave it

# Examples:
print(pick_effort("rename variable"))       # low
print(pick_effort("build search feature"))  # medium
print(pick_effort("refactor auth module"))  # high
```
One second before each prompt. Massive savings.
Sink 3: Claude reading files it doesn’t need
Say “add a button to the task list” and Claude reads your entire components directory, utils, and database models. It only needed one file.
Fix: Name the files. “Add a delete button to the task list in src/components/TaskList.tsx. The delete action should call the deleteTask server action in src/app/actions.ts.” Two files instead of twelve.
Add a file map to your CLAUDE.md:
```
## File Structure
- src/app/page.tsx: main page, renders TaskList
- src/components/TaskList.tsx: task list UI component
- src/components/TaskForm.tsx: new task form
- src/app/actions.ts: server actions for CRUD
- prisma/schema.prisma: database schema
```
Claude reads fewer files because it knows where things are. Pays for itself within two conversations.
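You don’t have to hand-write the skeleton of that file map. A sketch using Python’s standard library (the directory names are assumptions; adjust to your project, then fill in the descriptions yourself):

```python
from pathlib import Path

def draft_file_map(root: str = ".", dirs: tuple = ("src", "prisma")) -> str:
    """Emit a CLAUDE.md file-map skeleton for the given directories.
    The one-line descriptions still have to come from you."""
    lines = ["## File Structure"]
    for d in dirs:
        base = Path(root) / d
        if not base.exists():
            continue
        for path in sorted(base.rglob("*")):
            if path.is_file():
                lines.append(f"- {path.as_posix()}: TODO describe")
    return "\n".join(lines)

print(draft_file_map())
```

Paste the output into CLAUDE.md and replace each TODO with a short description; the descriptions are what keeps Claude from opening files to find out what they do.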
Sink 4: Not using code intelligence plugins
Without code intelligence, Claude finds functions by reading file after file. With it, Claude asks “where is deleteTask defined?” and gets an instant answer.
One Reddit post titled “Enable LSP in Claude Code: code navigation goes from 30-60s to 50ms with exact results” got 862 upvotes on r/ClaudeCode. Not exaggerated. Code intelligence reduces file reads by 60-80% on medium-sized projects.
Fix: For TypeScript projects, install the TypeScript compiler (npm install -D typescript) and ensure your tsconfig.json exists. Claude Code detects it automatically. For Python, install pyright. For Go, gopls. For Rust, rust-analyzer.
Sink 5: Verbose prompts
“I was thinking that maybe we could explore the possibility of adding some kind of functionality that would allow users to potentially filter their tasks.” That’s 25 words containing 3 words of instruction: “add task filtering.”
Fix: The 3-line prompt formula:
Line 1 (action): Add a dropdown filter above the task list.
Line 2 (scope): Modify src/components/TaskList.tsx.
Line 3 (constraint): Options: All, Todo, In Progress, Done.
Under 30 words. Claude has everything it needs. Compare that to the rambling version above and you cut input tokens roughly in half.
Sink 6: Asking Claude to explain instead of do
“Can you explain how the auth system works and then suggest improvements?” makes Claude write a multi-paragraph explanation AND a suggestion. You’re paying for both.
Fix: If you want changes, ask for changes. “Improve the auth system: add rate limiting on login attempts, add password complexity requirements, switch from JWT to httpOnly cookies.” Claude makes the changes. Review the code. Ask for explanations only when you genuinely need them.
What broke when I ignored this?
Week 1: everything on max effort, 80,000 tokens per orientation read, three times a day. That’s 240,000 tokens daily just on “read my project.” Cost: ~$400.
Week 2: five-paragraph prompts for every feature. Claude would read my essay, do something different, then I’d write another essay correcting it. Features that should take one conversation took four. Cost: ~$500.
Week 3: debugging in circles. Same conversation, 100+ messages deep. Claude kept suggesting fixes it had already tried. Two full days going nowhere. Cost: ~$450.
Total: $1,387.42 for 2,000 lines of code. After applying all six fixes, my next project of similar scope cost roughly $170. Your savings depend on your project, but the pattern holds: most of what you pay for is context overhead, not useful work.
How do you measure the actual difference?
Run this test on any project. Same task, two approaches:
```
UNOPTIMIZED (old habits):          OPTIMIZED (all 6 fixes):
  Input tokens:  147,000             Input tokens:  62,000
  Output tokens: 8,200               Output tokens: 6,100
  Files read:    14                  Files read:    3
  Time:          45 seconds          Time:          18 seconds
```
Same feature. Same result. 58% fewer input tokens. The optimized version is also faster because Claude reads fewer files.
When should you restart vs. continue a conversation?
| Situation | Action |
|---|---|
| Finished a feature, starting new one | Restart |
| Claude’s responses getting confused | Restart |
| Context bar past 60% | Restart |
| Switching to different area of codebase | Restart |
| Mid-debug on a specific issue | Continue |
| Current task continues the previous one | Continue |
| Iterating on design with screenshot loop | Continue |
The cost of restarting: 5,000-10,000 tokens to re-explain context. The cost of NOT restarting: 100,000+ tokens of dead weight per message. The math always favors restarting.
Before you restart, ask Claude: “Summarize the current state of the project in 3-4 sentences.” Copy that summary into the new conversation. Perfect context in 100 tokens instead of 5,000.
What should you actually do?
- If you’re on the Pro plan ($20/mo): apply these fixes and you stop hitting the rate limit every afternoon. Expect 3 extra productive hours per day.
- If you’re on the Max plan ($100-200/mo): match effort levels to tasks and restart conversations after each feature. You get a full workday of building without hitting limits.
- If you’re on the API plan: the conversation length fix alone (Sink 1) saves $500-800/month for heavy users. Combined with effort levels and specific file references, expect to cut total spend by 40-50%.
Bottom line
- Token waste is the default Claude Code experience. Six fixes eliminate it: fresh conversations, right effort levels, specific file references, code intelligence, short prompts, action-oriented requests.
- The savings compound every day. Over a year, that’s hundreds or thousands of dollars for API users, and hours of recovered time for subscribers.
- The common thread: you don’t get more by spending more. You get more by spending smarter. A $0.08/line workflow beats a $0.69/line workflow on every metric.
Frequently Asked Questions
How much does Claude Code cost per month?
Anthropic's docs report an average of $6/developer/day for API users, with 90% staying under $12/day. Pro plan is $20/month with rate limits. Max plans run $100-$200/month.
Why does Claude Code keep re-reading my files?
Context eviction. When a conversation gets long, older content drops out. Claude re-reads files because it literally forgot them. Fresh conversations and specific file references fix this.
What effort level should I use in Claude Code?
Low for mechanical edits (rename, config change). Medium for features and bugs. High/max only for architecture decisions spanning 5+ files. Wrong effort levels waste 3-5x tokens on simple tasks.
More from this Book
5 Claude Code Mistakes Every Beginner Makes
Five mistakes that waste hours in Claude Code and the fix for each. Includes before/after prompts, a CLAUDE.md template, and the 3-line prompt formula.
from: Claude Code for Beginners
How to Use Claude Code Dispatch and Remote Control
Reference guide to Claude Code's power features: dispatch, remote control, channels, and LSP. Includes the adoption timeline and dispatch planning method.
from: Claude Code for Beginners
How to Fix Ugly AI-Generated UIs with Claude Code
The screenshot verification loop that fixes ugly AI-generated UIs in Claude Code. Four steps, three rounds, eight minutes to a UI that doesn't embarrass you.
from: Claude Code for Beginners
The Vibe Coding Security Checklist (20 Points)
A 20-point security audit checklist for AI-generated code. Covers auth, input validation, data access, and infrastructure with copy-paste grep commands.
from: Claude Code for Beginners