Felipe Figueiredo

Prompt Engineering · Case Study

CLAUDE.md Token Optimization

How I applied systematic diagnosis and iterative refinement to transform a verbose AI instruction file into high-density directives — eliminating noise without losing functionality.

Initial Score: 42 · Final Score: 96 · Tokens Saved: −59% · Iterations: 3

Score via /refine — rates clarity · completeness · efficiency · goal alignment

01

The Problem

CLAUDE.md is loaded on every conversation turn, so every unnecessary token is a cost that multiplies across hundreds of interactions. The original file mixed personal narrative, user documentation, and justifications, none of which instructs behavior.

CLAUDE.md — original version · ~220 tokens
# HQ — Quartel General
This is Felipe Figueiredo's central life management space...

## Context
- Email: user@example.com
- Main environment: Claude Code via terminal...
- Why here and not the browser: access to MCPs...

## Work Preferences
- Short, direct responses — no fluff
- Portuguese by default

## How to Use This Space
Each area can have its own subfolder as needs grow...
Descriptive narrative — the model doesn't need to know the project's history. Profile context already lives in persistent memory.
Justifications — explaining why a preference exists doesn't instruct model behavior.
User documentation — “How to Use This Space” is a README. It is not a directive.
Memory duplicates — email and profile context already exist in the memory system, making these lines pure overhead.
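
The four categories above can be approximated with a simple line filter. This is a minimal sketch, not the diagnosis method used in this case study: the patterns, the `flag_noise` helper, and the abbreviated sample text are all hypothetical, and a real pass still needs human judgment.

```python
import re

# Hypothetical patterns, one per noise category; tune them for your own file.
NOISE_PATTERNS = {
    "narrative": re.compile(r"^this is\b", re.IGNORECASE),
    "justification": re.compile(r"^-?\s*why\b", re.IGNORECASE),
    "user docs": re.compile(r"^#+\s*how to use\b", re.IGNORECASE),
    "memory duplicate": re.compile(r"^-?\s*email:", re.IGNORECASE),
}


def flag_noise(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, category, line) for each line matching a noise pattern."""
    flagged = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        for category, pattern in NOISE_PATTERNS.items():
            if pattern.search(line):
                flagged.append((number, category, line))
                break  # one category per line is enough for a first pass
    return flagged


# Illustrative sample shaped like the original file, not a verbatim copy.
sample = """# HQ
This is a central management space
- Email: user@example.com
- Why here and not the browser: access to MCPs
- Portuguese by default
## How to Use This Space
"""

for number, category, line in flag_noise(sample):
    print(f"{number}: [{category}] {line}")
```

Lines that match nothing, such as the language preference, are left alone as candidate directives.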
02

The Iterations

Iteration 01

65

Removal of all narrative, justifications, and duplicates. Preferences converted into direct, compact directives.

First cut: “## How to Use This Space” — describes the tool, doesn't instruct behavior.

− narrative · − justifications · − user docs

Iteration 02

84

Added workflow directives mapping the sequence of available skills and tools. Explicit instruction about the persistent memory system.

Key add: explicit skill sequence — model stops inferring the right tool per task.

+ workflows · + skills map

Iteration 03 — Final

96

Skill sequence clarified, zero memory redundancies, ultra-compact format with maximum instruction density.

Final cut: profile and stack moved to memory — CLAUDE.md left with pure directives only.

+ explicit sequence · + zero redundancies

Score Progression

42 (initial) → 65 (v1) → 84 (v2) → 96 (final)
03

Final Result

Before — ~220 tokens

# HQ — Quartel General
This is the central management space...
## Context
- Email: user@example.com
- Main environment: Claude Code...
- Why here and not the browser:...
## Work Preferences
- Short, direct responses
- Portuguese by default
## How to Use This Space
Each area can have its own subfolder...

After — ~90 tokens

# HQ
 
- Language: Portuguese always
- Tone: Direct. No summaries, no emojis.
 
## Workflows
- New project → brainstorming required
- Approved → writing-plans
- Implementation → executing-plans
- Review → requesting-code-review
 
## Technical context
- Profile and stack → in persistent memory.
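
The headline reduction follows directly from the two approximate counts. A minimal sketch of the arithmetic, assuming the ~220 and ~90 figures reported here (exact counts depend on the tokenizer); `percent_saved` is a hypothetical helper name:

```python
def percent_saved(before_tokens: int, after_tokens: int) -> float:
    """Percentage of tokens removed going from the before count to the after count."""
    if before_tokens <= 0:
        raise ValueError("before_tokens must be positive")
    return (before_tokens - after_tokens) / before_tokens * 100


# Approximate counts reported for the before and after versions.
saved = percent_saved(220, 90)
print(f"{saved:.0f}% saved")  # prints "59% saved"
```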
04

Key Principles

P-01

CLAUDE.md ≠ README

The file is read by the model on every turn, not by a human. Every sentence must instruct behavior, not describe context.

P-02

Memory stores, CLAUDE.md activates

Static context (profile, history, stack) goes into persistent memory. CLAUDE.md contains only active behavior directives.

P-03

Explicit workflows beat implicit ones

Mapping the skill and tool sequence explicitly makes behavior consistent across sessions, without repeating the same verbal instructions each time.
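
One way to keep such a mapping explicit and ordered is to hold it as data and render the section from it. This is only a sketch: the stage and skill names come from the final CLAUDE.md shown earlier, while `WORKFLOWS` and `render_workflows` are hypothetical names introduced for illustration.

```python
# Ordered stage -> skill pairs taken from the final CLAUDE.md.
WORKFLOWS = [
    ("New project", "brainstorming required"),
    ("Approved", "writing-plans"),
    ("Implementation", "executing-plans"),
    ("Review", "requesting-code-review"),
]


def render_workflows(workflows: list[tuple[str, str]]) -> str:
    """Render the ordered stage/skill pairs as a '## Workflows' section."""
    lines = ["## Workflows"]
    lines += [f"- {stage} → {skill}" for stage, skill in workflows]
    return "\n".join(lines)


print(render_workflows(WORKFLOWS))
```

Keeping the pairs in a list rather than a dict makes the sequence itself part of the data, which is the point of the principle.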

P-04

Tokens accumulate — always measure

220 tokens × 500 daily sessions × $0.003 per 1k tokens ≈ $0.33/day, or about $120/year. At Opus pricing, multiply by 5. Context optimization is an engineering discipline, not just an aesthetic choice.
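
The estimate above can be reproduced and compared against the optimized file. A minimal sketch, assuming the figures quoted here (500 sessions per day, $0.003 per 1k input tokens, one load per session) and ignoring prompt caching and output tokens; `yearly_cost` is a hypothetical helper:

```python
def yearly_cost(tokens_per_load: int, loads_per_day: int, usd_per_1k_tokens: float) -> float:
    """Yearly input cost in USD for a file loaded `loads_per_day` times per day."""
    daily = tokens_per_load * loads_per_day * usd_per_1k_tokens / 1000
    return daily * 365


before = yearly_cost(220, 500, 0.003)  # original file, ~220 tokens
after = yearly_cost(90, 500, 0.003)    # optimized file, ~90 tokens

print(f"before: ${before:.0f}/year")  # prints "before: $120/year"
print(f"after:  ${after:.0f}/year")   # prints "after:  $49/year"
```

If the file is re-sent on every turn rather than once per session, `loads_per_day` should count turns instead, and the absolute numbers grow accordingly.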
