In one line: Identity isn’t defined once and done — it expires.
Identity Expires: The Story of Em’s Refactor
Everything we’ve discussed so far has been about “how to define identity,” but one thing stood out during the building process: identity isn’t defined once and done — it expires.[3]
Em was the first Agent I built. At the time, I hadn’t yet accumulated all that methodology knowledge, hadn’t written the Context Engineering principles in the wiki, hadn’t established the rule that “CLAUDE.md should be under 60 lines.” So Em’s first version of CLAUDE.md was 99 lines — stuffed with identity definition, directory structure explanations, a 7-step ingest workflow, a 5-step query workflow, a 6-step lint workflow, plus conventions and a page format template. All 99 lines of it, everything loaded into CLAUDE.md, everything loaded at the start of every session.[5]
Later, Em’s own wiki accumulated a body of methodology knowledge: context-engineering, methodology-playbook-plan, skill-system, and other concepts. When I turned those methodologies back on Em’s own architecture, I found four places where it violated its own principles:
- CLAUDE.md at 99 lines was too long
- Workflow steps (HOW-level knowledge) were sitting in CLAUDE.md — they belong in Skills or Commands
- All 3 workflow sets were loaded every session, violating the JIT loading principle
- There were no .claude/skills/ or .claude/commands/ directories at all
Em was violating the very principles recorded in its own wiki. There’s something almost amusing about that — like a doctor ignoring their own medical advice. The knowledge management Agent’s architecture didn’t conform to the architectural principles recorded in the knowledge it managed.
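The statically checkable parts of that list can be turned into a small audit script. This is a minimal sketch, not the tooling actually used on Em: the function name `audit_agent`, the numbered-line heuristic for spotting workflow steps, and the exact wording of the findings are all assumptions made for illustration. Only the "under 60 lines" limit and the directory names come from the article.

```python
from pathlib import Path
import re

MAX_CLAUDE_MD_LINES = 60  # the "CLAUDE.md should be under 60 lines" rule

# Heuristic (an assumption): a line that starts with "1." or "2)" is a workflow step.
STEP_PATTERN = re.compile(r"^\s*\d+[.)]\s+\S")


def audit_agent(root):
    """Return a list of human-readable findings for one agent directory."""
    root = Path(root)
    findings = []
    lines = (root / "CLAUDE.md").read_text(encoding="utf-8").splitlines()

    # Check 1: CLAUDE.md must stay under the line limit.
    if len(lines) >= MAX_CLAUDE_MD_LINES:
        findings.append(
            f"CLAUDE.md has {len(lines)} lines (limit: under {MAX_CLAUDE_MD_LINES})"
        )

    # Check 2: HOW-level workflow steps should not live in CLAUDE.md.
    steps = [line for line in lines if STEP_PATTERN.match(line)]
    if steps:
        findings.append(f"CLAUDE.md contains {len(steps)} numbered workflow steps")

    # Check 3: the JIT targets must exist.
    for sub in ("skills", "commands"):
        if not (root / ".claude" / sub).is_dir():
            findings.append(f"missing .claude/{sub}/ directory")

    return findings
```

Calling `audit_agent("path/to/em")` on a layout like Em’s first version would report the length, the embedded steps, and both missing directories. Whether workflows are loaded eagerly is not something a file scan can see directly, so that violation shows up only indirectly, through the steps found inside CLAUDE.md.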
So I had Agent Dm do a refactor. First, Dm’s Architect read the three relevant concepts from Em’s wiki and reviewed the full problem analysis document, then designed a refactor plan. Since the goal wasn’t to rebuild Em’s knowledge architecture, Em’s wiki content, raw directory, and METHODOLOGY.md all stayed untouched. The only change was how identity gets loaded — switching from eager to JIT:
| What was in CLAUDE.md | Where it lives after refactor |
|---|---|
| Identity + Core Rules | Stays in CLAUDE.md (this is always-on) |
| Ingest workflow (7 steps) | .claude/commands/ingest.md |
| Query workflow (5 steps) | .claude/commands/query.md |
| Lint workflow (6 steps) | .claude/commands/lint.md |
| Page format + conventions | .claude/skills/wiki-schema/SKILL.md |
[Figure: CLAUDE.md loading before and after the refactor. Before: the ingest workflow (7 steps), query workflow (5 steps), lint workflow (6 steps), and page format + conventions all load at session start, with 99 lines occupied. After (JIT loading, 47 lines): session start loads CLAUDE.md with identity + core rules only; ingest.md, query.md, and lint.md load when /ingest, /query, or /lint is called.]
After the refactor, CLAUDE.md dropped from 99 lines to 47. Em’s behavior didn’t change at all — it’s the same Agent, doing the same work with the same workflows. The only difference is that it no longer loads all three workflows at the start of every session. Now it loads the ingest workflow when /ingest is called, the query workflow when /query is called.
The core of the identity didn’t change — what changed was how the identity is loaded. And that change cut the Context overhead at session start from 99 lines to 47, freeing up space for what actually needs it — like reading more wiki pages to make better cross-references.
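The difference between eager and JIT loading can be modeled in a few lines. The sketch below is a toy model of what sits in context, not a description of how Claude Code loads files internally. The class name `SessionContext` and its methods are hypothetical; only the file locations (CLAUDE.md and .claude/commands/) come from the refactor table.

```python
from pathlib import Path


class SessionContext:
    """Toy model of what is in context during one session."""

    def __init__(self, root):
        self.root = Path(root)
        self.loaded = {}         # relative path -> file text
        self._load("CLAUDE.md")  # always-on: identity + core rules

    def _load(self, rel_path):
        # Load each file at most once per session.
        if rel_path not in self.loaded:
            path = self.root / rel_path
            self.loaded[rel_path] = path.read_text(encoding="utf-8")

    def run_command(self, name):
        """Simulate `/name`: pull in that command's workflow file on demand."""
        self._load(f".claude/commands/{name}.md")

    def lines_in_context(self):
        return sum(len(text.splitlines()) for text in self.loaded.values())
```

In this model a fresh session holds only the lines of CLAUDE.md. Calling `run_command("ingest")` adds ingest.md and nothing else, so a session that never lints never pays for the lint workflow.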
Around the same time, C7 went through a similar capability upgrade. When C7 was first built in Phase 2 as C7-minimal, it had 49 lines in CLAUDE.md, 2 own skills, and 3 commands. By Phase 5, C7 had added 3 new own skills and 3 new commands — yet CLAUDE.md only grew from 48 lines to 53. Because the new capabilities all went into skills and commands, not into CLAUDE.md.
Em was about slimming down identity (99→47), C7 was about expanding capabilities (2 skills→5 skills) — but the logic behind both was the same: CLAUDE.md only holds “who I am” and “how I make decisions.” “What I can do” lives in commands. “What I know” lives in skills. Load when needed, don’t occupy Context when not.
This echoes what I wrote in the refinement cycle article: a good system isn’t designed right the first time — it’s continuously refined through use. Capabilities and identity work the same way. You can’t know in the design phase how many lines Em’s CLAUDE.md should have. You can only discover it in actual use — noticing what gets loaded but never used, and what’s needed but isn’t in Context — and then adjust.
The Real Cost of Identity Confusion
Back to that “are you Agent GM or Agent G?” scenario from the beginning. The consequences of that identity confusion weren’t an error or a work stoppage — quite the opposite. The Agent carried on smoothly. It just did the work using the wrong identity. That session ran for only 4 minutes, ending with a small commit. But that small commit was made using Agent G’s logic, not Agent GM’s.
[Figure: Which identity gets loaded? Correct: GM identity (conversation before dispatch) first asks "Why this change? Which workflow is it part of?" and reaches an informed decision. Confused: G identity (build fast) executes directly, 4 min to commit; the result looks correct but rests on the wrong decision framework, with no error messages, and is only discovered later.]
What’s the difference? Agent G is a bootstrap agent — its mindset is “build things fast.” GM’s mindset is “conversation before dispatch, make sure you understand before you act.” For the same change: if GM handled it, it would first ask “what’s the reason for this change? Which workflow currently in progress does it relate to?” Agent G would just start executing.
The cost of identity confusion isn’t “doing the wrong thing” — it’s “doing a right-looking thing with the wrong decision framework.”[9] The latter is more dangerous, because you won’t notice immediately. You’ll gradually realize somewhere down the line that a certain decision’s logic doesn’t add up, then trace it back to that 4-minute session and discover the root cause was a wrong identity load.
This is also why GM’s design includes one explicit rule: “GM’s CLAUDE.md must have zero references to Agent G.” Not just not inheriting Agent G’s build context — not even referencing it. Because GM needs to think from its own identity. If there’s any residual trace of Agent G in its CLAUDE.md, it might revert to Agent G’s behavior patterns in certain contexts. And that reversion happens silently — no error message will ever tell you “I’m currently using the wrong identity.”
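Because the failure is silent at runtime, a rule like this is easiest to enforce before the session starts. The check below is one possible sketch, not the mechanism GM’s setup is known to use: the function name `find_agent_g_references` is hypothetical, and it assumes the other agent is always written literally as "Agent G".

```python
import re

# "Agent G" as a whole name; the trailing \b keeps "Agent GM" from matching.
AGENT_G = re.compile(r"\bAgent G\b")


def find_agent_g_references(claude_md_text):
    """Return (line_number, line) pairs that mention Agent G."""
    return [
        (number, line)
        for number, line in enumerate(claude_md_text.splitlines(), start=1)
        if AGENT_G.search(line)
    ]
```

An empty result means the file passes the zero-reference rule. Any hit carries the line number, so a leftover mention can be removed before it has a chance to pull GM toward Agent G’s behavior.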
Identity Is the Foundation of Context
We’ve now reviewed the identity design for all five Agents. But what I want to emphasize isn’t the technical detail of “every Agent needs a CLAUDE.md” — that part is easy to do. What I want to say is something more fundamental: throughout the entire process of building the Agent Team, almost every problem I traced back to its source had something to do with identity.
- Inconsistent quality in the wiki pages Em produced? Traced back to Em’s CLAUDE.md not having a clear definition of the judgment criteria for cross-references.
- C7’s skill intake validation being too permissive? Traced back to C7’s skill-contract.md not covering certain edge cases.
- GM being too eager to dispatch? Traced back to the identity definition not emphasizing “conversation before dispatch” strongly enough.
Identity is the foundation of Context. If you make a vague definition at the identity layer — say, “you’re a knowledge manager” without explaining the classification system for that knowledge — that vagueness ripples outward into every downstream decision. The Agent won’t ask you “what exactly do you mean by knowledge management?” It will act based on its own understanding of “knowledge management,” and the gap between its understanding and yours is the time you’ll spend later on corrections and recalibration.
This is also why the previous article said “HANDOFF.md is the first output” — and why this article’s point is: CLAUDE.md is the most important output. A good HANDOFF.md lets a new session get up to speed quickly. But a good CLAUDE.md lets an Agent make better, more correct decisions across its entire lifecycle. The former is a one-time boost; the latter is continuous quality assurance.
Perhaps the thing an Agent Team needs to invest the most time in isn’t architecture design, isn’t communication protocols, isn’t memory systems — it’s answering the same question over and over: “Who exactly is this Agent?” And every time you feel like you’ve fully answered that question is exactly when you should revisit the answer.
References
[3] Preprints.org — Detecting and Repairing Role Drift in Multi-Agent Collaboration (RoleFix framework, defines role drift taxonomy) https://www.preprints.org/manuscript/202603.0348
[5] Hugging Face Forums — Solving Agent System Prompt Drift in Long Sessions (system prompt attention weight decreases as context grows) https://discuss.huggingface.co/t/solving-agent-system-prompt-drift-in-long-sessions-a-300-token-fix/173787
[9] GitHub Issue #32721 — Teammates Receive “Claude Agent SDK” Identity, Not “Claude Code” (real-world bug instance of identity misconfiguration) https://github.com/anthropics/claude-code/issues/32721