Agent Team in Practice (3): Identity — Every Agent Needs to Know Who It Is

Contents

Identity Is Not a Single System Prompt
Five Agents, Five Identity Designs
Em: The Wiki Structure Defines the Thinking Style
C7: The One Canonical Location Rule
Dm: Three Roles in One Identity
G7: The Sacred Boundary
GM: The Last to Take Shape, Because Its Identity Is the Most Complex

Five Agents standing together, question marks on their name badges — Are you Agent GM or Agent G?

In one line: An Agent’s identity isn’t a single prompt — it’s an entire directory structure.

“Are you Agent GM or Agent G?”

That’s a line I actually said out loud to an Agent one evening while running several Claude Code sessions at the same time (voice input — that was the entire first prompt of the session). Not some carefully crafted instruction. Just that: in the middle of all the shuffling and switching that comes with building Agents, I genuinely wasn’t sure which CLAUDE.md this particular Agent had loaded. Who did it think it was right now? What tools did it have access to? If even the human can’t tell which Agent is which, can the Agent itself?

The answer is: if you haven’t defined it clearly, it won’t know either. And it won’t tell you it’s confused. It will just use whatever identity it happened to load, apply that identity to work you expected a different identity to handle,[2] and produce something that looks almost right but feels slightly off. You’ll spend time tracking down the problem before realizing it started at the very beginning: not in the Agent’s reasoning or execution, but in the identity it loaded.

The previous article covered why we build an Agent Team. This one covers the first problem you run into after building it: identity.

Identity Is Not a Single System Prompt

In Subagent mode, identity isn’t a problem — Subagents don’t have identity. A Subagent is just an extension of the parent Agent, does its job, then disappears. It doesn’t need to “know who it is.” But inside an Agent Team, each Agent is an independent, persistent entity with its own area of expertise, and identity becomes the very first question that has to be answered.

The most intuitive approach is to add a line to the System Prompt: “You are Agent Em, responsible for knowledge management.” But that’s nowhere near enough — because an Agent’s behavior isn’t determined by a single line. It’s determined by the entire Context: what files live in its working directory, what Skills it can read, what Commands it’s been configured with, and how its CLAUDE.md describes its thinking framework.[1]

Back in the Context Engineering article, I argued that Context is what determines Agent behavior. That argument gets much more concrete in an Agent Team: identity is the first layer of Context. If this layer is off, it doesn’t matter how carefully you design the dispatch, memory, and coordination layers downstream — the Agent will interpret every instruction it receives through the wrong identity, and everything downstream will be off too.

So in practice, the identity I designed for each Agent isn’t a sentence — it’s an entire directory structure:

Component	Example (Em)	What It Tells the Agent
CLAUDE.md	`agent-Em/CLAUDE.md`	Who you are, how you think
Methodology doc	`agent-Em/METHODOLOGY.md`	The principles you use to make judgments
Commands directory	`agent-Em/.claude/commands/`	What you do (ingest, query, lint)
Skills directory	`agent-Em/.claude/skills/`	What you know (wiki format specs)
Working directories	`agent-Em/wiki/`, `agent-Em/raw/`	Where your stuff lives, where others’ input arrives

Em’s directory structure is its identity: CLAUDE.md tells it “who you are and how you think,”[4] the methodology doc tells it “the principles you use to make judgments,” the commands directory tells it “what you do” (ingest, query, lint), the skills directory tells it “what you know” (wiki format specs), and the wiki and raw directories tell it “where your stuff lives and where input from others arrives.”

All five dimensions together make a complete identity.[10] Missing any one of them, and the Agent starts guessing — and the cost of guessing wrong is you cleaning up afterward.
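To make "missing any one of them" checkable, here is a minimal sketch of an identity audit. It assumes the directory layout from the Em example in the table; the function name `missing_identity_parts` and the idea of running such a check before a session are my own illustration, not something Claude Code provides.

```python
from pathlib import Path

# Hypothetical checklist mirroring the five components in the table.
# The relative paths follow the Em example; other Agents may differ.
IDENTITY_COMPONENTS = {
    "CLAUDE.md": "who you are, how you think",
    "METHODOLOGY.md": "the principles you use to make judgments",
    ".claude/commands": "what you do",
    ".claude/skills": "what you know",
    "wiki": "where your stuff lives",
    "raw": "where others' input arrives",
}


def missing_identity_parts(agent_dir: Path) -> list[str]:
    """Return the identity components that do not exist under agent_dir."""
    return [rel for rel in IDENTITY_COMPONENTS if not (agent_dir / rel).exists()]
```

Calling `missing_identity_parts(Path("agent-Em"))` returns an empty list only when every component is in place. Anything it does return is a layer the Agent would otherwise have to guess at.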

graph TD
    subgraph identity["Agent Identity"]
        A["CLAUDE.md<br/>Who you are, how you think"]
        B["METHODOLOGY.md<br/>The principles you judge by"]
        C["Commands<br/>What you do"]
        D["Skills<br/>What you know"]
        E["Working Dirs<br/>Where your stuff lives"]
    end
    A -->|defines| B
    B -->|guides| C
    B -->|informs| D
    C ---|operates on| E
    D ---|references| E
    style A fill:#6b8f71,color:#fff
    style B fill:#58a6ff,color:#fff
    style C fill:#c67a50,color:#fff
    style D fill:#c67a50,color:#fff
    style E fill:#7c6db5,color:#fff

An Agent's identity is a combination of five layers, not a single System Prompt — CLAUDE.md defines methodology, methodology guides Commands and Skills, and both operate on the working directories

One-line identity vs. full directory structure identity — Left: a single System Prompt. Right: a complete identity directory — miss any layer, and the Agent starts guessing.

Five Agents, Five Identity Designs

The Agent Team has five Layer 2 Specialist Agents, and each one’s identity design is different — because their responsibilities are completely different, and the thinking styles they need are completely different too. Rather than listing all five in a flat rundown (that would read too much like technical documentation), I’ll pick one key design decision moment from each.

Em: The Wiki Structure Defines the Thinking Style

Em is the Knowledge Agent, responsible for managing the entire Agent Team’s knowledge base. Its core responsibilities are ingest (transforming raw material into structured wiki pages), query (finding relevant knowledge in the wiki to answer questions), and lint (checking the wiki for consistency and completeness).

When designing Em, the most critical decision wasn’t what to write in its CLAUDE.md — it was how to design the wiki’s directory structure, because Em’s thinking style is determined by its wiki structure. The wiki has concepts (concept pages), entities (pages for people, places, and things), and sources (source records). Every time Em ingests something, it’s not just dumping data in — it’s making judgments: “Which concept does this piece of knowledge belong to?” “Which existing concepts does this concept cross-reference?” “Do any new concepts need to be created?” That entire judgment framework comes from the wiki’s structural design.

In other words, Em’s identity isn’t the five words “I am the knowledge manager”; it’s the wiki’s entire classification system. That classification system determines how it understands new information, how it connects old knowledge, and how it decides what’s worth recording and what can be discarded.
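The ingest questions above can be sketched as a small data structure. This is only an illustration under my own assumptions: the class names `ConceptPage` and `Wiki` and the `ingest` signature are hypothetical, and the actual judgment (which concept a piece of knowledge belongs to) is still made by the Agent. The code only records the answers.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptPage:
    name: str
    cross_refs: set[str] = field(default_factory=set)   # names of related concepts
    sources: list[str] = field(default_factory=list)    # ids of source records


@dataclass
class Wiki:
    concepts: dict[str, ConceptPage] = field(default_factory=dict)

    def ingest(self, source_id: str, concept: str, related: list[str]) -> list[str]:
        """File one piece of knowledge; return the concepts that had to be created."""
        created = []
        for name in [concept, *related]:
            if name not in self.concepts:
                self.concepts[name] = ConceptPage(name)
                created.append(name)
        page = self.concepts[concept]
        page.sources.append(source_id)
        for name in related:
            if name != concept:
                # Cross-references are recorded in both directions.
                page.cross_refs.add(name)
                self.concepts[name].cross_refs.add(concept)
        return created
```

Each call answers the three ingest questions in order: the `concept` argument is "which concept does this belong to," `related` is "which existing concepts does it cross-reference," and the return value is "which new concepts had to be created."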

C7: The One Canonical Location Rule

C7 is the Asset Manager, responsible for managing executable assets like Skills, Tools, Plugins, MCP Servers, and Commands. The design’s core is a rule that sounds simple but has far-reaching consequences: every asset has exactly one canonical location. Anywhere else that needs it gets it provisioned — no copying.

This rule determines C7’s entire way of working. When someone asks “is there an X skill,” answering always requires checking just one place: the index under the catalog directory, catalog/INDEX.md. When Dm builds a new Agent and needs a skill, the process isn’t “copy one from somewhere”; it’s “provision it from C7’s catalog.” When a skill needs updating, you update only the canonical location, and every provisioned instance knows where the source of truth lives.
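A minimal sketch of the one-canonical-location rule, assuming an in-memory index. The `Catalog` and `Provision` classes and the example path are hypothetical; C7's real catalog lives in catalog/INDEX.md and its format is not shown in this article.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Provision:
    """A record that a consumer uses an asset; it points at the source, it is not a copy."""
    asset: str
    consumer: str
    canonical_path: str


@dataclass
class Catalog:
    index: dict[str, str] = field(default_factory=dict)  # asset name -> canonical path

    def register(self, asset: str, canonical_path: str) -> None:
        existing = self.index.get(asset)
        if existing is not None and existing != canonical_path:
            raise ValueError(f"{asset} already has a canonical location: {existing}")
        self.index[asset] = canonical_path

    def has(self, asset: str) -> bool:
        # "Is there an X skill?" is answered from exactly one place.
        return asset in self.index

    def provision(self, asset: str, consumer: str) -> Provision:
        if asset not in self.index:
            raise KeyError(f"{asset} is not in the catalog")
        return Provision(asset, consumer, self.index[asset])
```

The design choice worth noticing is that `register` refuses a second location for the same asset, so the rule is enforced at intake rather than discovered later during an update.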

There was one session where C7 ran independently for the first time. My opening message was “read CLAUDE.md to understand your role and workflow” — and it went ahead and processed the intake registration for 10 new skills in one go. That session ran for 11 minutes, touched 11 directories, and hit 4 errors along the way (missing paths, failed commands) — but C7 solved all of them on its own. Because its CLAUDE.md defined a clear methodology: path problem? Go back to the master catalog. Format problem? Go back to the quality contract. Its identity gave it a framework for solving problems, not just a checklist of “what you should do.”

Dm: Three Roles in One Identity

Dm is the Builder Agent (the architect), responsible for designing and building new Agents. But Dm’s identity design has one distinctive feature: it isn’t a single role — it’s a combination of three.

Architect handles design, using the Opus model because complete reasoning capacity is needed to cross-analyze wiki knowledge against design constraints
Scaffolder handles construction, using Sonnet mainly because this doesn’t require deep reasoning — it requires fast, accurate file output
Validator handles verification, checking the scaffold’s output against a build checklist for compliance

When designing Dm, the decision I faced was: should these three roles be three independent Agents, or three Subagents under one Agent?[7] The answer was the latter — because the Context between them needs heavy cross-referencing. Architect’s design plan needs to be passed in full to Scaffolder; Scaffolder’s output needs to be passed in full to Validator. With independent Agents, every handoff loses Context. In Subagent mode, the parent Agent (Dm itself) can ensure Context is passed completely between all three roles.

This is the same judgment criterion discussed in the Skill vs Subagent article: does the Context need to stay in the main conversation? In Dm’s case, yes — so the three roles are Subagents, not independent Agents.

graph LR
    subgraph Dm["Dm — Builder Agent"]
        A["Architect<br/>Opus · design plan"]
        B["Scaffolder<br/>Sonnet · build files"]
        C["Validator<br/>verify checklist"]
    end
    A -->|"full Context<br/>design plan"| B
    B -->|"full Context<br/>scaffold output"| C
    style A fill:#6b8f71,color:#fff
    style B fill:#58a6ff,color:#fff
    style C fill:#c67a50,color:#fff

Dm's three roles are Subagents, ensuring Context passes completely through Architect → Scaffolder → Validator
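The handoff can be sketched as a parent function that owns the full Context and passes each role's complete output to the next. Everything here is a stand-in under my own assumptions: the three roles are plain functions rather than model calls, and the type names and file list are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DesignPlan:
    agent_name: str
    required_files: list[str]


@dataclass
class ScaffoldOutput:
    plan: DesignPlan        # the full plan travels with the output
    files: dict[str, str]   # file name -> content


def architect(request: str) -> DesignPlan:
    # Stand-in for the design step (Opus in the article).
    return DesignPlan(agent_name=request, required_files=["CLAUDE.md", "METHODOLOGY.md"])


def scaffolder(plan: DesignPlan) -> ScaffoldOutput:
    # Stand-in for the build step (Sonnet in the article).
    files = {name: f"# {plan.agent_name}: {name}\n" for name in plan.required_files}
    return ScaffoldOutput(plan=plan, files=files)


def validator(output: ScaffoldOutput) -> list[str]:
    # Checklist stand-in: every planned file must exist and be non-empty.
    return [name for name in output.plan.required_files if not output.files.get(name)]


def build_agent(request: str) -> tuple[ScaffoldOutput, list[str]]:
    """The parent (Dm) holds every intermediate result, so nothing is lost in handoff."""
    plan = architect(request)
    output = scaffolder(plan)
    problems = validator(output)
    return output, problems
```

Because `ScaffoldOutput` carries the original `DesignPlan`, the Validator checks against what the Architect actually specified instead of a summary of it. That is the property independent Agents would struggle to guarantee.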

G7: The Sacred Boundary

G7 is the Fleet Manager, responsible for managing all the Layer 3 development, operations, and work Agents. The single most important concept in G7’s identity design is “sacred boundary” — meaning: only G7 can talk to Layer 3 Agents. No other Layer 2 Agent can.

This isn’t a technical constraint — it’s a governance decision. If every Layer 2 Agent could dispatch Layer 3 Agents directly, then when your fleet grows from 5 to 50 Agents, who tracks which Agent is doing what? Who makes sure two Agents aren’t simultaneously editing the same file? Who handles graceful degradation when one Agent crashes?

These are all fleet management problems, and fleet management needs a single unified entry point. So G7’s sacred boundary isn’t a restriction — it’s protection, protecting the entire fleet from descending into chaos under multi-headed management.
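As a rough sketch of what a single entry point buys you, the guard below rejects any Layer 3 dispatch that does not come from G7 and tracks who is doing what. The `Fleet` class, the `BoundaryViolation` exception, and the Agent name `dev-1` are hypothetical; the article describes the boundary as a governance rule, not as this code.

```python
from dataclasses import dataclass, field
from typing import Optional


class BoundaryViolation(Exception):
    """Raised when anyone other than the gatekeeper addresses a Layer 3 Agent."""


GATEKEEPER = "G7"


@dataclass
class Fleet:
    # Layer 3 Agent name -> current task, or None when idle.
    assignments: dict[str, Optional[str]] = field(default_factory=dict)

    def register(self, agent: str) -> None:
        self.assignments.setdefault(agent, None)

    def dispatch(self, sender: str, target: str, task: str) -> None:
        if sender != GATEKEEPER:
            raise BoundaryViolation(f"{sender} must route Layer 3 work through {GATEKEEPER}")
        if target not in self.assignments:
            raise KeyError(f"{target} is not in the registry")
        if self.assignments[target] is not None:
            raise RuntimeError(f"{target} is already busy with: {self.assignments[target]}")
        self.assignments[target] = task
```

With one place recording assignments, "which Agent is doing what" becomes a dictionary lookup, and a double dispatch to the same Agent fails loudly instead of producing two conflicting edits.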

One scenario makes this especially clear: when G7 was building a machine profile, it needed to check the local path for each registered Agent one by one. If GM were doing this instead, GM would need to know each Agent’s repo name and possible installation locations — that’s fleet management knowledge that has no business being in GM’s Context. G7 does this because fleet management is literally G7’s job. Its registry has the complete information. Its machine profile is its own operational state — it doesn’t need any other Agent’s help.

GM: The Last to Take Shape, Because Its Identity Is the Most Complex

GM was the last Agent to be designed, for a reason mentioned in the previous article: before GM was built, I was playing its role myself. But this also means GM’s identity design faced a unique challenge: its identity had to be distilled from my own behavior patterns.

The design process for GM took a completely different path from the other four Agents. For the other four, the process was “start with a need or purpose, then design the identity around that.” For GM, it was “start with a large volume of behavioral records, then distill the identity from them.” More concretely: I went through several rounds of discussion with Agent G (the bootstrap agent that preceded GM), and each round was working through the same kinds of questions: “This thing I’m doing manually right now — what kind of work is it, exactly? Should it be delegated to an existing Agent? Should a new Agent be created to take it over? Should a Skill be created to execute it? Should it stay with GM? Or should a human be the one to sign off on it?” After enough rounds of that, the distilled result was GM Agent — somewhat like deriving GM from a Definition of Done.

So GM’s core identity was defined as “conversation before dispatch” — meaning when GM receives any instruction, its first move isn’t to immediately assign work, but to first have a conversation with the human: confirm it understands the full shape of the request, confirm it has enough information, and only then begin coordinating other Agents. This principle was distilled from a pattern I kept correcting in Agent behavior: Agent G was always too eager to act. I kept having to hold it back — “wait, you’re jumping to conclusions too fast,” “don’t rush to generate the Prompt, let’s discuss it first,” “please consult the methodology and knowledge in Em’s wiki before answering me.”

So GM’s identity is essentially a list of “don’t do this” items: don’t rush to dispatch, don’t skip the conversation and go straight to action, don’t make design decisions without consulting Em’s knowledge, don’t assume you’ve fully understood the requirement. This is a lot like the “subtraction” concept from the calibration article — you have to know what not to do before you can do the right things precisely.

GM's identity as a list of 'don't do this' items — GM's identity is essentially negative space — the deliberately excluded actions define who it is.
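The "don't do this" list reads naturally as a set of preconditions that must all clear before dispatch. The sketch below is my own framing: the `Request` fields and the `dispatch_blockers` function are hypothetical, and in practice GM's rules live in its CLAUDE.md as prose, not as boolean flags.

```python
from dataclasses import dataclass


@dataclass
class Request:
    text: str
    discussed_with_human: bool = False
    understanding_confirmed: bool = False
    consulted_em_wiki: bool = False


def dispatch_blockers(req: Request) -> list[str]:
    """Return the 'don't' rules this request would still violate if dispatched now."""
    blockers = []
    if not req.discussed_with_human:
        blockers.append("don't skip the conversation and go straight to action")
    if not req.understanding_confirmed:
        blockers.append("don't assume you've fully understood the requirement")
    if not req.consulted_em_wiki:
        blockers.append("don't make design decisions without consulting Em's knowledge")
    return blockers


def may_dispatch(req: Request) -> bool:
    # "Conversation before dispatch": act only when nothing is left on the list.
    return not dispatch_blockers(req)
```

A fresh `Request` fails all three checks, so the default behavior is to hold back. That matches the principle: the eager path has to be earned, not assumed.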

And the process of designing GM was itself a case study in Agent Team collaboration: Dm’s Architect read the GM Methodology recommendations that Em had produced by cross-analyzing all relevant concepts in the wiki, read 16 use case scenarios, read all the architecture decisions — then produced GM’s design plan. Meanwhile, GM’s CLAUDE.md was required to have “zero references to Agent G” — a clean start, inheriting nothing from the bootstrap agent’s build context, because GM needed to think from its own identity rather than carrying Agent G’s old baggage.

References

[1] OpsClaw.ai — The SOUL Framework: Why Your AI Agent Needs an Identity (Not Just a Prompt) (distinguishing system prompt vs. SOUL file: a SOUL file creates an operator, a system prompt only creates a tool) https://opsclaw.ai/blog/soul-framework-ai-agent-identity

[2] arXiv — Why Do Multi-Agent LLM Systems Fail? (ICLR 2025, across 1,600+ traces, “Disobey role specification” is a key failure mode) https://arxiv.org/abs/2503.13657

[4] HumanLayer — Writing a Good CLAUDE.md (best practices for writing CLAUDE.md) https://www.humanlayer.dev/blog/writing-a-good-claude-md

[7] arXiv — Multi-Agent Teams Hold Experts Back (counterpoint: multi-agent teams can actually hold back expert-level agents in certain situations) https://arxiv.org/html/2602.01011v1

[10] Levi Ezra — Building AI Agents with Personas, Goals, and Dynamic Memory https://medium.com/@leviexraspk/building-ai-agents-with-personas-goals-and-dynamic-memory-6253acacdc0a
