What's Actually the Difference Between a Skill and a Subagent?

Contents

They Don’t Live on the Same Layer
Context Flow Is the Real Key
But the Lines Are Blurring
Use Token Consumption as a Shortcut?
Blind Spot One: Low Tokens, but Isolation Still Needed
Blind Spot Two: High Tokens, but Interaction Is Required
Further Reading

Skill as embedded knowledge partner vs. Subagent as independent executor: a Skill sits at your desk and works alongside you; a Subagent works in a separate room and sends back a summary note.

The first four articles established a complete AI Agent framework:

Model ≠ Runtime: Skills live inside the Runtime
The five-layer architecture: Command → Agent → Tool + Skill → Context
Context Engineering is the foundation: JIT, Token Budgets, Progressive Disclosure
The future of AI competition lies in the Skill ecosystem

But once you actually start building an Agent system, you’ll run into a practical question pretty quickly:

Should I package this as a Skill or a Subagent?

Most people’s instinct is to look at complexity — simple things become Skills, complex things become Subagents. That instinct isn’t quite right, because complexity isn’t the deciding factor. Context is.

They Don’t Live on the Same Layer

A lot of people put Skills and Subagents on the same spectrum, treating a Subagent as just a “more powerful Skill.” But go back to the five-layer architecture from article two:

Command → Agent → [Tool + Skill] → Context

A Skill is the knowledge layer sitting alongside the Agent — it tells the Agent how to think. A Subagent is another Agent entirely — with its own Context Window, its own tools, and its own permissions.

Main Agent
├── Tool (executes actions)
├── Skill (provides knowledge)
└── Subagent (an independent Agent with everything of its own)
      ├── Tool
      ├── Skill
      └── Context (fully isolated)

Skill → a knowledge module, loaded into the main Agent’s Context
Subagent → an independent executor, with its own Context

This difference looks subtle, but it governs the entire Context flow of your system.
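The layering above can be sketched in a few lines of Python. This is only an illustrative model, not how any particular runtime implements it: the `Skill` and `Agent` classes, the `load_skill` and `spawn_subagent` methods, and the tool names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """A knowledge module: instructions that get loaded into an Agent's context."""
    name: str
    instructions: str


@dataclass
class Agent:
    """An Agent owns its context window and its tools."""
    name: str
    tools: set[str] = field(default_factory=set)
    context: list[str] = field(default_factory=list)

    def load_skill(self, skill: Skill) -> None:
        # A Skill has no context of its own; it lands in the caller's context.
        self.context.append(f"[skill:{skill.name}] {skill.instructions}")

    def spawn_subagent(self, name: str, tools: set[str]) -> "Agent":
        # A Subagent is simply another Agent, created with a fresh, empty context.
        return Agent(name=name, tools=set(tools))


main = Agent("main", tools={"read", "write", "bash"})
main.load_skill(Skill("code-style", "Prefer small functions."))

reviewer = main.spawn_subagent("reviewer", tools={"read"})
reviewer.load_skill(Skill("review-checklist", "Check error handling."))
```

After this runs, `main.context` holds only the `code-style` entry and `reviewer.context` holds only the `review-checklist` entry: the same Skill mechanism, but two separate contexts.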

Context Flow Is the Real Key

Article three made the point: every unnecessary token actively degrades system performance. Apply that lens to Skills vs. Subagents:

When you use a Skill:

graph TD
    subgraph WIN["Main Agent Context Window (keeps growing)"]
        SP["System Prompt"]
        TD["Tool Definitions"]
        SK["Skill knowledge (stays here after loading)"]
        HIS["Conversation history"]
        MID["In-progress task state"]
    end

Using a Skill: everything accumulates inside the same Context Window

Skills use Progressive Disclosure to control how much gets loaded, but once loaded, the knowledge stays in the main Context. Every intermediate output from the task piles up inside the main Context Window.

When you use a Subagent:

graph LR
    subgraph MAIN["Main Agent Context"]
        SP["System Prompt"]
        TD["Tool Definitions"]
        HIS["Conversation history"]
        SUM["Summary <2000 tokens"]
    end
    subgraph SUB["Subagent Context (isolated)"]
        SPR["Subagent Prompt"]
        ST["Dedicated Tools"]
        MID["Extensive intermediate work<br/>(all stays here)"]
    end
    SUB -- "one-way summary return" --> MAIN

Using a Subagent: intermediate work is isolated, only a concise summary returns

The Subagent’s intermediate work stays inside its own Context Window. Only the summary comes back to the main Agent. This is the Subagent Return Contract from article three: explore deeply, return shallowly.
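To make the two flows concrete, here is a rough sketch of the token accounting. Everything in it is an assumption for illustration: `estimate_tokens` is a crude four-characters-per-token stand-in for a real tokenizer, the function names are hypothetical, and the sizes are invented.

```python
def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly one token per 4 characters.
    return max(1, len(text) // 4)


def run_with_skill(base_tokens: int, skill_text: str, step_outputs: list[str]) -> int:
    """Inline Skill: the skill text and every intermediate output stay in the main context."""
    total = base_tokens + estimate_tokens(skill_text)
    for output in step_outputs:
        total += estimate_tokens(output)
    return total


def run_with_subagent(
    base_tokens: int, step_outputs: list[str], summary: str
) -> tuple[int, int]:
    """Subagent: intermediate outputs stay in its own context; only the summary returns.

    Returns (main_context_tokens, subagent_context_tokens).
    """
    subagent_tokens = sum(estimate_tokens(output) for output in step_outputs)
    main_tokens = base_tokens + estimate_tokens(summary)
    return main_tokens, subagent_tokens


# Invented sizes: a 3,000-token baseline and five steps of about 1,000 tokens each.
base = 3_000
steps = ["x" * 4_000] * 5
skill_total = run_with_skill(base, "y" * 2_000, steps)
main_total, isolated_total = run_with_subagent(base, steps, "z" * 800)
```

With these made-up numbers the main Context ends at 8,500 tokens in the Skill case and 3,200 in the Subagent case; the 5,000 tokens of intermediate work still exist, but they sit in the Subagent's Context.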

Skills cause Context to keep growing; Subagents isolate intermediate work, keeping the main Context nearly unchanged

Two containers: the left jar (main Context with a Skill) fills up with every step; the right jar (main Context with a Subagent) stays almost empty, because the work happened in a separate jar beside it.

So choosing between a Skill and a Subagent is really answering one question:

Does the intermediate process of this task have value to the main conversation?

Yes → keep it in the main Context → Skill
No → isolate it, just take the result → Subagent

But the Lines Are Blurring

The distinction above is conceptually clean, but in practice, Skills have been gaining capability. In late 2025, Anthropic released Skills 2.0, and now a Skill can:

Use context: fork to run itself inside the isolated environment of a Subagent
Use allowed-tools to restrict which tools are available
Use model to override the model being used

In other words, a Skill can now configure itself to behave like a Subagent. That means a Skill is no longer just “static knowledge” — it can be a complete agent configuration.

So does the distinction between Skills and Subagents still hold?

Yes. The core separation still stands:

A Skill starts from knowledge — “here’s how to think about this” — and can be run inline or forked
A Subagent starts from isolation — “go do the work and come back with a report” — it’s independent Context by nature

Skills 2.0 lets a Skill choose whether to isolate, but a Subagent is born isolated. It’s like the difference between someone who can opt to work independently versus someone who is a natural independent contractor. The starting point is different; the design intent is different.
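As a hedged illustration of that choice, the sketch below reads the three fields named above (`context`, `allowed-tools`, `model`) from a skill file's frontmatter and reports how the Skill would run. The field names come from this article; the skill name, the field values, the comma-separated tool list, and both helper functions are assumptions, and a real runtime would use a proper YAML parser rather than this hand-rolled one.

```python
# Hypothetical skill file; only the three field names are taken from the article.
SKILL_MD = """\
---
name: dependency-audit
description: Audit project dependencies for known issues
context: fork
allowed-tools: Read, Grep
model: sonnet
---
Instructions for the audit go here.
"""


def parse_frontmatter(text: str) -> dict[str, str]:
    """Read flat `key: value` pairs between the first two `---` lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def execution_plan(fields: dict[str, str]) -> dict[str, object]:
    """Summarize how a Skill would run, based on the three fields discussed above."""
    tools = [t.strip() for t in fields.get("allowed-tools", "").split(",") if t.strip()]
    return {
        "isolated": fields.get("context") == "fork",
        "allowed_tools": tools or None,  # None: no restriction declared
        "model": fields.get("model"),    # None: inherit the caller's model
    }


plan = execution_plan(parse_frontmatter(SKILL_MD))
inline_plan = execution_plan({"name": "code-style"})
```

Here `plan` describes an isolated run restricted to two tools, while `inline_plan` (a Skill that declares none of the three fields) stays inline with no overrides. The Skill opts into isolation; nothing about it is isolated by default.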

Skills 2.0 blurs the boundary: a Skill can choose to fork into isolation mode, stepping across the threshold into its own private workspace, but the separate building on the far right (a pure Subagent) was always its own independent structure.

Use Token Consumption as a Shortcut?

Once they understand Context flow, a lot of people land on a quick mental shortcut:

High token consumption → Subagent
Low token consumption → Skill

This intuition is right most of the time. High token consumption usually means lots of intermediate work, and most of that intermediate work has no lasting value to the main conversation.

But there are two blind spots.

Blind Spot One: Low Tokens, but Isolation Still Needed

A lightweight task may produce only a few hundred tokens, yet you still need to restrict its tool permissions: say, a code reviewer that can read but must never write. The token count is low, but the permission boundary matters more than the token count. Use a Subagent.
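A minimal sketch of that read-only reviewer, assuming a hypothetical `RestrictedSubagent` class with a tool allowlist. The class, the exception, and the tool names are invented for illustration, and real tool dispatch is left out.

```python
class ToolNotAllowed(PermissionError):
    """Raised when a Subagent tries to use a tool outside its allowlist."""


class RestrictedSubagent:
    def __init__(self, name: str, allowed_tools: set[str]) -> None:
        self.name = name
        self.allowed_tools = frozenset(allowed_tools)
        self.context: list[str] = []  # isolated from the main Agent's context

    def call_tool(self, tool: str, argument: str) -> str:
        if tool not in self.allowed_tools:
            raise ToolNotAllowed(f"{self.name} may not call {tool!r}")
        # Record the call in the Subagent's own context; real dispatch is omitted.
        self.context.append(f"{tool}({argument})")
        return f"<{tool} result for {argument}>"


reviewer = RestrictedSubagent("code-reviewer", allowed_tools={"read", "grep"})
reviewer.call_tool("read", "src/app.py")  # allowed

try:
    reviewer.call_tool("write", "src/app.py")  # outside the allowlist
except ToolNotAllowed as error:
    blocked_message = str(error)
```

The reviewer's work costs almost nothing in tokens, but the allowlist is enforced at the Subagent boundary rather than by asking the main Agent to behave.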

Blind Spot Two: High Tokens, but Interaction Is Required

You’re working through a complex architecture design that requires back-and-forth conversation and iterative revisions. A Subagent goes off, does its work, and comes back; you can’t step in mid-task. But keeping everything in the main Context will blow up the token count.

The first blind spot is easy to solve — just use a Subagent. The second blind spot is a genuine dilemma: a Skill will stuff the Context, a Subagent loses the interactivity. What do you do?

The answer isn’t choosing one or the other — it’s combining them, which is exactly what the next article covers.
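The shortcut and its two blind spots can be written down as a small routing function. This is a sketch under stated assumptions: the `Task` fields, the `choose_packaging` name, and the 10,000-token threshold are all hypothetical, and `"combine"` only flags the dilemma without saying how the combination works.

```python
from dataclasses import dataclass

HIGH_TOKEN_THRESHOLD = 10_000  # arbitrary cutoff, chosen only for illustration


@dataclass
class Task:
    estimated_tokens: int
    needs_restricted_tools: bool = False  # blind spot one
    needs_interaction: bool = False       # blind spot two


def choose_packaging(task: Task) -> str:
    high_tokens = task.estimated_tokens >= HIGH_TOKEN_THRESHOLD

    # Blind spot one: isolation is required regardless of token count.
    if task.needs_restricted_tools:
        return "subagent"

    # Blind spot two: too large for the main Context, too interactive to hand off.
    if high_tokens and task.needs_interaction:
        return "combine"

    # Otherwise the token shortcut holds.
    return "subagent" if high_tokens else "skill"


quick_lookup = choose_packaging(Task(estimated_tokens=500))
deep_research = choose_packaging(Task(estimated_tokens=40_000))
read_only_review = choose_packaging(Task(estimated_tokens=300, needs_restricted_tools=True))
architecture_design = choose_packaging(Task(estimated_tokens=40_000, needs_interaction=True))
```

The first two calls follow the shortcut (`"skill"` and `"subagent"`), the third hits blind spot one (`"subagent"` despite the low token count), and the fourth lands on `"combine"`.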
