Emil Wu

Eight Cents an Hour: Anthropic Will Run Your Agents for You

5 min read

A few days ago, while browsing Anthropic’s API docs, I noticed an entirely new section in the sidebar: “Managed Agents.”

Clicking through, I found it wasn’t another prompt-management tool or SDK wrapper — it was fully hosted agent execution infrastructure. You define the agent’s brain; they handle the hands — sandboxing, state, security, scaling, all of it.

“Pre-built, configurable agent harness that runs in managed infrastructure.”

My first reaction: isn’t this basically Heroku for agents?

What You Used to Build Yourself

If you’ve ever tried pushing an AI agent from prototype to production, you know the pain. The model call itself is the tip of the iceberg; what actually eats your time is:

  • Sandbox environment: agents need to run code, so you need a secure container
  • State management: agent crashes mid-run — how do you recover from the checkpoint?
  • Credential security: agents need external service access, where do you put API keys so agent-generated code can’t steal them?
  • Observability: what did the agent do? Why? How long did it take?
  • Scaling: one agent runs fine, what about a hundred simultaneously?

Add it all up and a two-week agent prototype routinely takes months to actually ship.

Claude Managed Agents tackles this entire layer.

Four Core Concepts

Anthropic breaks Managed Agents into four concepts:

| Concept | You Handle | Anthropic Handles |
| --- | --- | --- |
| Agent | Model, system prompt, tools, MCP servers | |
| Environment | Specify packages, network rules | Container creation, isolation, security |
| Session | Send tasks, receive results | Execution loop, state persistence, error recovery |
| Events | Send user turns, process responses | SSE streaming, event history persistence |
Once your agent is defined, you open a session via API, send a message, and Claude autonomously executes inside a cloud container — running Bash, reading and writing files, searching the web, calling MCP servers — streaming results back via Server-Sent Events. You can even send new instructions mid-execution to steer the agent, or interrupt it entirely.
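Those streamed results arrive over Server-Sent Events, which is a plain text wire format you can parse in a few lines. Here is a minimal sketch of an SSE parser — the event names in the sample (`tool_result`, `text`) are made up for illustration, not Anthropic’s actual event schema:

```python
def parse_sse(raw: str):
    """Parse a Server-Sent Events stream into (event_type, data) pairs."""
    events = []
    event_type, data_lines = "message", []
    for line in raw.splitlines():
        if line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "":
            # A blank line terminates one event in the SSE format.
            if data_lines:
                events.append((event_type, "\n".join(data_lines)))
            event_type, data_lines = "message", []
    # Flush a trailing event if the stream didn't end with a blank line.
    if data_lines:
        events.append((event_type, "\n".join(data_lines)))
    return events

raw = 'event: tool_result\ndata: {"exit_code": 0}\n\nevent: text\ndata: done\n\n'
events = parse_sse(raw)
print(events)  # [('tool_result', '{"exit_code": 0}'), ('text', 'done')]
```

In practice an SDK would hide this, but it shows why SSE is a good fit here: the agent can push incremental events for as long as the session runs, over one ordinary HTTP response.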

“Decoupling Brain from Hands” — OS-Level Architecture

This is the part I find most worth examining. Anthropic’s engineering team published a companion piece, “Scaling Managed Agents: Decoupling the brain from the hands,” explaining their architectural philosophy.

The traditional approach puts an agent’s reasoning engine (brain) and execution environment (hands) in the same container, which means: container dies, agent state is gone; container boots slowly, agent waits; containers compete for resources, scaling becomes a nightmare.

Managed Agents decouples three components entirely:

  1. Session: An append-only event log recording everything the agent has done. Exists independently of any container.
  2. Harness: The control loop that calls Claude and routes tool calls. Stateless, horizontally scalable.
  3. Sandbox: The container where agents run code. Treated as “cattle” — provisioned on demand via provision({resources}); if one breaks, you just get a new one.

The result? Time-to-first-token reduced by ~60% at p50, over 90% at p95. If the harness crashes, wake(sessionId) brings it back; if the container dies, the session log survives, spin up a new container and continue.
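The key idea is that the append-only session log is the single source of truth, stored outside any container. A toy illustration of why that makes recovery trivial — the names `append_event` and `wake` echo the article’s snippets but are assumptions, not a real SDK:

```python
# Durable store for session logs; in production this would be a database
# that outlives every harness process and every sandbox container.
_session_log: dict = {}

def append_event(session_id: str, event: dict) -> None:
    """Record one agent action in the append-only log."""
    _session_log.setdefault(session_id, []).append(event)

def wake(session_id: str) -> list:
    """A fresh, stateless harness recovers all state from the log alone."""
    return list(_session_log.get(session_id, []))

append_event("s1", {"type": "bash", "cmd": "pytest"})
append_event("s1", {"type": "result", "exit_code": 0})
# ...harness crashes, sandbox dies; none of that touches the log...
history = wake("s1")
print(len(history))  # 2 — the full history survived
```

Because the harness derives everything it needs from the log, crashing a harness or losing a sandbox costs only a re-provision, never agent state.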

Their exact words: “The harness doesn’t know whether the sandbox is a container, a phone, or a Pokémon emulator.”

If you’ve read our Architecture Isn’t the Point, Context Is, there’s an interesting parallel here: Managed Agents treats the Session as a queryable memory store outside the context window. When long-running tasks exceed the window, the harness can selectively rewind and re-read via getEvents(). Context management becomes an infrastructure-level concern, no longer your prompt engineering problem.
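A hypothetical sketch of that “selective rewind”: instead of stuffing the whole transcript into the prompt, the harness queries the log for just the events relevant to the next step. The `get_events` signature and event schema here are my assumptions based on the article, not the documented API:

```python
def get_events(log: list, types: set = None, last_n: int = None) -> list:
    """Filter the session log by event type, optionally keeping only
    the most recent matches — a compact slice for the context window."""
    selected = [e for e in log if types is None or e["type"] in types]
    return selected[-last_n:] if last_n else selected

log = [
    {"type": "bash", "cmd": "ls"},
    {"type": "file_read", "path": "main.py"},
    {"type": "bash", "cmd": "pytest"},
    {"type": "result", "exit_code": 1},
]
# Rebuild a compact context: only the last shell command and its result.
context = get_events(log, types={"bash", "result"}, last_n=2)
print(context)
```

The point is architectural: once history lives in a queryable store, “what fits in the window” becomes a retrieval policy the platform can tune, rather than prompt surgery you do by hand.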

Security Design

This is the part I find most elegantly done: credentials never enter the sandbox.

  • Git tokens are bound during container initialization — Git operations work, but agent-generated code can’t touch the tokens
  • OAuth and MCP tool credentials live in an external vault, retrieved through a dedicated proxy

Even if an agent writes malicious code, it can’t steal your API keys. This is a baseline requirement for production, but remarkably easy to overlook when building your own stack.

Pricing

| Item | Cost |
| --- | --- |
| Model tokens | Standard API pricing |
| Agent runtime | $0.08 / session-hour (billed per millisecond, idle time excluded) |
| Web search | $10 / 1,000 searches |

Eight cents an hour for agent execution — honestly, that’s cheap. If you’re running EC2 or Cloud Run to host agent sandboxes, the container costs alone exceed this, not to mention the orchestration, checkpointing, and error recovery you’d build yourself.
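To make the per-millisecond billing concrete, here’s the arithmetic (the 45-minute task is a made-up example, the $0.08/session-hour rate is from the pricing table above):

```python
# $0.08 per session-hour, billed per millisecond of *active* time.
RATE_PER_MS = 0.08 / 3_600_000  # dollars per active millisecond

def runtime_cost(active_ms: int) -> float:
    """Runtime cost for a session, idle time already excluded."""
    return active_ms * RATE_PER_MS

# A 45-minute task that spends 15 minutes idle (waiting on a human turn)
# bills only the 30 active minutes:
cost = runtime_cost(30 * 60 * 1000)
print(f"${cost:.4f}")  # $0.0400
```

So even a long-running, mostly-idle interactive session stays in the cents range; the model tokens, not the runtime, will dominate your bill.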

Who’s Already Using It?

| Company | Use Case |
| --- | --- |
| Notion | Delegating tasks to agents directly within workspaces |
| Rakuten | Enterprise agents for sales, marketing, finance, integrated with Slack/Teams — deployed in under a week |
| Asana | Productivity and task automation |
| Sentry | Debugging agent + auto-patch writing + PR creation |
Rakuten’s one-week deployment is the most compelling number — if you know how long enterprise AI integrations typically take.

Features Still in Research Preview

Three features require separate access:

  • Multi-agent: Agents can spawn other agents for complex subtasks
  • Outcomes: Automatic response quality evaluation and improvement (“improved task success by up to 10 points over standard prompting loops in internal testing”)
  • Memory: Cross-session memory management

The multi-agent feature is particularly interesting — it maps directly to the scenarios discussed in Agent Team: When Subagents Aren’t Enough, except this time Anthropic manages the orchestration for you.

My Take

Maybe the biggest significance of Managed Agents isn’t the technology itself, but that it announces the birth of a new category: Agent-as-a-Service.

Think back to what Heroku did in 2007: you didn’t need to manage servers, just git push to deploy. Nearly two decades later, Managed Agents does something similar at the agent layer — no need to build your own sandbox, write orchestration loops, or handle checkpointing; just define your agent’s capabilities and leave the rest to the platform.

But the limitations deserve equal attention: it currently runs exclusively on Anthropic’s own infrastructure, with no availability through AWS Bedrock or Google Vertex AI. This is a strategic choice — Anthropic clearly wants to keep developers in their ecosystem — but for enterprises requiring multi-cloud deployment, this is a lock-in risk that demands evaluation.

And it only supports Claude models. If your agents need to switch between GPT and Claude across different scenarios, Managed Agents can’t accommodate that today.

Perhaps the most practical advice: if you’re building a new agent system from scratch, have a small team, and primarily use Claude, Managed Agents can eliminate massive amounts of infrastructure engineering. But if you already have a custom agent framework or need model flexibility, watching from the sidelines may be the steadier choice.

Regardless of your decision today, one thing is certain: agents moving from “build it yourself” to “let the platform handle it” — that road is now officially open.



Support This Series

If these articles have been helpful, consider buying me a coffee