The No-Framework Agent
How to build a production-ready agent with nothing but the Anthropic SDK and a while loop — and when that's still the right choice. Covers the full manual agent loop, token cost math, streaming, failure modes, testing, and the graduation path from manual loop to tool_runner to Agent SDK to LangGraph.
Quick Reference
- →The core agent loop: call LLM → check stop_reason → execute tool calls → append results → repeat
- →A production-ready no-framework agent needs: retries with backoff, a timeout, max_iterations cap, per-tool error handling, and structured logging
- →Token cost grows with every iteration — by iteration 4 you are re-sending all prior history plus tool results on every LLM call
- →The SDK tool_runner (beta) automates iteration, backoff, and dispatch in ~10 lines — evaluate it before building the manual loop
- →Graduation path: manual loop → tool_runner → Claude Agent SDK → LangGraph; move down only when the current tier creates friction you can name
- →Go framework-free when: one provider, ≤5 tools, no complex state, no multi-agent coordination, and debuggability matters
- →The signal to migrate: agent.py grows past 300 lines and you are hand-coding state machines
Should I Go Framework-Free?
The decision is about which problems you don't have yet. Frameworks solve real problems: state management, checkpointing, multi-agent coordination, provider abstraction. But if you don't have those problems, every abstraction is pure overhead — more dependencies, more layers to debug through, more ways things break without obvious cause. The table below maps your requirements to the right choice. Be honest about column two.
| Signal | Skip Framework | Use Framework |
|---|---|---|
| Provider count | One LLM provider | Multiple providers or need to switch |
| Tool count | 1–5 tools | 8+ tools with routing logic |
| State complexity | Messages array is sufficient | Complex state with branching logic |
| Conversation persistence | Stateless or simple DB | Checkpointing, time travel, session replay |
| Agent coordination | Single agent | Multiple agents collaborating |
| Human-in-the-loop | No interrupt/resume | Structured interrupt and resume patterns |
| Team size | 1–3 developers | Large team needing shared abstractions |
If your agent needs to resume from a checkpoint after a crash, coordinate two or more sub-agents, or expose a human approval gate that persists across requests — the manual loop will not stay simple. You will spend weeks reimplementing what LangGraph already provides. Start with the framework.