Async Subagents: Background Task Delegation
Async subagents (Deep Agents v0.5) let a supervisor delegate long-running tasks to background agents while continuing to chat with the user. This article covers the decision criteria for when async is worth the complexity, token cost math, production error handling, five concrete failure modes and their defenses, three orchestration patterns with code, and the five metrics you need to monitor before something breaks.
Quick Reference
- Use async subagents when a task takes >10 seconds and the user shouldn't be blocked; for anything shorter, sync is simpler
- Correct API: subagents=[AsyncSubAgent(name=..., description=..., graph_id=...)] in create_deep_agent()
- 5-tool lifecycle injected per subagent: start_async_task, check_async_task, update_async_task, cancel_async_task, list_async_tasks
- ASGI transport = co-deployed, in-process (default); HTTP transport = remote Agent Server (add url= param)
- Each subagent gets its own context window, so N parallel agents means N × system prompt overhead
- Stale task IDs after context compaction: always recover with list_async_tasks() before checking a specific task
- Monitor: task completion rate, duration P95, polling overhead ratio, error rate by subagent, orphaned task count
- Deep Agents v0.5, April 2026: async subagents are a preview feature; APIs may change
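The transport rule in the list above can be illustrated with a small stdlib-only sketch. `AsyncSubAgentSpec` and `transport_for` are hypothetical stand-ins that only mirror the field names mentioned in this section (name, description, graph_id, url); they are not the library's own class or logic, and the graph IDs and URL are made up.

```python
from typing import TypedDict


class AsyncSubAgentSpec(TypedDict, total=False):
    """Hypothetical stand-in mirroring the fields named above; not the library class."""

    name: str
    description: str
    graph_id: str
    url: str  # assumed optional: set only when targeting a remote Agent Server


def transport_for(spec: AsyncSubAgentSpec) -> str:
    """Return "http" when a url is set, otherwise the default in-process "asgi"."""
    return "http" if spec.get("url") else "asgi"


# Co-deployed subagent: no url, so the default transport applies.
researcher: AsyncSubAgentSpec = {
    "name": "researcher",
    "description": "Long-running literature search",
    "graph_id": "research_graph",
}

# Remote subagent: the url points at a separate (made-up) deployment.
remote_coder: AsyncSubAgentSpec = {
    "name": "coder",
    "description": "Code generation on a separate deployment",
    "graph_id": "coder_graph",
    "url": "https://agents.example.com",
}

for spec in (researcher, remote_coder):
    print(spec["name"], "->", transport_for(spec))
```

In a real project the same fields would be passed to AsyncSubAgent(...) inside create_deep_agent(subagents=[...]); check the current preview API before relying on any field beyond those listed here.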
Should I Use Async Subagents?
Async subagents add real complexity: each one injects 5 tools into the supervisor's context, introduces polling logic, and requires lifecycle cleanup. Sync subagents (the task() tool) or a direct tool call are often the right answer. The question to ask first is whether the user actually needs to keep interacting while the work runs.
Figure: The supervisor manages background subagents through 5 lifecycle tools and continues chatting while tasks run.
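To make the start, check, and cancel flow concrete, here is a minimal sketch built on asyncio alone. `BackgroundTasks` is a hypothetical in-memory stand-in for the lifecycle tools, not the Deep Agents implementation, and the status strings are assumptions chosen for illustration.

```python
import asyncio
import uuid


class BackgroundTasks:
    """In-memory stand-in for start/check/cancel/list; a real runtime persists this state."""

    def __init__(self) -> None:
        self._tasks = {}  # task_id -> asyncio.Task

    def start(self, coro) -> str:
        task_id = uuid.uuid4().hex[:8]
        self._tasks[task_id] = asyncio.create_task(coro)
        return task_id

    def check(self, task_id: str) -> dict:
        task = self._tasks[task_id]
        if task.cancelled():
            return {"status": "cancelled"}
        if not task.done():
            return {"status": "running"}
        if task.exception() is not None:
            return {"status": "error", "error": repr(task.exception())}
        return {"status": "success", "result": task.result()}

    def cancel(self, task_id: str) -> None:
        self._tasks[task_id].cancel()

    def list_ids(self) -> list:
        return list(self._tasks)


async def slow_research(topic: str) -> str:
    await asyncio.sleep(0.05)  # stands in for a long-running subagent
    return f"report on {topic}"


async def main() -> list:
    tasks = BackgroundTasks()
    statuses = []
    task_id = tasks.start(slow_research("vector databases"))
    doomed_id = tasks.start(slow_research("abandoned topic"))

    # The supervisor is not blocked: this check runs while the work is in flight.
    await asyncio.sleep(0)
    statuses.append(tasks.check(task_id)["status"])

    tasks.cancel(doomed_id)
    await asyncio.sleep(0.1)  # a later turn, after both tasks have settled
    statuses.append(tasks.check(task_id)["status"])
    statuses.append(tasks.check(doomed_id)["status"])
    return statuses


print(asyncio.run(main()))
```

The point of the sketch is the shape of the interaction: starting a task returns an ID immediately, and the result is only learned through a later check.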
| Signal | Points to async | Points to sync |
|---|---|---|
| Task duration | >15–30 seconds | <10 seconds |
| User experience | Must keep chatting during work | User can wait for the result |
| Parallelism | 3+ independent subtasks | Sequential or single subtask |
| Error isolation | One failure shouldn't block others | All-or-nothing is fine |
| Deployment | Subagent needs independent scaling | Co-deployed is sufficient |
| State persistence | Work survives supervisor restart | Restart is acceptable |

| Aspect | Sync subagent (task()) | Async subagent |
|---|---|---|
| Execution | Blocks supervisor until complete | Runs in background, supervisor continues |
| User experience | User waits for all subtasks | User chats while tasks run |
| Tooling | 1 tool (task) | 5 tools per subagent |
| Error handling | Error propagates immediately | Supervisor polls status, handles errors |
| Context overhead | Subagent result in one message | N × system prompts + polling messages |
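The context overhead row can be turned into rough arithmetic. The sketch below assumes every subagent carries a system prompt of the same size and that each status poll adds a fixed number of tokens; the function name and all figures are illustrative assumptions, not measurements.

```python
def async_context_overhead(
    n_agents: int,
    system_prompt_tokens: int,
    polls_per_task: int,
    tokens_per_poll: int,
) -> dict:
    """Estimate extra tokens from running n_agents async subagents in parallel."""
    # N x system prompt: each subagent has its own context window.
    prompt_cost = n_agents * system_prompt_tokens
    # Polling messages: every status check adds tokens (assumed constant per poll).
    polling_cost = n_agents * polls_per_task * tokens_per_poll
    return {
        "subagent_prompts": prompt_cost,
        "polling": polling_cost,
        "total": prompt_cost + polling_cost,
    }


# Illustrative numbers: 4 agents, 2,000-token prompts, 3 polls each at 150 tokens.
estimate = async_context_overhead(4, 2_000, 3, 150)
print(estimate)  # {'subagent_prompts': 8000, 'polling': 1800, 'total': 9800}
```

Swapping in measured prompt sizes and poll counts from your own traces gives a quick sense of whether parallelism is worth the added tokens.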
Ship with sync subagents (task() calls) first. Add async only when you have measured evidence that users are waiting too long. The 10-second heuristic is a starting point — your workload may be different.
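The decision signals can be encoded as a small checklist. This is a sketch under stated assumptions: `TaskProfile` and `recommend` are hypothetical names, the gate follows the quick-reference rule (longer than 10 seconds and the user should not be blocked), and the 15-second cutoff is taken from the lower bound of the duration row in the signal table.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    expected_seconds: float
    user_must_keep_chatting: bool
    independent_subtasks: int
    needs_failure_isolation: bool = False
    needs_independent_scaling: bool = False
    must_survive_restart: bool = False


def async_signals(p: TaskProfile) -> list:
    """List which rows of the signal table point to async for this task."""
    checks = [
        ("task duration", p.expected_seconds > 15),
        ("user experience", p.user_must_keep_chatting),
        ("parallelism", p.independent_subtasks >= 3),
        ("error isolation", p.needs_failure_isolation),
        ("deployment", p.needs_independent_scaling),
        ("state persistence", p.must_survive_restart),
    ]
    return [name for name, hit in checks if hit]


def recommend(p: TaskProfile) -> tuple:
    """Default to sync; pick async only when the task is long AND the user must not wait."""
    use_async = p.expected_seconds > 10 and p.user_must_keep_chatting
    return ("async" if use_async else "sync", async_signals(p))


print(recommend(TaskProfile(5, True, 1)))
print(recommend(TaskProfile(45, True, 4, needs_failure_isolation=True)))
```

The returned list of signals is meant as supporting evidence for a review, not as a score; the thresholds should be replaced with values measured on your own workload.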