Advanced13 min

RemoteGraph

When and how to run LangGraph agents as remote services. Decision framework for when not to use RemoteGraph, direct subgraph embedding, production-shaped supervisor with error handling, thread-based state persistence, and the failure modes that reliably bite in production.

Quick Reference

→Initialize with positional name: RemoteGraph('agent', url=..., api_key=...) — the graph_id keyword does not exist
→Embed directly as a subgraph node: builder.add_node('child', remote_graph) — no manual wrapper function needed
→NEVER call a RemoteGraph that targets the same deployment — deadlocks and resource exhaustion
→Thread persistence: pass {'configurable': {'thread_id': '...'}} to maintain conversation state across remote calls
→Enable distributed tracing: RemoteGraph('agent', url=..., distributed_tracing=True) for end-to-end LangSmith traces
→Stream modes over HTTP: 'messages' (token-by-token), 'updates' (after each node), 'values' (full state snapshot)
→LangGraph Platform is now called LangSmith Deployment — same infrastructure, naming changed Oct 2025

When to Use RemoteGraph (and When Not To)

RemoteGraph adds a network boundary to your agent architecture. That boundary has real costs: per-call HTTP latency, serialization overhead, network failure modes, and debugging complexity. The question is never 'can I use RemoteGraph?' — you always can. The question is 'does the benefit justify the operational cost?'

Factor	Use RemoteGraph	Keep Local
Team ownership	Different teams own different agents, need independent deploys	Single team owns the entire graph
Scaling needs	Sub-agents need different compute (GPU vs CPU, memory-optimized)	All nodes run on the same instance
Latency budget	User-facing latency > 500ms is acceptable; streaming mitigates perception	Need sub-100ms p50; HTTP overhead is unacceptable
Fault isolation	A failing worker should not crash the supervisor	Single-process failure modes are acceptable
Deployment cadence	Sub-agents deploy on different release schedules	Everything deploys together

The deadlock you will not see coming

Do NOT use RemoteGraph to call itself or another graph on the same deployment. This causes deadlocks and resource exhaustion. Each incoming request consumes a worker slot; a call back to the same deployment waits for a worker that is already occupied. In a pool of 4 workers, 4 concurrent requests can deadlock the entire deployment. This is the single most common production incident with RemoteGraph.

Stop at the first YES — only reach "Use RemoteGraph" if all three checks pass

How RemoteGraph Works

RemoteGraph implements the same interface as CompiledGraph — the network boundary is invisible to calling code

Direct Subgraph Embedding

The current recommended pattern for composing graphs with RemoteGraph is direct node embedding: pass the RemoteGraph instance directly to add_node(). This is simpler than the old wrapper-function approach, automatically propagates streaming, and handles state channel mapping without manual marshalling.

Sign in to read this article

This is a premium article. Sign in with your Google account to continue.