Advanced18 min

Building Production MCP Servers

Build, version, deploy, and monitor production MCP servers with both the TypeScript SDK and FastMCP. Covers the build-vs-buy decision, schema versioning, deployment cost math, the gateway pattern for multi-server architectures, and three-tier health monitoring — because only 9% of remote MCP endpoints are fully healthy in the wild.

Quick Reference

→Build custom MCP servers only for internal APIs with no community equivalent — 10,000+ public servers already exist
→TypeScript SDK (@modelcontextprotocol/sdk v2): reference implementation, Express/Hono middleware support
→Python SDK (FastMCP v2): Pydantic schemas, one-line mcp.run(transport='streamable-http')
→Tool descriptions must start with a verb — the model uses the first word for tool identification
→Schema versioning: additive changes are safe; renames/removals need major bumps + migration windows
→Gateway pattern: mandatory at 3+ servers — single endpoint, centralized health, auto-namespacing
→Three-tier health: Liveness (HTTP 200) + Readiness (tools/list) + Operational (tool call succeeds)
→Dynamic tool loading: agents degrade past 7–10 tools — expose only task-relevant tools per session

Should You Build an MCP Server?

Before writing a single line of server code, spend two minutes on this decision. As of April 2026, over 10,000 public MCP servers exist — GitHub, Slack, Postgres, Stripe, Jira, and dozens of other standard integrations are maintained by their vendors or active communities. Building a custom server for one of these is waste. Build only when no public server covers your use case, or when you need to expose internal APIs that can never be public.

Build-vs-buy decision tree for MCP servers

Scenario	Decision	Reason
Internal API (no public server exists)	Build	No alternative — your business logic lives here
GitHub, Slack, Postgres, Stripe	Use community server	Vendor-maintained, tested against real API changes
Single agent needs one tool	Use @tool decorator	MCP adds 10–50ms latency and a process boundary — justify it with reuse
Community server exists but needs custom auth/rate-limit	Wrap with interceptors	Client-side interceptors (see mcp-interceptors article) are cheaper than a new server
3+ agents need the same internal data source	Build	The reuse across agents justifies the maintenance cost

The single-agent rule

If a tool is used by exactly one agent and needs LangGraph State access, skip MCP entirely. Use a @tool decorator. MCP adds a process boundary, a network hop, and 10–50ms per call. That overhead is justified when multiple agents or clients share the server — not before.

Choosing Your SDK: TypeScript vs Python

The official MCP v2 spec (released March 2026) has SDKs in TypeScript, Python, Java, and C#. For most teams the choice is between TypeScript (the reference implementation) and Python (FastMCP, which adds a decorator API and Pydantic schemas). Pick based on what your team already deploys — both SDKs are production-grade and both support Streamable HTTP transport.

Building a Production Server

Three principles separate a production MCP server from a demo: (1) tool descriptions start with a verb and state what the tool does NOT do, (2) errors return structured JSON with isError: true plus an errorCategory, and (3) resources carry reference data so tools stay action-oriented. Get these right before adding health checks — a server with good tool descriptions fails predictably; one with vague descriptions fails mysteriously.

Sign in to read this article

This is a premium article. Sign in with your Google account to continue.