
OpenAI Function Calling & Responses API

OpenAI's native function calling lets the model invoke your code with structured arguments — no framework required. This article covers when to drop the abstraction layer, how to use the Responses API (the recommended path for new projects), and what breaks in production when tool loops run unchecked.

Quick Reference

→Use `client.responses.create()` for new projects — the Responses API is OpenAI's recommended path as of 2025
→Assistants API is deprecated: announced August 2025, sunset August 26, 2026 — do not build new projects on it
→Responses API tool schema is flatter: `{type: 'function', name: '...', description: '...', parameters: {...}}` — no nested `function` wrapper
→Tool calls appear as items in `response.output` with `item.type == 'function_call'`; send each result back as a `function_call_output` item keyed by `call_id`, not `tool_call_id`
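→Structured outputs with Pydantic: use `client.responses.parse()` with `text_format=YourModel` on the Responses API, or `client.chat.completions.parse()` on Chat Completions (`client.beta.chat.completions.parse()` in older SDK versions); either way you get a typed response, no `json.loads` needed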
→Always cap your tool loop with a `max_iterations` counter in your own code (the API does not enforce one); the model can request tools indefinitely on ambiguous tasks
→Tool schemas cost tokens on every request: two 150-token schemas at 10K requests/day is 3M extra input tokens/day, or ≈ $7.50/day at GPT-5.4 input pricing
→Use native API when you have one provider and a simple tool loop; reach for a framework when you need checkpointing, multi-provider routing, or human-in-the-loop
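
The schema-cost figure in the list above is plain arithmetic, and it is worth rerunning with your own numbers. Here is a minimal sketch; `daily_schema_cost` is a hypothetical helper, and the 2.50 USD per million input tokens rate is an assumption chosen because it is the rate the $7.50/day figure implies, not a quoted price. Check current pricing before relying on it.

```python
def daily_schema_cost(
    schema_tokens: int,
    schemas_per_request: int,
    requests_per_day: int,
    usd_per_million_input: float,
) -> float:
    """Daily input-token cost of resending tool schemas with every request."""
    tokens_per_day = schema_tokens * schemas_per_request * requests_per_day
    return tokens_per_day / 1_000_000 * usd_per_million_input


# Assumed rate: 2.50 USD per million input tokens (implied by the figure above).
cost = daily_schema_cost(
    schema_tokens=150,
    schemas_per_request=2,
    requests_per_day=10_000,
    usd_per_million_input=2.50,
)
print(f"${cost:.2f}/day")  # prints "$7.50/day"
```

The cost scales linearly with each input, so trimming verbose descriptions or sending only the tools relevant to a request reduces it proportionally.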

When to Go Native

Every LangChain agent that targets an OpenAI model ends up calling this same function-calling API. The question is whether the layers between you and the raw API are paying for themselves.

▸Go native when you use a single provider and your tool loop fits in under 100 lines — LangChain's ChatOpenAI → bind_tools → AgentExecutor adds three layers you don't need
▸Go native when you need maximum control over token usage, retry logic, and error handling — frameworks abstract the exact points you need to instrument in production
▸Go native when you want to understand the underlying protocol — you'll debug framework agents much faster once you've seen what they're wrapping
▸Use a framework when you need to switch providers (Claude → GPT → Gemini) without rewriting tool definitions — this is LangChain's actual value proposition
▸Use LangGraph specifically when you need persistent state between turns, human-in-the-loop interrupts, or branching agent workflows — these are genuinely hard to build from scratch
▸Use LangSmith when you need distributed tracing across an entire pipeline — the native SDK has no equivalent

The right test

Can you implement this tool loop in under 100 lines using just the OpenAI SDK? If yes, do it. The overhead of a framework only pays off when you need cross-provider compatibility, persistent state, or built-in observability — not for a simple search-and-respond agent.
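
To make the test concrete, here is one way such a loop might look. It is a sketch under stated assumptions, not a reference implementation: `run_tool_loop`, `handlers`, and `max_iterations` are hypothetical names, `client` is expected to be an `openai.OpenAI()` instance, and turns are chained with `previous_response_id`, which assumes responses are stored server-side (the default).

```python
import json
from typing import Any, Callable


def run_tool_loop(
    client: Any,
    model: str,
    user_input: str,
    tools: list[dict],
    handlers: dict[str, Callable[..., Any]],
    max_iterations: int = 5,
) -> str:
    """Run a Responses API tool loop, giving up after max_iterations model calls."""
    request: dict[str, Any] = {"input": user_input}
    for _ in range(max_iterations):
        response = client.responses.create(model=model, tools=tools, **request)

        # Tool requests are output items whose type is "function_call".
        calls = [item for item in response.output if item.type == "function_call"]
        if not calls:
            return response.output_text  # no tool requested: final answer

        outputs = []
        for call in calls:
            try:
                args = json.loads(call.arguments)  # arguments arrive as a JSON string
                result = handlers[call.name](**args)
            except Exception as exc:
                # Report the failure to the model instead of crashing the loop.
                result = {"error": str(exc)}
            outputs.append(
                {
                    "type": "function_call_output",
                    "call_id": call.call_id,  # call_id, not tool_call_id
                    "output": json.dumps(result),
                }
            )

        # Continue the same conversation, sending only the tool results.
        request = {"previous_response_id": response.id, "input": outputs}

    raise RuntimeError(f"tool loop did not finish within {max_iterations} iterations")
```

Calling it looks like `run_tool_loop(OpenAI(), model_name, "your question", tools, {"search": search})`, where `model_name`, `tools`, and `search` are yours to supply. The whole loop is about 40 lines, which leaves room for retries and logging before you approach the 100-line threshold.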

OpenAI's API Landscape (2026)

OpenAI currently has three API surfaces for building agents: Chat Completions, the Responses API, and the deprecated Assistants API. Knowing which one to use, and which to avoid, is the first decision you make when going native.

Function Calling with the Responses API

The Responses API differs from Chat Completions in two important ways: the tool schema is flatter (no nested `function` wrapper), and tool results use `call_id` instead of `tool_call_id`. These are small changes, but mixing them up silently breaks the loop.
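
The two differences are easiest to see side by side. The sketch below converts a Chat Completions tool definition into the flatter Responses shape and builds a tool-result item; `to_responses_tool`, `tool_result_item`, and the `get_weather` schema are hypothetical names used for illustration only.

```python
import json
from typing import Any

# Chat Completions style: the definition is nested under a "function" key.
CHAT_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical example tool
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def to_responses_tool(chat_tool: dict) -> dict:
    """Flatten a Chat Completions tool definition into the Responses API shape."""
    fn = chat_tool["function"]
    flat = {
        "type": "function",
        "name": fn["name"],
        "description": fn.get("description", ""),
        "parameters": fn.get("parameters", {"type": "object", "properties": {}}),
    }
    if "strict" in fn:
        flat["strict"] = fn["strict"]  # carry strict mode over when present
    return flat


def tool_result_item(call_id: str, result: Any) -> dict:
    """Build the input item that returns a tool result to the Responses API."""
    return {
        "type": "function_call_output",
        "call_id": call_id,  # Chat Completions uses tool_call_id on a "tool" message
        "output": json.dumps(result),
    }


RESPONSES_TOOL = to_responses_tool(CHAT_TOOL)
```

Here `RESPONSES_TOOL` has `name`, `description`, and `parameters` at the top level, next to `type`. A converter like this is mainly useful when migrating existing Chat Completions code; new projects can write the flat form directly.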
