Batch Processing

Process multiple independent inputs in parallel with .batch(). Use batch_as_completed() to stream results as they finish and max_concurrency to control parallelism.

Quick Reference

  • model.batch([input1, input2, input3]) processes all inputs in parallel
  • batch_as_completed() yields results as each finishes — results may arrive out of order
  • Set max_concurrency in config to cap parallel calls
  • batch() is client-side parallelism — distinct from provider batch APIs (OpenAI, Anthropic)
  • batch_as_completed() yields each result with its input index, so you can restore input order (see the sketch below)
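
A minimal sketch of all three in one place, assuming a chat model initialized with init_chat_model (the "gpt-4o-mini" model name is illustrative; any chat model works the same way):

```python
from langchain.chat_models import init_chat_model

model = init_chat_model("gpt-4o-mini")  # illustrative model name

# batch() sends the calls in parallel and returns results in input order;
# max_concurrency caps how many requests are in flight at once.
results = model.batch(
    ["Translate to French: hello", "Translate to French: goodbye"],
    config={"max_concurrency": 5},
)

# batch_as_completed() yields (index, result) tuples as each call finishes,
# in completion order rather than input order.
for index, result in model.batch_as_completed(["What is 2 + 2?", "What is 3 + 3?"]):
    print(index, result.content)
```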

Basic Batch

When you have many independent inputs to process — classifying 100 documents, summarizing 50 articles, translating 200 strings — batch() parallelizes the calls client-side. All requests are sent concurrently and results are returned together once all complete.
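
As a concrete sketch, here is the classification case with three inputs (the model name, documents, and prompt wording are all illustrative):

```python
from langchain.chat_models import init_chat_model

model = init_chat_model("gpt-4o-mini")  # illustrative model name

documents = [
    "The Fed raised interest rates by 25 basis points.",
    "The new phone ships with a 200-megapixel camera.",
    "The home team won 3-1 in extra time.",
]

# One prompt per document; all requests are sent concurrently and the
# returned list lines up with the input order.
responses = model.batch(
    [f"Classify the topic of this text in one word: {doc}" for doc in documents]
)

for doc, response in zip(documents, responses):
    print(f"{response.content}: {doc}")
```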

[Diagram: parallel processing with batch()]
Client-side parallelism only

batch() parallelizes calls from your client: it sends multiple concurrent requests to the provider API. This is distinct from provider batch APIs (such as OpenAI's /v1/batches or Anthropic's Message Batches API), which queue requests for asynchronous server-side processing. Those are typically cheaper but can take hours to complete; batch() returns in seconds.
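
Since each input becomes a live API request, a large batch can run into provider rate limits; capping max_concurrency is the usual mitigation. A sketch (the input count and cap are arbitrary):

```python
from langchain.chat_models import init_chat_model

model = init_chat_model("gpt-4o-mini")  # illustrative model name

prompts = [f"Summarize article #{i} in one sentence." for i in range(100)]

# max_concurrency keeps at most 10 requests in flight at a time,
# trading some speed for staying under rate limits.
summaries = model.batch(prompts, config={"max_concurrency": 10})
```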