Intermediate · 6 min
Batch Processing
Process multiple independent inputs in parallel with .batch(). Use batch_as_completed() to stream results as they finish and max_concurrency to control parallelism.
Quick Reference
- model.batch([input1, input2, input3]) processes all inputs in parallel
- batch_as_completed() yields each result as it finishes; results may arrive out of input order
- Set max_concurrency in the config to cap the number of parallel calls
- batch() is client-side parallelism, distinct from provider batch APIs (OpenAI, Anthropic)
- Each result from batch_as_completed() is paired with its input index, so you can restore the original order
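The out-of-order streaming behavior can be sketched with the standard library: conceptually, batch_as_completed() runs indexed tasks on a thread pool and yields (index, result) pairs as each completes. This is a conceptual model, not LangChain's actual implementation; the slow_double worker and its timings are hypothetical stand-ins for model calls.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def slow_double(x: int) -> int:
    # Stand-in for a model call whose latency varies per input.
    time.sleep(0.05 * x)
    return x * 2

def batch_as_completed_sketch(inputs):
    """Yield (index, result) pairs as each task finishes, in completion order."""
    with ThreadPoolExecutor() as pool:
        # Map each future back to the position of its input.
        futures = {pool.submit(slow_double, x): i for i, x in enumerate(inputs)}
        for fut in as_completed(futures):
            yield futures[fut], fut.result()

# Completion order depends on per-call latency, not input order.
results = list(batch_as_completed_sketch([3, 1, 2]))
# Sorting by the yielded index restores input order.
ordered = [r for _, r in sorted(results)]
```

Sorting by the paired index is the same reordering trick the last bullet above describes.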
Basic Batch
When you have many independent inputs to process — classifying 100 documents, summarizing 50 articles, translating 200 strings — batch() parallelizes the calls client-side. All requests are sent concurrently and results are returned together once all complete.
Parallel processing with batch()
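As a minimal sketch of the client-side behavior, sync batching amounts to fanning calls out over a thread pool and collecting results in input order. The invoke() function here is a hypothetical stand-in for a single model call, and batch_sketch is an illustrative name, not LangChain's API:

```python
from concurrent.futures import ThreadPoolExecutor

def invoke(prompt: str) -> str:
    # Stand-in for one model call (e.g. model.invoke(prompt)).
    return prompt.upper()

def batch_sketch(inputs, max_concurrency=None):
    """Fan invoke() out across a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # map() runs the calls concurrently but returns results
        # in the same order as the inputs.
        return list(pool.map(invoke, inputs))

summaries = batch_sketch(["classify this", "summarize that", "translate me"])
```

Because results come back in input order, the caller never has to track which output belongs to which document.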
Client-side parallelism only
batch() parallelizes calls from your client: it sends multiple concurrent requests to the provider API. This is different from provider batch APIs (like OpenAI's /v1/batches or Anthropic's Batch API), which queue requests for asynchronous server-side processing. Those are cheaper but can take hours to complete; batch() returns in seconds.
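Because every request in a client-side batch hits the provider's live API, capping concurrency matters for rate limits. The cap can be sketched as a bound on thread-pool workers, which is conceptually what a max_concurrency setting does; the instrumentation below (tracked_call, the peak counter) is illustrative, not part of any library:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

peak = 0    # highest number of calls observed in flight at once
active = 0  # calls currently in flight
lock = threading.Lock()

def tracked_call(x):
    # Stand-in for a model call; records how many run concurrently.
    global peak, active
    with lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.02)  # simulate network latency
    with lock:
        active -= 1
    return x

def batch_with_cap(inputs, max_concurrency):
    # max_workers bounds in-flight requests, analogous to a
    # max_concurrency setting in the batch config.
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(tracked_call, inputs))

batch_with_cap(list(range(10)), max_concurrency=3)
# peak is now at most 3: never more than 3 calls ran at once
```

A lower cap trades total wall-clock time for staying under the provider's rate limits.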