
Batch Processing Strategies

How to use the Anthropic Message Batches API for 50% cost savings on non-latency-sensitive workloads. Covers when to batch vs. when NOT to, custom_id correlation, failure handling, and the critical constraint that batch requests cannot do multi-turn tool calling.

Quick Reference

  • Message Batches API: 50% cost savings compared to synchronous API
  • Processing time: up to 24 hours, with no latency SLA
  • Appropriate for: overnight reports, weekly audits, nightly test generation, bulk extraction
  • NOT appropriate for: blocking pre-merge CI checks, real-time user-facing features
  • No multi-turn tool calling in batch requests -- each request is a single-turn exchange
  • Use custom_id to correlate request/response pairs across large batches
  • Handle partial failures: resubmit only failed requests, not the entire batch
  • Refine prompts on a sample set BEFORE batch-processing thousands of items
  • Batch API supports up to 100,000 requests per batch (check current limits)
  • Exam Scenario 5 (CI/CD): batch is WRONG for pre-merge checks but RIGHT for nightly audits
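The custom_id correlation and sample-first workflow above can be sketched in Python. This is a minimal sketch, assuming the `anthropic` SDK's `messages.batches.create` endpoint; the model name, prompt wording, and `doc-{n}` id scheme are illustrative, not prescribed.

```python
# Build batch requests with stable custom_ids so each result can be
# matched back to its input. Model name and prompt are illustrative.

def build_requests(documents, model="claude-sonnet-4-20250514", max_tokens=1024):
    """One single-turn request per document, keyed by a stable custom_id."""
    return [
        {
            "custom_id": f"doc-{i}",  # correlates this request with its result later
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [{"role": "user", "content": f"Summarize:\n\n{text}"}],
            },
        }
        for i, text in enumerate(documents)
    ]

# Submission (requires the anthropic package and ANTHROPIC_API_KEY):
#   import anthropic
#   client = anthropic.Anthropic()
#   batch = client.messages.batches.create(requests=build_requests(docs))
#   print(batch.id, batch.processing_status)
```

Refine the prompt on a small sample via the synchronous API first; only then build and submit the full request list.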

When to Use Batch vs. Synchronous API

The Message Batches API offers a simple tradeoff: you give up latency guarantees in exchange for 50% lower cost. The key question is whether your workflow can tolerate up to 24 hours of processing time.

| Characteristic | Synchronous API | Message Batches API |
| --- | --- | --- |
| Cost | Standard pricing | 50% discount |
| Latency | Seconds (with SLA) | Up to 24 hours (no SLA) |
| Use case | Real-time, interactive, blocking | Background, overnight, non-blocking |
| Tool calling | Multi-turn supported | Single-turn only |
| Rate limits | Standard rate limits apply | Separate batch limits (higher throughput) |
| Result delivery | Immediate in response | Poll for completion or use webhook |
| Max requests | One at a time (parallel up to rate limit) | Up to 100,000 per batch |
| Error handling | Immediate retry | Resubmit failed items after batch completes |

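The "poll for completion" and "resubmit failed items" rows can be sketched as two small helpers. This assumes the SDK's `messages.batches.retrieve` and `messages.batches.results` calls and the documented result types (`succeeded`, `errored`, `canceled`, `expired`); the 60-second polling interval is arbitrary.

```python
import time

def wait_for_batch(client, batch_id, interval=60):
    """Poll until processing ends; there is no latency SLA, so this can take hours."""
    while True:
        batch = client.messages.batches.retrieve(batch_id)
        if batch.processing_status == "ended":
            return batch
        time.sleep(interval)

def partition_results(entries):
    """Split (custom_id, result_type) pairs into succeeded vs. retryable ids."""
    succeeded, retryable = [], []
    for custom_id, result_type in entries:
        # errored / canceled / expired are all candidates for resubmission
        (succeeded if result_type == "succeeded" else retryable).append(custom_id)
    return succeeded, retryable

# Retrieval (requires a live client):
#   wait_for_batch(client, batch.id)
#   entries = [(r.custom_id, r.result.type)
#              for r in client.messages.batches.results(batch.id)]
#   ok, retry = partition_results(entries)
#   ...then resubmit only `retry` in a new batch, not the entire batch.
```

Keeping the helpers free of SDK types makes the failure-handling logic testable without network access.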
Exam trap: batch API for CI/CD pre-merge checks

The exam will present a scenario where a team uses the batch API for pre-merge code review in their CI/CD pipeline, and developers complain about slow merge times. The answer is ALWAYS: switch to synchronous API for pre-merge checks. Batch API has no latency SLA and should never be used for blocking workflows. However, batch IS appropriate for nightly/weekly audit runs of the same codebase.