Model Configuration

Configure temperature, max_tokens, retries, timeouts, and rate limiting when initializing a model. Track token usage across multiple models with UsageMetadataCallbackHandler.

Quick Reference

  • init_chat_model('model-id', temperature=0.7, max_tokens=1000, timeout=30)
  • max_retries=6 by default — increase to 10–15 for long-running tasks on unreliable networks
  • rate_limiter=InMemoryRateLimiter(...) prevents hitting provider rate limits
  • UsageMetadataCallbackHandler tracks token counts across all models in a session (both are sketched after this list)
  • logprobs=True on bind() returns per-token log probabilities (OpenAI only)
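
A minimal sketch of the rate-limiting and usage-tracking bullets above. The model ID and the rate-limit values are illustrative placeholders, not recommendations.

    from langchain.chat_models import init_chat_model
    from langchain_core.callbacks import UsageMetadataCallbackHandler
    from langchain_core.rate_limiters import InMemoryRateLimiter

    # Allow roughly one request every two seconds, polling every 100 ms.
    rate_limiter = InMemoryRateLimiter(
        requests_per_second=0.5,
        check_every_n_seconds=0.1,
        max_bucket_size=10,
    )

    # Placeholder model ID in 'provider:model' shorthand.
    model = init_chat_model("openai:gpt-4o-mini", rate_limiter=rate_limiter)

    # One handler can be shared across calls (and across models) to
    # accumulate token counts keyed by model name.
    callback = UsageMetadataCallbackHandler()
    model.invoke("Hello!", config={"callbacks": [callback]})
    model.invoke("Hello again!", config={"callbacks": [callback]})

    print(callback.usage_metadata)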

Core Parameters

Parameter     Type    Default   Purpose
model         str     required  Model name or 'provider:model' shorthand
temperature   float   varies    Randomness: 0 = deterministic, 1+ = creative
max_tokens    int     varies    Max response length in tokens
timeout       float   None      Seconds before the request is cancelled
max_retries   int     6         Retry attempts on network/rate-limit errors
api_key       str     env var   Auth key, usually set via an environment variable

Configuring a model with init_chat_model
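
A minimal sketch of the core parameters above; the model ID and the specific values are placeholders, not recommendations. Keyword arguments are forwarded to the underlying provider class.

    from langchain.chat_models import init_chat_model

    # The 'provider:model' ID below is a placeholder.
    model = init_chat_model(
        "openai:gpt-4o-mini",
        temperature=0.7,   # moderate randomness
        max_tokens=1000,   # cap the response length
        timeout=30,        # cancel the request after 30 seconds
        max_retries=6,     # the default; see the next section
    )

    response = model.invoke("Summarize LangChain in one sentence.")
    print(response.content)
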
Increase max_retries for long-running agents

The default is 6 retries with exponential backoff. Network errors, 429 rate-limit responses, and 5xx server errors are retried automatically; client errors such as 401 (unauthorized) and 404 (not found) are not. For long-running agent tasks on unreliable networks, raise max_retries to 10–15, as in the sketch below.
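
A short sketch under the same assumptions (placeholder model ID, illustrative values):

    from langchain.chat_models import init_chat_model

    # 12 retries is illustrative, within the 10-15 range suggested
    # above for unreliable networks.
    resilient_model = init_chat_model(
        "openai:gpt-4o-mini",
        max_retries=12,  # up from the default of 6, with exponential backoff
        timeout=60,      # give slow responses more room before cancelling
    )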