Beginner · 5 min
Local Models
Run models locally with Ollama — no API keys, no network calls, no data leaving your machine. The same init_chat_model() interface works; swap the provider prefix and everything else stays identical.
Quick Reference
- Install Ollama: brew install ollama (Mac) or https://ollama.com
- Pull a model: ollama pull llama3.2
- init_chat_model('ollama:llama3.2') — same interface as cloud providers
- ChatOllama for full control: base_url, num_ctx, num_predict, stop sequences
- Tool calling works with capable local models (llama3.2, mistral-nemo, qwen2.5)
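The two entry points above can be sketched as follows. This is a minimal sketch, assuming Ollama is running locally on its default port (11434), the `llama3.2` model has been pulled, and the `langchain` and `langchain-ollama` packages are installed:

```python
from langchain.chat_models import init_chat_model
from langchain_ollama import ChatOllama

# Simple path: the provider-prefixed string routes to Ollama.
# Same call shape as "openai:gpt-4o" or "anthropic:claude-3-5-sonnet".
model = init_chat_model("ollama:llama3.2")
response = model.invoke("Say hello in one short sentence.")
print(response.content)

# Full-control path: ChatOllama exposes Ollama-specific parameters.
tuned = ChatOllama(
    model="llama3.2",
    base_url="http://localhost:11434",  # default; change for remote hosts
    num_ctx=4096,       # context window size in tokens
    num_predict=256,    # max tokens to generate
    stop=["\n\n"],      # stop sequences
    temperature=0.2,
)
print(tuned.invoke("List three uses for local LLMs.").content)
```

Both objects implement the same chat-model interface, so downstream code (chains, agents, `bind_tools`) does not need to know which one it received.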
Why Run Local Models
Local models are useful when data privacy is a hard requirement (no content leaves your machine), when you need offline operation, or when you want to cut API costs during development. Ollama is the simplest way to run models locally — it downloads and manages model weights and exposes an OpenAI-compatible API on localhost.
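Because Ollama's local API is OpenAI-compatible, you can also point any OpenAI client at it directly. A minimal sketch, assuming Ollama is serving on the default port and the `openai` package is installed (the API key is required by the client but ignored by Ollama, so any placeholder string works):

```python
from openai import OpenAI

# Point the standard OpenAI client at Ollama's local endpoint.
client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",  # placeholder; Ollama does not check it
)

response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Why run models locally?"}],
)
print(response.choices[0].message.content)
```

This compatibility is what lets `init_chat_model()` and other OpenAI-style tooling work against local models without code changes beyond the base URL and model name.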