How it works
The `/add-ollama-tool` skill adds a stdio-based MCP server inside the agent container. The MCP server exposes two tools:
| Tool | Description |
|---|---|
| `ollama_list_models` | Lists all locally installed Ollama models |
| `ollama_generate` | Sends a prompt to a specified model and returns the response |
From inside the container, the server reaches Ollama on the host at `host.docker.internal:11434`. Claude decides when to use local models based on task complexity; you don't need to configure routing rules.
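Under the hood, `ollama_generate` presumably maps onto Ollama's HTTP `/api/generate` endpoint. A minimal sketch of how the bridge might build such a request (the endpoint shape is Ollama's documented API; the helper name and the `OLLAMA_HOST` default are assumptions, not code from `ollama-mcp-stdio.ts`):

```typescript
// Sketch: construct a non-streaming request for Ollama's /api/generate endpoint.
// OLLAMA_HOST falls back to the address the container uses to reach the host.
const OLLAMA_HOST = process.env.OLLAMA_HOST ?? "http://host.docker.internal:11434";

interface GenerateRequest {
  url: string;
  body: { model: string; prompt: string; stream: boolean };
}

// Hypothetical helper; the real bridge lives in container/agent-runner/src/.
function buildGenerateRequest(model: string, prompt: string): GenerateRequest {
  return {
    url: `${OLLAMA_HOST}/api/generate`,
    // stream: false asks Ollama for a single JSON response instead of chunks
    body: { model, prompt, stream: false },
  };
}

const req = buildGenerateRequest("llama3.2", "Say hello");
console.log(req.url); // e.g. http://host.docker.internal:11434/api/generate
```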
Prerequisites
Install Ollama and pull at least one model (for example, `ollama pull llama3.2`).

Installation
Apply the skill in Claude Code (`/add-ollama-tool`). It adds:

- `container/agent-runner/src/ollama-mcp-stdio.ts` — MCP server that bridges to Ollama
- `scripts/ollama-watch.sh` — macOS notification watcher for Ollama status
- Ollama MCP configuration in the agent runner
- `[OLLAMA]` log surfacing in container output
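For orientation, the MCP tool listing for the two tools might look like the following. This is a sketch: the names and descriptions come from the table above, and the `name`/`description`/`inputSchema` shape is MCP's standard tool-listing format, but the parameter names (`model`, `prompt`) are assumptions about this skill's implementation.

```typescript
// Sketch of MCP tool descriptors, as a tools/list response might carry them.
const tools = [
  {
    name: "ollama_list_models",
    description: "Lists all locally installed Ollama models",
    inputSchema: { type: "object", properties: {} }, // no arguments
  },
  {
    name: "ollama_generate",
    description: "Sends a prompt to a specified model and returns the response",
    inputSchema: {
      type: "object",
      properties: {
        model: { type: "string", description: "Ollama model name, e.g. llama3.2" },
        prompt: { type: "string", description: "Prompt to send to the model" },
      },
      required: ["model", "prompt"],
    },
  },
];

console.log(tools.map((t) => t.name).join(", "));
```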
Configuration
Set `OLLAMA_HOST` in `.env` if Ollama runs on a non-default address. Fall back to `localhost` if `host.docker.internal` fails.
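For example, a `.env` entry pointing at the address mentioned above might look like this (a sketch; check the Configuration reference for the exact expected format):

```
# Reach Ollama on the host from inside the agent container
OLLAMA_HOST=host.docker.internal:11434

# Fallback if host.docker.internal does not resolve
# OLLAMA_HOST=localhost:11434
```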
Ollama must be running on the host before starting NanoClaw. The MCP server writes status to `/workspace/ipc/ollama_status.json` so the host process can surface connection issues in logs.

Usage
Once installed, Claude can use local models transparently. For example:

“Summarize this document using a local model”

Claude will call `ollama_list_models` to see the available models, then `ollama_generate` with the appropriate prompt. You can also be explicit about which model to use:
“Use llama3.2 to translate this to Spanish”
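`ollama_list_models` most likely wraps Ollama's `GET /api/tags` endpoint, which returns the locally installed models. A sketch of extracting model names from that response (the endpoint and its `models[].name` shape are Ollama's documented API; the helper name and sample model list are illustrative assumptions):

```typescript
// Sketch: pull model names out of an Ollama GET /api/tags response body.
interface TagsResponse {
  models: { name: string }[];
}

function listModelNames(resp: TagsResponse): string[] {
  return resp.models.map((m) => m.name);
}

// Example payload in the shape /api/tags returns (models here are made up)
const sample: TagsResponse = {
  models: [{ name: "llama3.2:latest" }, { name: "qwen2.5-coder:7b" }],
};

console.log(listModelNames(sample)); // [ 'llama3.2:latest', 'qwen2.5-coder:7b' ]
```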
Third-party model endpoints
Independently of Ollama, NanoClaw supports any Anthropic API-compatible endpoint, configured via variables in `.env`. This enables:
- Open-source models on Together AI, Fireworks, etc.
- Custom model deployments with Anthropic-compatible APIs
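As an illustration only, such an endpoint is typically configured with a base URL and an API key. The variable names and values below are assumptions, not confirmed names from NanoClaw's configuration reference:

```
# Hypothetical example — see the Configuration page for the actual variable names
ANTHROPIC_BASE_URL=https://your-endpoint.example.com
ANTHROPIC_API_KEY=sk-...
```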
Related pages
- Skills system — How skills work
- Configuration — Environment variables reference
- Container runtime — How agent containers work