Start here

The fastest path is the built-in debug skill: open Claude Code in your NanoClaw folder and run /debug. It knows the two-DB session architecture, the log locations, and how to query the session databases directly.

If you’re diagnosing by hand, check the logs first. The service writes stdout to logs/nanoclaw.log and stderr (warnings and errors) to logs/nanoclaw.error.log:
tail -f logs/nanoclaw.log          # full routing chain: inbound, spawn/exit, delivery
tail -n 50 logs/nanoclaw.error.log # delivery failures, crash-loop backoff, warnings
Set LOG_LEVEL=debug (see Configuration) to also see resolved mount configurations and streamed container stderr tagged with container=<group folder>. Containers run with --rm, so the host log is the only place container output survives.

Next, get the system state from the ncl CLI:
ncl sessions list           # container_status: running | stopped
ncl dropped-messages list   # messages the router or access gate refused
ncl wirings list            # which chats route to which agents, and how they engage

Host won’t start

“NanoClaw stopped: update did not go through the supported path”: the upgrade tripwire fired. The code version doesn’t match the marker in data/upgrade-state.json, which usually means you ran a raw git pull instead of /update-nanoclaw. See Upgrading for the recovery path.

“Circuit breaker: delaying startup due to repeated crashes” in the log: the host crashed repeatedly and is backing off (up to 15 minutes after six consecutive crashes). It will retry on its own; find the original crash in logs/nanoclaw.error.log (look for FATAL Uncaught exception) and fix that instead of restarting in a loop.
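To locate the crash behind a circuit-breaker delay, it can help to print the most recent FATAL Uncaught exception line together with the lines that follow it. This is a minimal sketch, assuming the error log contains that string verbatim; the function name last_fatal and the default of 20 context lines are illustrative choices, not part of NanoClaw.

```shell
# last_fatal [LOG] [CONTEXT]: print the most recent "FATAL Uncaught exception"
# line plus CONTEXT lines after it. Name and defaults are hypothetical.
last_fatal() {
  local log="${1:-logs/nanoclaw.error.log}"
  local context="${2:-20}"
  local line
  # Line number of the last match; empty when there is none.
  line=$(grep -n 'FATAL Uncaught exception' "$log" | tail -n 1 | cut -d: -f1)
  if [ -z "$line" ]; then
    echo "no FATAL Uncaught exception in $log" >&2
    return 1
  fi
  # Print the matching line and the CONTEXT lines that follow it.
  tail -n "+$line" "$log" | head -n "$((context + 1))"
}

# Usage: last_fatal                       # defaults shown above
#        last_fatal logs/nanoclaw.error.log 40
```

In a crash loop the most recent crash is usually the one worth reading first, but earlier entries in the same log may differ.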

“FATAL: Container runtime failed to start”

On startup the host runs docker info. If it fails, NanoClaw prints this banner and exits:
╔════════════════════════════════════════════════════════════════╗
║  FATAL: Container runtime failed to start                      ║
║                                                                ║
║  Agents cannot run without a container runtime. To fix:        ║
║  1. Ensure Docker is installed and running                     ║
║  2. Run: docker info                                           ║
║  3. Restart NanoClaw                                           ║
╚════════════════════════════════════════════════════════════════╝
Do exactly what it says: start Docker (Docker Desktop on macOS, sudo systemctl start docker on Linux), confirm docker info succeeds, then restart NanoClaw.
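Docker can take a few seconds to accept connections after it starts, so polling docker info before restarting NanoClaw avoids hitting the same banner again. The helper below is a sketch only; wait_for, its attempt count, and its delay are hypothetical names and values chosen for illustration.

```shell
# wait_for ATTEMPTS DELAY CMD...: run CMD until it succeeds or ATTEMPTS is
# exhausted, sleeping DELAY seconds between tries. Hypothetical helper.
wait_for() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    # Discard the command's output; only its exit status matters here.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Skip the sleep after the final failed attempt.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Usage: wait_for 30 2 docker info && echo "Docker is up; restart NanoClaw"
```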

Agent never replies

Trace the message through the pipeline, in order:
1. Did the router accept it?

ncl dropped-messages list
Each row aggregates drops per chat, with a reason:
| Reason | Meaning | Fix |
| --- | --- | --- |
| no_agent_wired | No wiring exists for that chat | Create one: ncl wirings create --messaging-group-id <id> --agent-group-id <id> |
| no_agent_engaged | A wiring exists but its engage rules didn’t fire | Check engage_mode: mention needs an @mention (or a DM); pattern matches engage_pattern against each message |
| unknown_sender_strict | Sender not recognized, strict policy | Add the sender, or relax the policy (see Hardening) |
| unknown_sender_request_approval | Sender not recognized; an Allow/Deny card was sent to an approver’s DM | The approver answers the card in their DM; ncl dropped-messages list shows the recorded drop either way |
2. Did a container spawn?

ncl sessions list                      # container_status should be "running"
grep 'Spawning container' logs/nanoclaw.log | tail -5
docker ps --format '{{.Names}} {{.Status}}' | grep nanoclaw
stopped is normal between messages — the sweep restarts the container when due messages arrive. If nothing spawns at all, see Container won’t spawn below.
3. Did the agent produce a reply?

Inspect the session DBs (inbound.db / outbound.db under data/v2-sessions/<group>/<session>/). The /debug skill walks you through the exact queries. A Container exited line with a non-zero code in the host log, plus the streamed stderr above it, is where the failure usually shows.

Container won’t spawn

The first two of these abort the spawn; a rejected mount only degrades it (the container starts without that mount). All three appear in logs/nanoclaw.error.log:
  • OneCLI gateway not applied — refusing to spawn container without credentials — the host can’t wire the credential vault, so it refuses to launch the agent. The message stays pending and the sweep retries every minute; fix the gateway (is OneCLI running on 127.0.0.1:10254?) and the spawn recovers on its own. See Credentials.
  • Egress lockdown errors — with lockdown enabled, the spawn fails fast rather than run with open egress: the "<network>" internal network could not be created or the OneCLI gateway "<container>" could not be attached to "<network>". Check docker network inspect for the egress network and see Hardening.
  • Additional mount REJECTED — a mount from your group’s container config failed allowlist validation. The log line includes the requestedPath and reason. Fix the allowlist — see Hardening.
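To see at a glance which of these three conditions is occurring, the error log can be tallied by message. This is a rough sketch, assuming the messages appear in the log with the wording quoted above; the function name spawn_failure_summary and the category labels are hypothetical.

```shell
# spawn_failure_summary [LOG]: count error-log lines for each spawn problem.
# The patterns are taken from the messages quoted in this section; the
# labels on the left are illustrative names, not NanoClaw identifiers.
spawn_failure_summary() {
  local log="${1:-logs/nanoclaw.error.log}"
  printf 'gateway_not_applied %s\n' \
    "$(grep -c 'OneCLI gateway not applied' "$log")"
  printf 'egress_lockdown %s\n' \
    "$(grep -c -E 'internal network could not be created|could not be attached to' "$log")"
  printf 'mount_rejected %s\n' \
    "$(grep -c 'Additional mount REJECTED' "$log")"
}

# Usage: spawn_failure_summary logs/nanoclaw.error.log
```

A non-zero count in the first two rows points at an aborted spawn; a non-zero mount_rejected count alone means the container started without that mount.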

Replies are slow

  • Cold start: the first reply after a container spawn takes 30–60 seconds while the sandbox warms up. Setup tells you this during its ping test; it’s normal.
  • Scheduled or retried messages wait for the sweep: the host sweep runs every 60 seconds, so a due message for a stopped container can sit up to a minute before the wake fires (Waking container for due messages in the log).
  • Retries back off: a message reset after a crash retries with exponential backoff (5s, 10s, 20s, …). After five tries you’ll see Message marked as failed after max retries — that message is dead; resend it.
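The retry delays listed above follow a doubling pattern that can be written as base × 2^(attempt − 1). The sketch below reproduces only the three documented values; that the doubling continues unchanged for later retries is an assumption, and retry_delay is a hypothetical name.

```shell
# retry_delay ATTEMPT [BASE]: seconds to wait before retry number ATTEMPT,
# assuming each delay doubles the previous one, starting at BASE (default 5).
# Values past the third retry are an assumption, not documented behaviour.
retry_delay() {
  local attempt="$1" base="${2:-5}"
  echo $(( base * (1 << (attempt - 1)) ))
}

# Print the three delays the text lists: 5, 10 and 20 seconds.
for n in 1 2 3; do
  printf 'retry %s waits %ss\n' "$n" "$(retry_delay "$n")"
done
```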

Container killed mid-task

The host sweep kills containers it considers stuck, then resets their in-flight messages to pending:
  • Killing container past absolute ceiling — the container’s heartbeat file went silent for over 30 minutes. The ceiling stretches automatically when the agent declares a longer Bash timeout, so long-running commands aren’t killed as long as they declare their timeout.
  • Killing container — message claimed then silent — the container claimed a message and showed no sign of life for over 60 seconds since the claim (also extended by a declared Bash timeout).
Both paths are self-healing: Reset stale message with backoff follows in the log and the work retries in a fresh container. If the same message keeps killing containers, look at the streamed container stderr (LOG_LEVEL=debug) to see what the agent was doing when it died.

Webhook channel is silent

Webhook-based adapters (Slack, Teams, and similar) register routes on a shared local HTTP server. Confirm it’s up and routed:
grep 'Webhook server started' logs/nanoclaw.log   # shows port + registered adapters
curl -i http://localhost:3000/webhook/slack       # 404 "Unknown adapter" = not registered
The server listens on WEBHOOK_PORT (default 3000). If another process already holds that port, the listen fails and the host crashes with FATAL Uncaught exception — change WEBHOOK_PORT in .env. Channel-specific failures (tokens, tunnel/public URL, platform-side config) are covered on each channel page: Slack, Teams, Channels overview.
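To check for a port conflict, first work out which port the host will try to bind. The sketch below assumes .env holds plain KEY=value lines with no export prefix; webhook_port is a hypothetical helper name.

```shell
# webhook_port [ENV_FILE]: print WEBHOOK_PORT from an env file, or the
# default 3000 when the file or the variable is missing. Hypothetical helper;
# assumes simple KEY=value lines.
webhook_port() {
  local env_file="${1:-.env}" value=""
  if [ -f "$env_file" ]; then
    # Use the last assignment in the file and strip optional quotes.
    value=$(grep -E '^WEBHOOK_PORT=' "$env_file" | tail -n 1 | cut -d= -f2- | tr -d "\"'")
  fi
  echo "${value:-3000}"
}

# Usage (if lsof is installed): show the process listening on that port.
#   lsof -nP -iTCP:"$(webhook_port)" -sTCP:LISTEN
```

If the listener shown is not NanoClaw, pick a free port and set WEBHOOK_PORT accordingly.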

Credential request stuck

If an agent’s API call hangs and then fails, a credential approval card may be waiting. The gateway holds the request open until an admin taps Approve or Reject; with no eligible approver it auto-denies, and unanswered cards expire. Check what’s pending:
ncl approvals list --status pending
The full approval flow — who gets the card, expiry, what happens after a host restart — is in Credentials.

Resetting a session

To wipe a misbehaving session’s conversation state, remove its folder — the host re-provisions both session DBs on the next message:
ncl sessions list                              # find the session and its agent group
rm -rf data/v2-sessions/<group>/<session>/
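Because an empty <group> or <session> would widen that rm -rf to a whole group or to every session, a small guard around the command can be worthwhile. This is a sketch under that assumption; reset_session is a hypothetical wrapper, not an ncl command.

```shell
# reset_session GROUP SESSION [ROOT]: remove exactly one session folder.
# Refuses empty names and names containing "/" or "..", so a typo cannot
# delete more than intended. Hypothetical wrapper around the rm above.
reset_session() {
  local group="$1" session="$2" root="${3:-data/v2-sessions}"
  if [ -z "$group" ] || [ -z "$session" ]; then
    echo "usage: reset_session <group> <session>" >&2
    return 1
  fi
  case "$group$session" in
    */*|*..*) echo "refusing: bad group or session name" >&2; return 1 ;;
  esac
  local dir="$root/$group/$session"
  if [ ! -d "$dir" ]; then
    echo "no such session folder: $dir" >&2
    return 1
  fi
  rm -rf -- "$dir"
}

# Usage: reset_session <group> <session>
```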

Getting help

  • Run /debug in Claude Code — it can read the logs and query the session DBs for you.
  • Join the Discord for help from other users.
  • Open an issue if you’ve found a bug.
Last modified on June 10, 2026