v1-only feature. The /add-voice-transcription skill no longer ships in v2 — it is not present on trunk, the channels branch, or the providers branch. Voice transcription was a v1 fork-only capability tied to WhatsApp via the deprecated nanoclaw-whatsapp fork. This page is pending deletion; the instructions below do not apply to v2 installs.
NanoClaw can transcribe voice messages so the agent understands audio content. Two options are available: cloud-based (OpenAI Whisper API) and fully local (whisper.cpp).
Voice transcription is currently WhatsApp-only. Both skills live on the nanoclaw-whatsapp fork.

Cloud transcription (Whisper API)

The /add-voice-transcription skill uses OpenAI’s Whisper API for transcription. Cost: ~$0.006 per minute of audio.
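At the quoted rate, per-message cost is easy to estimate. A quick back-of-envelope helper (illustrative only; the rate is the ~$0.006/min figure above, and actual OpenAI billing may round differently):

```typescript
// Approximate Whisper API cost at the quoted rate of ~$0.006 per minute.
const RATE_USD_PER_MIN = 0.006;

function transcriptionCostUSD(durationSeconds: number): number {
  return (durationSeconds / 60) * RATE_USD_PER_MIN;
}

// A typical 30-second voice note costs roughly $0.003.
const cost = transcriptionCostUSD(30);
```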

Prerequisites

  • An OpenAI API key (added to .env during configuration below)

Installation

# On your nanoclaw-whatsapp fork
git fetch whatsapp skill/voice-transcription
git merge whatsapp/skill/voice-transcription

Or via Claude Code:

/add-voice-transcription

Configuration

Add your OpenAI API key to .env:
OPENAI_API_KEY=sk-...

How it works

  1. A WhatsApp voice note arrives
  2. The WhatsApp channel auto-downloads the audio file
  3. The audio is sent to OpenAI’s Whisper API
  4. The transcription is injected into the message content before the agent sees it
The agent receives the transcribed text as if the user had typed it — no special handling needed.
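The injection step can be pictured as a small pure transform. This is a hypothetical sketch, not NanoClaw's actual internals: the `IncomingMessage` shape and `injectTranscript` name are illustrative assumptions.

```typescript
// Illustrative only: how a transcript might replace a voice note's body
// so the agent sees plain text. Field names are assumptions.
interface IncomingMessage {
  from: string;
  body: string;          // text content the agent reads
  mediaType?: string;    // e.g. "audio/ogg" for a voice note
}

function injectTranscript(msg: IncomingMessage, transcript: string): IncomingMessage {
  // Swap the empty voice-note body for the transcribed text and clear the
  // media marker, so downstream code treats it like a typed message.
  return { ...msg, body: transcript.trim(), mediaType: undefined };
}
```

Because the substitution happens before the agent runs, prompts and tools need no audio-specific branches.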

Local transcription (whisper.cpp)

The /use-local-whisper skill switches from the cloud API to on-device transcription using whisper.cpp. No API key needed, no cost, fully offline.

Prerequisites

  • Apple Silicon Mac (recommended for performance)
  • Homebrew packages:
    brew install whisper-cpp ffmpeg
    
  • A GGML model file (downloaded during setup)

Installation

# On your nanoclaw-whatsapp fork (requires voice-transcription first)
git fetch whatsapp skill/use-local-whisper
git merge whatsapp/skill/use-local-whisper

Or via Claude Code:

/use-local-whisper

How it works

Same flow as cloud transcription, but audio is processed locally using the whisper.cpp CLI instead of the OpenAI API. The tradeoff is speed — local transcription is slower than the API, especially on longer voice notes, but it’s free and private.
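A typical local pipeline has two steps: convert the voice note to 16 kHz mono WAV (the input format whisper.cpp expects), then invoke the CLI. The sketch below only builds the command lines; the binary name (`whisper-cli`), flags, and file paths are assumptions based on common whisper-cpp usage, not NanoClaw's actual implementation — check your install.

```typescript
// Hypothetical sketch: assemble the ffmpeg + whisper-cpp commands for one
// voice note. Binary names and flags are assumptions; verify locally.
function buildCommands(oggPath: string, modelPath: string): string[][] {
  const wavPath = oggPath.replace(/\.ogg$/, ".wav");
  return [
    // Step 1: transcode to 16 kHz mono WAV, the format whisper.cpp expects.
    ["ffmpeg", "-y", "-i", oggPath, "-ar", "16000", "-ac", "1", wavPath],
    // Step 2: transcribe; -nt suppresses timestamps for plain-text output.
    ["whisper-cli", "-m", modelPath, "-f", wavPath, "-nt"],
  ];
}
```

Spawning these sequentially (e.g. with Node's `child_process`) and capturing stdout from the second command yields the transcript.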

Comparison

                Whisper API            Local whisper.cpp
Cost            ~$0.006/min            Free
Speed           Fast (cloud)           Slower (on-device)
Privacy         Audio sent to OpenAI   Fully local
Requirements    OPENAI_API_KEY         Apple Silicon, whisper-cpp, ffmpeg
Offline         No                     Yes
Last modified on May 2, 2026