
whisper-mlx-local – OpenClaw Skill

whisper-mlx-local is an OpenClaw Skills integration for coding workflows. Free local speech-to-text for Telegram and WhatsApp using MLX Whisper on Apple Silicon. Private, no API costs.

5.4k stars · 1.3k forks · Security L1
Updated Feb 7, 2026 · Created Feb 7, 2026 · Topic: coding

Skill Snapshot

name          whisper-mlx-local
description   Free local speech-to-text for Telegram and WhatsApp using MLX Whisper on Apple Silicon. Private, no API costs. OpenClaw Skills integration.
owner         impkind
repository    impkind/whisper-mlx-local
language      Markdown
license       MIT
topics        coding
security      L1
install       openclaw add @impkind/whisper-mlx-local
last updated  Feb 7, 2026

Maintainer

impkind

Maintains whisper-mlx-local in the OpenClaw Skills directory.
File Explorer — 10 files

.
├── scripts/
│   ├── daemon.py            9.6 KB
│   ├── transcribe_large.sh  1.2 KB
│   ├── transcribe.sh        3.5 KB
│   ├── transcriber_cli.py   1.9 KB
│   └── transcriber.py       11.5 KB
├── _meta.json               461 B
├── README.md                1.2 KB
├── requirements.txt         461 B
└── SKILL.md                 2.9 KB

SKILL.md

name: whisper-mlx-local
description: "Free local speech-to-text for Telegram and WhatsApp using MLX Whisper on Apple Silicon. Private, no API costs."
metadata:
  openclaw:
    emoji: "🎤"
    version: "1.5.0"
    author: "Community"
    repo: "https://github.com/ImpKind/local-whisper"
    requires:
      os: ["darwin"]
      arch: ["arm64"]
      bins: ["python3"]
    install:
      - id: "deps"
        kind: "manual"
        label: "Install dependencies"
        instructions: "pip3 install -r requirements.txt"

Local Whisper

Transcribe voice messages for free on Telegram and WhatsApp. No API keys. No costs. Runs on your Mac.

The Problem

Voice transcription APIs cost money:

  • OpenAI Whisper: $0.006/minute
  • Groq: $0.001/minute
  • AssemblyAI: $0.01/minute

If you transcribe a lot of Telegram voice messages, it adds up.

The Solution

This skill runs Whisper locally on your Mac. Same quality, zero cost.

  • ✅ Free forever
  • ✅ Private (audio never leaves your Mac)
  • ✅ Fast (~1 second per message)
  • ✅ Works offline

⚠️ Important Notes

  • First run downloads ~1.5GB model — be patient, this only happens once
  • First transcription is slow — model loads into memory (~10-30 seconds), then it's instant
  • Already using OpenAI API for transcription? Replace your existing tools.media.audio config with the one below

Quick Start

1. Install dependencies

pip3 install -r requirements.txt

2. Start the daemon

python3 scripts/daemon.py

First run will download the Whisper model (~1.5GB). Wait for "Ready" message.
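The daemon's internals aren't shown on this page, but its request handling likely boils down to: receive uploaded audio bytes, write them to a temp file, run the model, and return JSON. A minimal sketch of that core step — `handle_transcribe` and the injectable `transcribe_fn` hook are hypothetical names, and the real `scripts/daemon.py` may differ:

```python
import json
import os
import tempfile


def handle_transcribe(audio_bytes, transcribe_fn):
    """Write uploaded audio to a temp file, transcribe it, return a JSON string.

    transcribe_fn stands in for the actual model call, e.g.
    mlx_whisper.transcribe(path); it is expected to return a dict
    with at least a "text" key.
    """
    with tempfile.NamedTemporaryFile(suffix=".ogg", delete=False) as f:
        f.write(audio_bytes)
        path = f.name
    try:
        result = transcribe_fn(path)
        return json.dumps({
            "text": result["text"].strip(),
            "language": result.get("language", "en"),
        })
    finally:
        os.unlink(path)  # never leave audio lying around in /tmp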

3. Add to OpenClaw config

Add this to your ~/.openclaw/openclaw.json:

{
  "tools": {
    "media": {
      "audio": {
        "enabled": true,
        "models": [
          {
            "type": "cli",
            "command": "~/.openclaw/workspace/skills/local-whisper/scripts/transcribe.sh",
            "args": ["{{MediaPath}}"],
            "timeoutSeconds": 60
          }
        ]
      }
    }
  }
}
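If transcription silently stops working, a misplaced key or a bad `command` path in this snippet is the usual culprit. A quick sanity check you can run against your config file — the key layout below is taken from this README, not from any official OpenClaw schema, and `check_audio_config` is a name of my choosing:

```python
import os


def check_audio_config(config: dict) -> list:
    """Return a list of problems found in the tools.media.audio section."""
    problems = []
    audio = config.get("tools", {}).get("media", {}).get("audio", {})
    if not audio.get("enabled"):
        problems.append("tools.media.audio.enabled is not true")
    models = audio.get("models", [])
    if not models:
        problems.append("no models configured")
    for m in models:
        # The config uses "~", which must be expanded before checking.
        cmd = os.path.expanduser(m.get("command", ""))
        if not os.path.isfile(cmd):
            problems.append("command not found: " + cmd)
    return problems
```

Load `~/.openclaw/openclaw.json` with `json.load` and pass the resulting dict in; an empty list means the section looks structurally sound.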

4. Restart gateway

openclaw gateway restart

Now voice messages from Telegram, WhatsApp, etc. will be transcribed locally for free!

Manual test

./scripts/transcribe.sh voice_message.ogg

Use Case: Telegram Voice Messages

Instead of paying for OpenAI API to transcribe incoming voice messages, point OpenClaw to this local daemon. Free transcription forever.

Auto-Start on Login

cp com.local-whisper.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.local-whisper.plist
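The `com.local-whisper.plist` file does not appear in the file listing above, so its exact contents are unknown; a typical LaunchAgent for this daemon would look roughly like the following. The paths are placeholders — point `ProgramArguments` at wherever you cloned the repo:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.local-whisper</string>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/bin/python3</string>
    <string>/Users/you/local-whisper/scripts/daemon.py</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
  <key>KeepAlive</key>
  <true/>
</dict>
</plist>
```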

API

Daemon runs at localhost:8787:

curl -X POST http://localhost:8787/transcribe -F "file=@audio.ogg"
# {"text": "Hello world", "language": "en"}
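The curl call above can also be done from stdlib Python, which is handy for scripting. The sketch below hand-builds the `multipart/form-data` body that curl's `-F` flag produces; `build_multipart` and `transcribe` are names of my choosing, and the endpoint and `file` field name are taken from the curl example:

```python
import json
import mimetypes
import urllib.request
import uuid


def build_multipart(field: str, filename: str, payload: bytes):
    """Build a multipart/form-data body equivalent to curl's -F flag."""
    boundary = uuid.uuid4().hex
    ctype = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        "--" + boundary + "\r\n"
        'Content-Disposition: form-data; name="' + field + '"; '
        'filename="' + filename + '"\r\n'
        "Content-Type: " + ctype + "\r\n\r\n"
    ).encode()
    tail = ("\r\n--" + boundary + "--\r\n").encode()
    return head + payload + tail, "multipart/form-data; boundary=" + boundary


def transcribe(path: str, url: str = "http://localhost:8787/transcribe") -> dict:
    """POST an audio file to the local daemon and return the parsed JSON."""
    with open(path, "rb") as f:
        body, content_type = build_multipart("file", path.rsplit("/", 1)[-1], f.read())
    req = urllib.request.Request(url, data=body, headers={"Content-Type": content_type})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```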

Translation

Any language → English:

./scripts/transcribe.sh spanish_audio.ogg --translate

Requirements

  • macOS with Apple Silicon (M1/M2/M3/M4)
  • Python 3.9+

License

MIT

README.md

🎤 Local Whisper


Transcribe voice messages for free. Runs locally on your Mac.

Why?

Voice transcription APIs charge per minute. This skill does the same thing for free — runs Whisper directly on your Mac's Apple Silicon.

  • Free — no API costs, ever
  • Private — audio stays on your machine
  • Fast — ~1 second per voice message
  • Offline — works without internet

Install

git clone https://github.com/ImpKind/local-whisper
cd local-whisper
pip3 install -r requirements.txt

Use

# Start daemon (keeps model loaded)
python3 scripts/daemon.py

# Transcribe
./scripts/transcribe.sh voice_message.ogg

Auto-Start

cp com.local-whisper.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.local-whisper.plist

Requirements

  • macOS with Apple Silicon (M1/M2/M3/M4)
  • Python 3.9+

License

MIT

Permissions & Security

Security level L1: Low-risk skills with minimal permissions. Review inputs and outputs before running in production.

Requirements

  • macOS with Apple Silicon (M1/M2/M3/M4)
  • Python 3.9+

FAQ

How do I install whisper-mlx-local?

Run openclaw add @impkind/whisper-mlx-local in your terminal. This installs whisper-mlx-local into your OpenClaw Skills catalog.

Does this skill run locally or in the cloud?

OpenClaw Skills execute locally by default. Review the SKILL.md and permissions before running any skill.

Where can I verify the source code?

The source repository is available at https://github.com/openclaw/skills/tree/main/skills/impkind/whisper-mlx-local. Review commits and README documentation before installing.