by stolot0mt0m

voice-reply – OpenClaw Skill

voice-reply is an OpenClaw Skills integration for AI/ML workflows.

5.6k stars | 1.8k forks | Security L1
Updated Feb 7, 2026 | Created Feb 7, 2026 | ai ml

Skill Snapshot

name: voice-reply
description: OpenClaw Skills integration.
owner: stolot0mt0m
repository: stolot0mt0m/voice-reply
language: Markdown
license: MIT
topics: (none listed)
security: L1
install: openclaw add @stolot0mt0m/voice-reply
last updated: Feb 7, 2026

Maintainer

stolot0mt0m maintains voice-reply in the OpenClaw Skills directory.

File Explorer (5 files)

  • scripts/install.sh (3.9 KB)
  • _meta.json (279 B)
  • README.md (1.9 KB)
  • SKILL.md (4.4 KB)

SKILL.md

name: voice-reply
version: 1.0.0
description: |
  Local text-to-speech using Piper voices via sherpa-onnx. 100% offline, no API keys required.
  Use when user asks for a voice reply, audio response, spoken answer, or wants to hear something read aloud.
  Supports multiple languages including German (thorsten) and English (ryan) voices.
  Outputs Telegram-compatible voice notes with [[audio_as_voice]] tag.
metadata:
  openclaw:
    emoji: "🎤"
    os: ["linux"]
    requires:
      bins: ["ffmpeg"]
      env: ["SHERPA_ONNX_DIR", "PIPER_VOICES_DIR"]

Voice Reply

Generate voice audio replies using local Piper TTS via sherpa-onnx. Completely offline, no cloud APIs needed.

Features

  • 100% Local - No internet connection required after setup
  • No API Keys - Free to use, no accounts needed
  • Multi-language - German and English voices included
  • Telegram Ready - Outputs voice notes that display as bubbles
  • Auto-detect Language - Automatically selects voice based on text

Prerequisites

  1. sherpa-onnx runtime installed
  2. Piper voice models downloaded
  3. ffmpeg for audio conversion

Installation

Quick Install

cd scripts
sudo ./install.sh

Manual Installation

1. Install sherpa-onnx

sudo mkdir -p /opt/sherpa-onnx
cd /opt/sherpa-onnx
# The directory is root-owned, so the download and cleanup need sudo as well
sudo curl -L -o sherpa.tar.bz2 "https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.12.23/sherpa-onnx-v1.12.23-linux-x64-shared.tar.bz2"
sudo tar -xjf sherpa.tar.bz2 --strip-components=1
sudo rm sherpa.tar.bz2

2. Download Voice Models

sudo mkdir -p /opt/piper-voices
cd /opt/piper-voices

# German - thorsten (medium quality, natural male voice)
sudo curl -L -o thorsten.tar.bz2 "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-thorsten-medium.tar.bz2"
sudo tar -xjf thorsten.tar.bz2 && sudo rm thorsten.tar.bz2

# English - ryan (high quality, clear US male voice)
sudo curl -L -o ryan.tar.bz2 "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-ryan-high.tar.bz2"
sudo tar -xjf ryan.tar.bz2 && sudo rm ryan.tar.bz2

3. Install ffmpeg

sudo apt install -y ffmpeg

4. Set Environment Variables

Add to your OpenClaw service or shell:

export SHERPA_ONNX_DIR="/opt/sherpa-onnx"
export PIPER_VOICES_DIR="/opt/piper-voices"

Usage

{baseDir}/bin/voice-reply "Text to speak" [language]

Parameters

| Parameter | Description | Default |
| --- | --- | --- |
| text | The text to convert to speech | (required) |
| language | de for German, en for English | auto-detect |

Examples

# German (explicit)
{baseDir}/bin/voice-reply "Hallo, ich bin dein Assistent!" de

# English (explicit)
{baseDir}/bin/voice-reply "Hello, I am your assistant!" en

# Auto-detect (detects German from umlauts and common words)
{baseDir}/bin/voice-reply "Guten Tag, wie geht es dir?"

# Auto-detect (defaults to English)
{baseDir}/bin/voice-reply "The weather is nice today."
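
The auto-detection described in the comments above (umlauts plus common words, English as the fallback) could look roughly like the following. This is a minimal sketch, not the skill's actual code: the function name detect_lang and the word list are assumptions made for illustration.

```shell
# Hypothetical helper: print "de" or "en" for the given text.
# The word list is illustrative and not taken from the skill's script.
detect_lang() {
    text=$1

    # German-specific characters are treated as a strong signal.
    case $text in
        *ä*|*ö*|*ü*|*Ä*|*Ö*|*Ü*|*ß*) echo "de"; return ;;
    esac

    # Otherwise look for a few common German words.
    # Lowercase the text and turn everything except a-z into spaces.
    words=$(printf '%s' "$text" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z' ' ')
    for w in $words; do
        case $w in
            der|das|und|ich|ist|nicht|wie|geht|guten|hallo|danke|bitte)
                echo "de"; return ;;
        esac
    done

    # No German signal found: default to English.
    echo "en"
}
```

A heuristic like this misclassifies short German phrases that contain neither umlauts nor a listed word, which is one reason to pass the language explicitly when it is known.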

Output Format

The script outputs two lines that OpenClaw processes for Telegram:

[[audio_as_voice]]
MEDIA:/tmp/voice-reply-output.ogg

  • [[audio_as_voice]] - Tag that tells Telegram to display the audio as a voice bubble
  • MEDIA:path - Path to the generated OGG Opus audio file
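
To make the two-line contract concrete, here is a small sketch that prints the lines and checks that a reply has the expected shape. The function names emit_voice_note and is_voice_note are hypothetical, and the check is stricter than the description above: it assumes the tag is line 1 and the MEDIA: path is line 2.

```shell
# Hypothetical helper: print the two-line reply for an audio file path.
emit_voice_note() {
    printf '%s\n' '[[audio_as_voice]]' "MEDIA:$1"
}

# Hypothetical check: succeed only if stdin has the tag on line 1
# and a non-empty MEDIA: path on line 2.
is_voice_note() {
    IFS= read -r first || return 1
    IFS= read -r second || return 1
    [ "$first" = '[[audio_as_voice]]' ] || return 1
    case $second in
        MEDIA:?*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `emit_voice_note /tmp/voice-reply-output.ogg | is_voice_note` succeeds, while a reply that starts with the MEDIA: line fails the check.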

Available Voices

| Language | Voice | Quality | Description |
| --- | --- | --- | --- |
| German (de) | thorsten | medium | Natural male voice, clear pronunciation |
| English (en) | ryan | high | Clear US male voice, professional tone |

Adding More Voices

Browse the available voices at Piper Samples, download and extract one to $PIPER_VOICES_DIR, then modify the script to include the new voice.
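
One way to structure that modification is a single lookup from language code to model directory. The sketch below is an assumption about how such a mapping might look, not the skill's actual script: voice_dir_for is a hypothetical name, the directory names assume each archive extracts into a folder named after its tarball, and the fr entry is purely illustrative.

```shell
# Hypothetical lookup: print the model directory for a language code.
# Directory names assume each archive extracts to a folder named like its tarball.
voice_dir_for() {
    base=${PIPER_VOICES_DIR:-/opt/piper-voices}
    case $1 in
        de) echo "$base/vits-piper-de_DE-thorsten-medium" ;;
        en) echo "$base/vits-piper-en_US-ryan-high" ;;
        # Illustrative new entry; replace with the directory you actually extracted.
        fr) echo "$base/vits-piper-fr_FR-siwis-medium" ;;
        *)  echo "unsupported language: $1" >&2; return 1 ;;
    esac
}
```

Keeping the mapping in one place means a new voice needs only one extra case line, plus a matching rule in the language detection if auto-detect should pick it up.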

Troubleshooting

"TTS binary not found"

Ensure SHERPA_ONNX_DIR is set and contains bin/sherpa-onnx-offline-tts.

"Failed to generate audio"

Check that voice model files exist: *.onnx, tokens.txt, espeak-ng-data/
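
A quick way to run that check is a small function that reports which of the three items is missing. This is a sketch under the assumption that each voice lives in its own directory; the name check_voice_dir is hypothetical.

```shell
# Hypothetical check: verify that a voice directory holds the files listed above.
# Prints what is missing to stderr and returns non-zero if anything is absent.
check_voice_dir() {
    dir=$1
    missing=0

    # At least one ONNX model file must exist (an unmatched glob stays literal).
    set -- "$dir"/*.onnx
    [ -f "$1" ] || { echo "missing: $dir/*.onnx" >&2; missing=1; }

    [ -f "$dir/tokens.txt" ] || { echo "missing: $dir/tokens.txt" >&2; missing=1; }
    [ -d "$dir/espeak-ng-data" ] || { echo "missing: $dir/espeak-ng-data/" >&2; missing=1; }

    return "$missing"
}
```

Run it against each extracted voice, for example `check_voice_dir "$PIPER_VOICES_DIR"/<voice-directory>`, substituting the directory name you see on disk.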

Audio plays as file instead of voice bubble

Ensure the output includes the [[audio_as_voice]] tag on its own line before the MEDIA: line.

README.md

voice-reply

Local Text-to-Speech for OpenClaw using Piper voices via sherpa-onnx

Generate voice audio replies that work as Telegram voice notes - 100% offline, no API keys required.

Features

  • 100% Local - No internet connection required after setup
  • No API Keys - Completely free, no accounts needed
  • Multi-language - German and English voices included (more available)
  • Telegram Ready - Outputs as voice bubbles, not file attachments
  • Auto-detect Language - Automatically selects the right voice

Quick Start

1. Install Dependencies

cd scripts
sudo ./install.sh

This installs:

  • sherpa-onnx runtime (~28 MB)
  • German voice "thorsten" (~64 MB)
  • English voice "ryan" (~110 MB)
  • ffmpeg (if not present)

2. Add to OpenClaw

Copy the skill to your OpenClaw skills directory:

cp -r . ~/.openclaw/skills/voice-reply

3. Use It

Ask your OpenClaw agent:

  • "Reply with a voice message"
  • "Say that as audio"
  • "Read this aloud: Hello world"

Or call directly:

/voice_reply "Hello, how are you?" en

Voices

| Language | Voice | Quality | Size |
| --- | --- | --- | --- |
| German | thorsten | medium | 64 MB |
| English | ryan | high | 110 MB |

More voices available at Piper Samples.

Requirements

  • Linux (Ubuntu 22.04+ recommended)
  • ~200 MB disk space
  • ~500 MB RAM during synthesis
  • ffmpeg

How It Works

  1. Text is converted to speech using sherpa-onnx with Piper VITS models
  2. WAV output is converted to OGG Opus (Telegram voice format)
  3. Output includes [[audio_as_voice]] tag for Telegram voice bubbles
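
The three steps above can be sketched as a single shell function. This is an illustration under stated assumptions, not the skill's actual script: the name synthesize and the TTS_BIN override are hypothetical, and the ffmpeg settings (32k bitrate, mono, 48 kHz) are assumed values for a voice note. The --vits-model, --vits-tokens, --vits-data-dir and --output-filename options belong to sherpa-onnx-offline-tts.

```shell
# Hypothetical sketch of the pipeline.
# Usage: synthesize MODEL_DIR "text" OUT.ogg
synthesize() {
    model_dir=$1
    text=$2
    out=$3
    wav=${out%.ogg}.wav
    # TTS_BIN is a hypothetical override; by default use the binary under SHERPA_ONNX_DIR.
    tts_bin=${TTS_BIN:-${SHERPA_ONNX_DIR:-/opt/sherpa-onnx}/bin/sherpa-onnx-offline-tts}

    # Step 1: text to WAV with a Piper VITS model (the first *.onnx file in the directory).
    set -- "$model_dir"/*.onnx
    "$tts_bin" \
        --vits-model="$1" \
        --vits-tokens="$model_dir/tokens.txt" \
        --vits-data-dir="$model_dir/espeak-ng-data" \
        --output-filename="$wav" \
        "$text" >&2 || return 1

    # Step 2: WAV to OGG Opus. Bitrate, channel count and sample rate are assumptions.
    ffmpeg -y -loglevel error -i "$wav" \
        -c:a libopus -b:a 32k -ac 1 -ar 48000 "$out" >&2 || return 1
    rm -f "$wav"

    # Step 3: print the tag and media path on stdout for OpenClaw to pick up.
    printf '%s\n' '[[audio_as_voice]]' "MEDIA:$out"
}
```

Tool output is redirected to stderr so that stdout carries only the two lines OpenClaw processes.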

License

MIT License

Permissions & Security

Security level L1: Low-risk skills with minimal permissions. Review inputs and outputs before running in production.

Requirements

  1. sherpa-onnx runtime installed
  2. Piper voice models downloaded
  3. ffmpeg for audio conversion

FAQ

How do I install voice-reply?

Run openclaw add @stolot0mt0m/voice-reply in your terminal. This installs voice-reply into your OpenClaw Skills catalog.

Does this skill run locally or in the cloud?

OpenClaw Skills execute locally by default. Review the SKILL.md and permissions before running any skill.

Where can I verify the source code?

The source repository is available at https://github.com/openclaw/skills/tree/main/skills/stolot0mt0m/voice-reply. Review commits and README documentation before installing.