audio-transcriber – OpenClaw Skill

Name: audio-transcriber
Author: snail3d

audio-transcriber is an OpenClaw Skills integration for coding workflows. It transcribes audio files using Groq's Whisper API (fast, cloud-based). Use it when the user sends voice messages or audio files (ogg, mp3, wav, m4a, etc.), or asks for speech-to-text transcription. It requires the GROQ_API_KEY environment variable.

6.9k stars | 4.6k forks | Security L1

Updated Feb 7, 2026 | Created Feb 7, 2026 | coding

Skill Snapshot

name	audio-transcriber
description	Transcribe audio files using Groq's Whisper API (fast, cloud-based). Use when the user sends voice messages, audio files (ogg, mp3, wav, m4a, etc.), or asks for speech-to-text transcription. Requires GROQ_API_KEY environment variable. OpenClaw Skills integration.
owner	snail3d
repository	snail3d/voice-devotional
path	temp-transcriber/audio-transcriber
language	Markdown
license	MIT
topics
security	L1
install	openclaw add @snail3d/voice-devotional:temp-transcriber/audio-transcriber
last updated	Feb 7, 2026

Maintainer

snail3d

Maintains audio-transcriber in the OpenClaw Skills directory.

File Explorer

3 files

audio-transcriber/
  scripts/transcribe.py (4.1 KB)
  SKILL.md (3.1 KB)

SKILL.md

name: audio-transcriber
description: Transcribe audio files using Groq's Whisper API (fast, cloud-based). Use when the user sends voice messages, audio files (ogg, mp3, wav, m4a, etc.), or asks for speech-to-text transcription. Requires GROQ_API_KEY environment variable.

Audio Transcriber

Overview

This skill enables fast audio transcription using Groq's Whisper API. Transcription happens in the cloud via Groq's infrastructure, providing significantly faster results than local Whisper models.

Quick Start

When a user sends an audio file or voice message:

1. Ensure GROQ_API_KEY is set in the environment.
2. Run the transcribe script: scripts/transcribe.py /path/to/audio.ogg
3. Return the transcribed text to the user.
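
The steps above could be driven from Python roughly as follows. This is a minimal sketch, not part of the skill: the helper name run_transcriber is hypothetical, and it assumes only what this page states, namely that the script takes the audio path as its single argument and prints plain text to stdout.

```python
import subprocess
import sys


def run_transcriber(script_path: str, audio_path: str) -> str:
    """Run the transcription script and return whatever it prints to stdout.

    Hypothetical helper: assumes the script takes the audio path as its only
    argument and writes the transcript to stdout, as described in SKILL.md.
    """
    result = subprocess.run(
        [sys.executable, script_path, audio_path],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output as text rather than bytes
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

A call such as run_transcriber("scripts/transcribe.py", "/path/to/audio.ogg") would then return the transcript, provided GROQ_API_KEY is already set in the calling environment.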

Usage

Basic Transcription

export GROQ_API_KEY="your-key-here"
python3 /path/to/audio-transcriber/scripts/transcribe.py /path/to/audio.ogg

The script:

Accepts common audio formats (ogg, mp3, wav, m4a, etc.); non-WAV input requires ffmpeg
Converts the input to WAV (16 kHz, mono) using ffmpeg, if available
Sends to Groq's Whisper API for transcription
Outputs plain text to stdout
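
The conversion step can be sketched as below. The actual arguments used by transcribe.py are not shown on this page, so the function names and the output path are assumptions; the ffmpeg flags themselves (-i, -ar, -ac, -y) are standard options that yield 16 kHz mono output.

```python
import shutil
import subprocess


def build_ffmpeg_command(src: str, dst: str) -> list:
    """Build an ffmpeg command that writes a 16 kHz mono WAV file."""
    return [
        "ffmpeg",
        "-y",            # overwrite the output file if it already exists
        "-i", src,       # input file in any format ffmpeg can decode
        "-ar", "16000",  # resample audio to 16 kHz
        "-ac", "1",      # downmix to a single (mono) channel
        dst,
    ]


def convert_to_wav(src: str, dst: str) -> str:
    """Convert src to WAV at dst; hypothetical helper, not the skill's own code."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found")
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
    return dst
```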

Supported Audio Formats

Voice messages: OGG (Telegram, Signal, etc.)
Common formats: MP3, WAV, M4A, FLAC, AAC
Container formats: The script handles conversion automatically if ffmpeg is installed
Without ffmpeg: Only WAV files are supported
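
The format rules above amount to a small decision: WAV always works, and the other listed formats work only when ffmpeg is on the PATH. A sketch of that check, with a hypothetical function name and an extension set taken from the list above (ffmpeg itself can decode more formats than these):

```python
import shutil
from pathlib import Path

# Extensions named in this document; the set is illustrative, not exhaustive.
CONVERTIBLE = {".ogg", ".mp3", ".m4a", ".flac", ".aac"}


def is_supported(path: str, have_ffmpeg=None) -> bool:
    """Return True if the file can be transcribed in the current environment."""
    if have_ffmpeg is None:
        # Detect ffmpeg by looking it up on the PATH.
        have_ffmpeg = shutil.which("ffmpeg") is not None
    ext = Path(path).suffix.lower()
    if ext == ".wav":
        return True  # WAV needs no conversion
    return bool(have_ffmpeg) and ext in CONVERTIBLE
```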

Setup Requirements

The skill requires these to be configured:

1. Groq API Key

Get an API key from https://console.groq.com/

Set as environment variable:

export GROQ_API_KEY="your-key-here"

For persistent setting, add to your shell profile (~/.zshrc or ~/.bashrc):

echo 'export GROQ_API_KEY="your-key-here"' >> ~/.zshrc
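
A script that depends on this variable can fail early with a clear message when it is missing. The following is a minimal sketch; the function name is hypothetical, and the error text mirrors the message listed under Troubleshooting.

```python
import os


def get_api_key(env=None) -> str:
    """Return the Groq API key, or raise if it is missing or blank."""
    if env is None:
        env = os.environ  # default to the real process environment
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GROQ_API_KEY environment variable not set")
    return key
```

Passing a plain dict as env makes the check easy to exercise without touching the real environment.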

2. ffmpeg (recommended)

brew install ffmpeg

Without ffmpeg, only WAV files will work. ffmpeg is used to convert other formats to WAV before sending to Groq.

Resources

scripts/transcribe.py

Main transcription script that:

Validates GROQ_API_KEY environment variable
Checks for ffmpeg (optional but recommended)
Converts audio to WAV format if needed
Sends to Groq's Whisper API (whisper-large-v3 model)
Extracts and outputs plain text

Run it directly from the command line or via the exec tool.
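
The upload step might look like the sketch below, which uses only the standard library. The source of transcribe.py is not reproduced on this page, so several details are assumptions: the endpoint URL (Groq's OpenAI-compatible transcription route), the multipart field names file and model, the JSON response with a text field, and the helper names.

```python
import json
import os
import urllib.request
import uuid

# Assumption: Groq's OpenAI-compatible transcription endpoint.
GROQ_URL = "https://api.groq.com/openai/v1/audio/transcriptions"


def build_multipart(fields: dict, filename: str, data: bytes, boundary: str) -> bytes:
    """Encode text fields plus one file part as multipart/form-data."""
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode(),
        ]
    lines += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: audio/wav",
        b"",
        data,
        f"--{boundary}--".encode(),  # closing boundary
        b"",
    ]
    return b"\r\n".join(lines)


def transcribe(wav_path: str, model: str = "whisper-large-v3") -> str:
    """Upload a WAV file and return the transcript text (sketch only)."""
    boundary = uuid.uuid4().hex
    with open(wav_path, "rb") as fh:
        body = build_multipart(
            {"model": model}, os.path.basename(wav_path), fh.read(), boundary
        )
    request = urllib.request.Request(
        GROQ_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ['GROQ_API_KEY']}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            # Assumption: an explicit user agent, in case the default is rejected.
            "User-Agent": "audio-transcriber-sketch",
        },
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        # Assumption: the response is JSON with the transcript under "text".
        return json.loads(response.read())["text"]
```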

Performance Notes

Speed: Much faster than local Whisper (typically <1 second for short messages)
Model: Uses whisper-large-v3 via Groq API (high accuracy)
Latency: Cloud-based, depends on internet connection
Cost: Groq offers a free tier; check current pricing and usage limits
Accuracy: Excellent for general speech; handles:
- Multiple accents and dialects
- Multiple speakers (moderately well)
- Noisy environments
- Technical jargon

Troubleshooting

"GROQ_API_KEY environment variable not set"

export GROQ_API_KEY="your-key-here"

"ffmpeg not found"

brew install ffmpeg

API errors

Check that your Groq API key is valid
Verify you have remaining quota on your Groq account
Check internet connectivity
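
If the script surfaces an HTTP status code, the checks above can be tied to it. The mapping below is a sketch that assumes the API follows common HTTP conventions (401 for a rejected key, 429 for rate or quota limits); the function name is hypothetical and the codes are not confirmed by this page.

```python
def hint_for_status(status: int) -> str:
    """Map an HTTP status code to a troubleshooting hint (assumed conventions)."""
    if status == 401:
        return "Check that your Groq API key is valid."
    if status == 429:
        return "Verify you have remaining quota on your Groq account."
    if 500 <= status < 600:
        return "The service reported a server error; try again later."
    return f"Unexpected HTTP status {status}; check internet connectivity."
```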

Security Note

Never commit the GROQ_API_KEY to version control. Use environment variables or a secure secrets manager.

Permissions & Security

Security level L1: Low-risk skills with minimal permissions. Review inputs and outputs before running in production.

FAQ

How do I install audio-transcriber?

Run openclaw add @snail3d/voice-devotional:temp-transcriber/audio-transcriber in your terminal. This installs audio-transcriber into your OpenClaw Skills catalog.

Does this skill run locally or in the cloud?

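OpenClaw Skills execute locally by default. For this skill, scripts/transcribe.py runs on your machine, but the audio is sent to Groq's cloud API for transcription. Review the SKILL.md and permissions before running any skill.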

Where can I verify the source code?

The source repository is available at https://github.com/openclaw/skills/tree/main/skills/snail3d/voice-devotional. Review the commits and SKILL.md before installing.