by snail3d

audio-transcriber – OpenClaw Skill

audio-transcriber is an OpenClaw Skills integration for coding workflows. Transcribe audio files using Groq's Whisper API (fast, cloud-based). Use when the user sends voice messages, audio files (ogg, mp3, wav, m4a, etc.), or asks for speech-to-text transcription. Requires GROQ_API_KEY environment variable.

6.9k stars | 4.6k forks | Security L1
Updated Feb 7, 2026 | Created Feb 7, 2026 | coding

Skill Snapshot

name: audio-transcriber
description: Transcribe audio files using Groq's Whisper API (fast, cloud-based). Use when the user sends voice messages, audio files (ogg, mp3, wav, m4a, etc.), or asks for speech-to-text transcription. Requires GROQ_API_KEY environment variable. OpenClaw Skills integration.
owner: snail3d
repository: snail3d/voice-devotional
path: temp-transcriber/audio-transcriber
language: Markdown
license: MIT
topics: none listed
security: L1
install: openclaw add @snail3d/voice-devotional:temp-transcriber/audio-transcriber
last updated: Feb 7, 2026

Maintainer

snail3d

Maintains audio-transcriber in the OpenClaw Skills directory.

File Explorer

3 files
audio-transcriber/
  scripts/
    transcribe.py (4.1 KB)
  SKILL.md (3.1 KB)

SKILL.md

name: audio-transcriber
description: Transcribe audio files using Groq's Whisper API (fast, cloud-based). Use when the user sends voice messages, audio files (ogg, mp3, wav, m4a, etc.), or asks for speech-to-text transcription. Requires GROQ_API_KEY environment variable.

Audio Transcriber

Overview

This skill enables fast audio transcription using Groq's Whisper API. Transcription happens in the cloud via Groq's infrastructure, providing significantly faster results than local Whisper models.

Quick Start

When a user sends an audio file or voice message:

  1. Ensure GROQ_API_KEY is set in environment
  2. Use the transcribe script: scripts/transcribe.py /path/to/audio.ogg
  3. Return the transcribed text to the user
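
The three steps above can be sketched from Python as follows. This is a minimal illustration, not part of the skill: the `SCRIPT` path, the `build_command` and `transcribe` helpers, and the assumption that the script exits non-zero on failure are all hypothetical.

```python
import os
import subprocess
import sys

# Hypothetical install location; adjust to wherever the skill lives on your machine.
SCRIPT = "/path/to/audio-transcriber/scripts/transcribe.py"


def build_command(audio_path, script=SCRIPT):
    """Return the argv list for running the transcription script on one file."""
    return [sys.executable, script, audio_path]


def transcribe(audio_path, script=SCRIPT):
    """Run the script and return its stdout (the transcript) as a string."""
    # Step 1: make sure the key is present before spawning anything.
    if not os.environ.get("GROQ_API_KEY"):
        raise RuntimeError("GROQ_API_KEY environment variable not set")
    # Step 2: run the script; check=True assumes it signals failure via exit code.
    result = subprocess.run(
        build_command(audio_path, script),
        capture_output=True,
        text=True,
        check=True,
    )
    # Step 3: hand the transcript back to the caller.
    return result.stdout.strip()
```

A caller would then pass the path of the received voice message, for example `transcribe("/path/to/audio.ogg")`, and relay the returned text to the user.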

Usage

Basic Transcription

export GROQ_API_KEY="your-key-here"
python3 /path/to/audio-transcriber/scripts/transcribe.py /path/to/audio.ogg

The script:

  • Accepts common audio formats (ogg, mp3, wav, m4a, etc.); anything other than WAV requires ffmpeg
  • Automatically converts to WAV (16kHz, mono) using ffmpeg (if available)
  • Sends to Groq's Whisper API for transcription
  • Outputs plain text to stdout
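
The conversion step might look roughly like the sketch below. The helper names `ffmpeg_args` and `convert_to_wav` and the output file naming are assumptions made for illustration; the actual script may invoke ffmpeg differently. The flags used (`-y`, `-i`, `-ar`, `-ac`) are standard ffmpeg options.

```python
import shutil
import subprocess
from pathlib import Path


def ffmpeg_args(src, dst):
    """Build an ffmpeg command that resamples src to 16 kHz mono WAV at dst."""
    return [
        "ffmpeg",
        "-y",            # overwrite dst if it already exists
        "-i", str(src),  # input file; ffmpeg detects the format itself
        "-ar", "16000",  # audio sample rate: 16 kHz
        "-ac", "1",      # audio channels: mono
        str(dst),
    ]


def convert_to_wav(src, dst=None):
    """Convert src to WAV via ffmpeg, or return src unchanged if ffmpeg is absent."""
    src = Path(src)
    if shutil.which("ffmpeg") is None:
        return src  # no ffmpeg on PATH: the caller must supply a WAV file
    # Hypothetical naming scheme: write next to the input with a _16k suffix.
    dst = Path(dst) if dst else src.with_name(src.stem + "_16k.wav")
    subprocess.run(ffmpeg_args(src, dst), check=True, capture_output=True)
    return dst
```

Downsampling to 16 kHz mono keeps the upload small, which matters for short voice messages sent over a mobile connection.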

Supported Audio Formats

  • Voice messages: OGG (Telegram, Signal, etc.)
  • Common formats: MP3, WAV, M4A, FLAC, AAC
  • Conversion: Non-WAV input is converted automatically if ffmpeg is installed
  • Without ffmpeg: Only WAV files are supported
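
The format rules above reduce to a small decision, sketched here. The function name `plan_for` and its return labels are hypothetical, and the sketch assumes WAV input is sent as-is; the real script may still resample WAV files when ffmpeg is present.

```python
from pathlib import Path


def plan_for(path, has_ffmpeg):
    """Return 'send', 'convert', or 'unsupported' for an audio file."""
    ext = Path(path).suffix.lower()
    if ext == ".wav":
        return "send"         # WAV works with or without ffmpeg
    if has_ffmpeg:
        return "convert"      # ffmpeg turns OGG, MP3, M4A, FLAC, AAC, ... into WAV
    return "unsupported"      # non-WAV input and no ffmpeg available
```

In practice `has_ffmpeg` could be computed with `shutil.which("ffmpeg") is not None`.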

Setup Requirements

The skill requires these to be configured:

1. Groq API Key

Get an API key from https://console.groq.com/

Set as environment variable:

export GROQ_API_KEY="your-key-here"

For persistent setting, add to your shell profile (~/.zshrc or ~/.bashrc):

echo 'export GROQ_API_KEY="your-key-here"' >> ~/.zshrc
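
A script that depends on this variable would typically check it up front. The following is a minimal sketch with a hypothetical helper name, `get_api_key`; the error text mirrors the message listed under Troubleshooting.

```python
import os


def get_api_key(env=None):
    """Return the Groq API key from the environment, or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GROQ_API_KEY environment variable not set")
    return key
```

Accepting an optional `env` mapping keeps the check easy to exercise without touching the real environment.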

2. ffmpeg (recommended)

brew install ffmpeg

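ffmpeg converts other formats to WAV before they are sent to Groq; without it, only WAV files will work. The brew command above assumes Homebrew on macOS; on other systems, install ffmpeg with your package manager.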

Resources

scripts/transcribe.py

Main transcription script that:

  • Validates GROQ_API_KEY environment variable
  • Checks for ffmpeg (optional but recommended)
  • Converts audio to WAV format if needed
  • Sends to Groq's Whisper API (whisper-large-v3 model)
  • Extracts and outputs plain text

Run directly from command line or via exec tool.
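
For orientation, the upload step could be sketched with only the standard library as shown below. This is not the script's actual code. The endpoint URL is an assumption (Groq's OpenAI-compatible transcription route), as are the helper names `build_multipart` and `transcribe_wav` and the JSON `text` field in the reply; check Groq's API reference before relying on any of them.

```python
import json
import os
import urllib.request
import uuid

# Assumed endpoint; verify against Groq's current API documentation.
API_URL = "https://api.groq.com/openai/v1/audio/transcriptions"


def build_multipart(fields, filename, file_bytes, boundary):
    """Encode form fields plus one file part as multipart/form-data bytes."""
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            str(value).encode(),
        ]
    lines += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,
        f"--{boundary}--".encode(),  # closing delimiter
        b"",
    ]
    return b"\r\n".join(lines)


def transcribe_wav(path, api_key, model="whisper-large-v3"):
    """POST a WAV file to the API and return the 'text' field of the JSON reply."""
    boundary = uuid.uuid4().hex
    with open(path, "rb") as fh:
        body = build_multipart({"model": model}, os.path.basename(path), fh.read(), boundary)
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())["text"]
```

If the server rejects the default urllib User-Agent, adding an explicit `User-Agent` header to the request may help. An HTTP client library would shorten the multipart handling considerably; the standard library is used here only to keep the sketch dependency-free.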

Performance Notes

  • Speed: Much faster than local Whisper (typically <1 second for short messages)
  • Model: Uses whisper-large-v3 via Groq API (high accuracy)
  • Latency: Cloud-based, depends on internet connection
  • Cost: Groq offers free tier; check current pricing for usage limits
  • Accuracy: Excellent for general speech; handles:
    • Multiple accents and dialects
    • Multiple speakers (moderately well)
    • Noisy environments
    • Technical jargon

Troubleshooting

"GROQ_API_KEY environment variable not set"

export GROQ_API_KEY="your-key-here"

"ffmpeg not found"

brew install ffmpeg

API errors

  • Check your Groq API key is valid
  • Verify you have remaining quota on your Groq account
  • Check internet connectivity
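
One way to connect a failed request to the checks above is to look at the HTTP status code. The mapping below is an assumption based on conventional HTTP semantics (401/403 for authentication problems, 429 for rate or quota limits), not on documented Groq behavior, and `explain_api_error` is a hypothetical helper.

```python
def explain_api_error(status):
    """Map an HTTP status code (or None for no response) to a troubleshooting hint."""
    if status is None:
        return "Check internet connectivity"
    if status in (401, 403):
        return "Check your Groq API key is valid"
    if status == 429:
        return "Verify you have remaining quota on your Groq account"
    return f"Unexpected API error (HTTP {status})"
```

Passing `None` covers the case where no response arrived at all, for example a DNS failure or timeout.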

Security Note

Never commit the GROQ_API_KEY to version control. Use environment variables or a secure secrets manager.

Permissions & Security

Security level L1: Low-risk skills with minimal permissions. Review inputs and outputs before running in production.

FAQ

How do I install audio-transcriber?

Run openclaw add @snail3d/voice-devotional:temp-transcriber/audio-transcriber in your terminal. This installs audio-transcriber into your OpenClaw Skills catalog.

Does this skill run locally or in the cloud?

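OpenClaw Skills execute locally by default, and transcribe.py runs on your machine, but the audio itself is uploaded to Groq's cloud API for transcription. Review the SKILL.md and permissions before running any skill.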

Where can I verify the source code?

The source repository is available at https://github.com/openclaw/skills/tree/main/skills/snail3d/voice-devotional. Review the commits and SKILL.md before installing.