
elevenlabs-stt – OpenClaw Skill

by clawdbotborges

elevenlabs-stt is an OpenClaw Skills integration for coding workflows that transcribes audio files using ElevenLabs Speech-to-Text (Scribe v2).

7.5k stars · 3.5k forks · Security: L1
Created Feb 7, 2026 · Updated Feb 7, 2026 · Topic: coding

Skill Snapshot

name: elevenlabs-stt
description: Transcribe audio files using ElevenLabs Speech-to-Text (Scribe v2). OpenClaw Skills integration.
owner: clawdbotborges
repository: clawdbotborges/elevenlabs-stt
language: Markdown
license: MIT
topics: coding
security: L1
install: openclaw add @clawdbotborges/elevenlabs-stt
last updated: Feb 7, 2026

Maintainer

clawdbotborges

Maintains elevenlabs-stt in the OpenClaw Skills directory.

File Explorer (5 files)

  • scripts/transcribe.sh (2.3 KB)
  • _meta.json (299 B)
  • README.md (2.9 KB)
  • SKILL.md (1.7 KB)
SKILL.md

---
name: elevenlabs-stt
description: Transcribe audio files using ElevenLabs Speech-to-Text (Scribe v2).
homepage: https://elevenlabs.io/speech-to-text
metadata: {"clawdbot":{"emoji":"🎙️","requires":{"bins":["curl"],"env":["ELEVENLABS_API_KEY"]},"primaryEnv":"ELEVENLABS_API_KEY"}}
---

ElevenLabs Speech-to-Text

Transcribe audio files using ElevenLabs' Scribe v2 model. Supports 90+ languages with speaker diarization.

Quick Start

# Basic transcription
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3

# With speaker diarization
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3 --diarize

# Specify language (improves accuracy)
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3 --lang en

# Full JSON output with timestamps
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3 --json

Options

Flag          Description
--diarize     Identify different speakers
--lang CODE   ISO language code (e.g., en, pt, es)
--json        Output full JSON with word timestamps
--events      Tag audio events (laughter, music, etc.)

Supported Formats

All major audio/video formats: mp3, m4a, wav, ogg, webm, mp4, etc.

API Key

Set ELEVENLABS_API_KEY environment variable, or configure in clawdbot.json:

{
  skills: {
    entries: {
      "elevenlabs-stt": {
        apiKey: "sk_..."
      }
    }
  }
}
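
A caller can fail fast when the key is missing instead of letting the API request fail later. A minimal preflight sketch (transcribe.sh may already perform its own validation):

```shell
# Preflight: verify the API key is available before calling the script.
# This is a sketch; transcribe.sh may do its own check.
check_key() {
  [ -n "${ELEVENLABS_API_KEY:-}" ] || {
    echo "ELEVENLABS_API_KEY is not set (see clawdbot.json)" >&2
    return 1
  }
}
```

Usage: check_key && {baseDir}/scripts/transcribe.sh audio.mp3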

Examples

# Transcribe a WhatsApp voice note
{baseDir}/scripts/transcribe.sh ~/Downloads/voice_note.ogg

# Meeting recording with multiple speakers
{baseDir}/scripts/transcribe.sh meeting.mp3 --diarize --lang en

# Get JSON for processing
{baseDir}/scripts/transcribe.sh podcast.mp3 --json > transcript.json
README.md

🎙️ ElevenLabs Speech-to-Text Skill

A Clawdbot skill for transcribing audio files using ElevenLabs' Scribe v2 model.

Features

  • 🌍 90+ languages supported with automatic detection
  • 👥 Speaker diarization — identify different speakers
  • 🎵 Audio event tagging — detect laughter, music, applause, etc.
  • 📝 Word-level timestamps — precise timing in JSON output
  • 🎧 All major formats — mp3, m4a, wav, ogg, webm, mp4, and more

Installation

For Clawdbot

Add to your clawdbot.json:

{
  skills: {
    entries: {
      "elevenlabs-stt": {
        source: "github:clawdbotborges/elevenlabs-stt",
        apiKey: "sk_your_api_key_here"
      }
    }
  }
}

Standalone

git clone https://github.com/clawdbotborges/elevenlabs-stt.git
cd elevenlabs-stt
export ELEVENLABS_API_KEY="sk_your_api_key_here"

Usage

# Basic transcription
./scripts/transcribe.sh audio.mp3

# With speaker diarization
./scripts/transcribe.sh meeting.mp3 --diarize

# Specify language for better accuracy
./scripts/transcribe.sh voice_note.ogg --lang en

# Full JSON with timestamps
./scripts/transcribe.sh podcast.mp3 --json

# Tag audio events (laughter, music, etc.)
./scripts/transcribe.sh recording.wav --events
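
Under the hood the script is assumed to wrap a single HTTP call. The sketch below prints the curl invocation rather than executing it, so the request shape can be inspected without an API key. The endpoint and multipart fields follow the public ElevenLabs Speech-to-Text API; "scribe_v2" as the model_id is an assumption, so check scripts/transcribe.sh for the authoritative request.

```shell
# Print (not execute) the curl request transcribe.sh is assumed to build.
# Endpoint and form fields follow the public ElevenLabs Speech-to-Text API;
# the "scribe_v2" model_id is an assumption.
stt_request() {
  echo curl -sS "https://api.elevenlabs.io/v1/speech-to-text" \
    -H "xi-api-key: \$ELEVENLABS_API_KEY" \
    -F "file=@$1" \
    -F "model_id=scribe_v2"
}

stt_request audio.mp3
```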

Options

Flag          Description
--diarize     Enable speaker diarization
--lang CODE   ISO language code (e.g., en, pt, es, fr)
--json        Output full JSON response with word timestamps
--events      Tag audio events like laughter, music, applause
-h, --help    Show help message

Examples

Transcribe a voice message

./scripts/transcribe.sh ~/Downloads/voice_note.ogg
# Output: "Hey, just wanted to check in about the meeting tomorrow."

Meeting with multiple speakers

./scripts/transcribe.sh meeting.mp3 --diarize --lang en --json
# Output:
{
  "text": "Welcome everyone. Let's start with updates.",
  "words": [
    {"text": "Welcome", "start": 0.0, "end": 0.5, "speaker": "speaker_0"},
    {"text": "everyone", "start": 0.5, "end": 1.0, "speaker": "speaker_0"}
  ]
}

Process with jq

# Get just the text
./scripts/transcribe.sh audio.mp3 --json | jq -r '.text'

# Get word count
./scripts/transcribe.sh audio.mp3 --json | jq '.words | length'
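
Beyond pulling single fields, the diarized --json output can be folded into a per-speaker transcript. A sketch working from a saved file; the sample JSON below mirrors the shape shown above, and transcript.json is assumed to come from a --diarize --json run:

```shell
# Sample transcript in the shape shown above (normally produced by
# ./scripts/transcribe.sh meeting.mp3 --diarize --json > transcript.json).
cat > transcript.json <<'EOF'
{"text": "Welcome everyone.",
 "words": [
   {"text": "Welcome",   "start": 0.0, "end": 0.5, "speaker": "speaker_0"},
   {"text": "everyone.", "start": 0.5, "end": 1.0, "speaker": "speaker_0"}
 ]}
EOF

# Group words by speaker, then print one "speaker: text" line per speaker.
jq -r 'reduce .words[] as $w ({}; .[$w.speaker] += [$w.text])
       | to_entries[] | "\(.key): \(.value | join(" "))"' transcript.json
```

For the sample above this prints a single line for speaker_0; real diarized output yields one line per detected speaker.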

Requirements

  • curl — for API requests
  • jq — for JSON parsing (optional, but recommended)
  • ElevenLabs API key with Speech-to-Text access

API Key

Get your API key from ElevenLabs:

  1. Sign up or log in
  2. Go to Profile → API Keys
  3. Create a new key or copy existing one

License

MIT

Permissions & Security

Security level L1: Low-risk skills with minimal permissions. Review inputs and outputs before running in production.

Requirements

  • OpenClaw CLI installed and configured.
  • Language: Markdown
  • License: MIT
  • Topics: coding

FAQ

How do I install elevenlabs-stt?

Run openclaw add @clawdbotborges/elevenlabs-stt in your terminal. This installs elevenlabs-stt into your OpenClaw Skills catalog.

Does this skill run locally or in the cloud?

OpenClaw Skills execute locally by default. Review the SKILL.md and permissions before running any skill.

Where can I verify the source code?

The source repository is available at https://github.com/openclaw/skills/tree/main/skills/clawdbotborges/elevenlabs-stt. Review commits and README documentation before installing.