skills$openclaw/text-to-speech

9.0k★

by okaris

text-to-speech – OpenClaw Skill

Name: text-to-speech
Author: okaris

text-to-speech is an OpenClaw Skills integration for writing workflows. |

9.0k stars8.4k forksSecurity L1

Updated Feb 7, 2026Created Feb 7, 2026writing

Skill Snapshot

name	text-to-speech
description	\| OpenClaw Skills integration.
owner	okaris
repository	okaris/inference-shpath: text-to-speech
language	Markdown
license	MIT
topics
security	L1
install	openclaw add @okaris/inference-sh:text-to-speech
last updated	Feb 7, 2026

Maintainer

okaris

Maintains text-to-speech in the OpenClaw Skills directory.

View GitHub profile

File Explorer

1 files

text-to-speech

SKILL.md

3.3 KB

SKILL.md

name: text-to-speech description: | Convert text to natural speech with DIA TTS, Kokoro, Chatterbox, and more via inference.sh CLI. Models: DIA TTS (conversational), Kokoro TTS, Chatterbox, Higgs Audio, VibeVoice (podcasts). Capabilities: text-to-speech, voice cloning, multi-speaker dialogue, podcast generation, expressive speech. Use for: voiceovers, audiobooks, podcasts, accessibility, video narration, IVR, voice assistants. Triggers: text to speech, tts, voice generation, ai voice, speech synthesis, voice over, generate speech, ai narrator, voice cloning, text to audio, elevenlabs alternative, voice ai, ai voiceover, speech generator, natural voice allowed-tools: Bash(infsh *)

Text-to-Speech

Convert text to natural speech via inference.sh CLI.

Quick Start

# Install CLI
curl -fsSL https://cli.inference.sh | sh && infsh login

# Generate speech
infsh app run infsh/kokoro-tts --input '{"text": "Hello, welcome to our product demo."}'

Available Models

Model	App ID	Best For
DIA TTS	`infsh/dia-tts`	Conversational, expressive
Kokoro TTS	`infsh/kokoro-tts`	Fast, natural
Chatterbox	`infsh/chatterbox`	General purpose
Higgs Audio	`infsh/higgs-audio`	Emotional control
VibeVoice	`infsh/vibevoice`	Podcasts, long-form

Browse All Audio Apps

infsh app list --category audio

Examples

Basic Text-to-Speech

infsh app run infsh/kokoro-tts --input '{"text": "Welcome to our tutorial."}'

Conversational TTS with DIA

infsh app sample infsh/dia-tts --save input.json

# Edit input.json:
# {
#   "text": "Hey! How are you doing today? I'm really excited to share this with you.",
#   "voice": "conversational"
# }

infsh app run infsh/dia-tts --input input.json

Long-form Audio (Podcasts)

infsh app sample infsh/vibevoice --save input.json

# Edit input.json with your podcast script
infsh app run infsh/vibevoice --input input.json

Expressive Speech with Higgs

infsh app sample infsh/higgs-audio --save input.json

# {
#   "text": "This is absolutely incredible!",
#   "emotion": "excited"
# }

infsh app run infsh/higgs-audio --input input.json

Use Cases

Voiceovers: Product demos, explainer videos
Audiobooks: Convert text to spoken word
Podcasts: Generate podcast episodes
Accessibility: Make content accessible
IVR: Phone system voice prompts
Video Narration: Add narration to videos

Generate speech, then create a talking head video:

# 1. Generate speech
infsh app run infsh/kokoro-tts --input '{"text": "Your script here"}' > speech.json

# 2. Use the audio URL with OmniHuman for avatar video
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "<audio-url-from-step-1>"
}'

# Full platform skill (all 100+ apps)
npx skills add inference-sh/skills@inference-sh

# AI avatars (combine TTS with talking heads)
npx skills add inference-sh/skills@ai-avatar-video

# AI music generation
npx skills add inference-sh/skills@ai-music-generation

# Speech-to-text (transcription)
npx skills add inference-sh/skills@speech-to-text

# Video generation
npx skills add inference-sh/skills@ai-video-generation

Browse all apps: infsh app list

README.md

No README available.

Permissions & Security

Security level L1: Low-risk skills with minimal permissions. Review inputs and outputs before running in production.

Requirements

OpenClaw CLI installed and configured.
Language: Markdown
License: MIT
Topics:

FAQ

How do I install text-to-speech?

Run openclaw add @okaris/inference-sh:text-to-speech in your terminal. This installs text-to-speech into your OpenClaw Skills catalog.

Does this skill run locally or in the cloud?

OpenClaw Skills execute locally by default. Review the SKILL.md and permissions before running any skill.

Where can I verify the source code?

The source repository is available at https://github.com/openclaw/skills/tree/main/skills/okaris/inference-sh. Review commits and README documentation before installing.