by guoqiao

mlx-audio-server – OpenClaw Skill

mlx-audio-server is an OpenClaw Skills integration for coding workflows: a fast, accurate, and fully local OpenAI-compatible API server for speech-to-text and text-to-speech, powered by MLX on Apple Silicon and open-source models.

6.5k stars · 495 forks · Security L1
Updated Feb 7, 2026 · Created Feb 7, 2026 · coding

Skill Snapshot

name: mlx-audio-server
description: A fast, accurate, and fully local OpenAI-compatible API server for speech-to-text and text-to-speech, powered by MLX on Apple Silicon and open-source models. OpenClaw Skills integration.
owner: guoqiao
repository: guoqiao/mlx-audio-server
language: Markdown
license: MIT
topics:
security: L1
install: openclaw add @guoqiao/mlx-audio-server
last updated: Feb 7, 2026

Maintainer

guoqiao

Maintains mlx-audio-server in the OpenClaw Skills directory.

File Explorer

5 files

  • _meta.json (463 B)
  • install.sh (1.1 KB)
  • run_stt.sh (619 B)
  • run_tts.sh (840 B)
  • SKILL.md (2.3 KB)

SKILL.md

name: mlx-audio-server
description: A fast, accurate, and fully local OpenAI-compatible API server for speech-to-text and text-to-speech, powered by MLX on Apple Silicon and open-source models.
metadata: {"openclaw":{"always":true,"emoji":"🦞","homepage":"https://github.com/guoqiao/skills/blob/main/mlx-audio-server/src/SKILL.md","os":["darwin"],"tags":["latest","asr","stt","speech-to-text","tts","text-to-speech","mlx","audio","mlx-audio","glm","glm-asr","glm-asr-nano-2512","glm-asr-nano-2512-8bit","macOS","MacBook","Mac mini","Apple Silicon","server","local","openai","api","compatible","openai-compatible","transcription"],"requires":{"bins":["brew"]}}}

MLX Audio Server

mlx-audio is an audio processing library built on Apple's MLX framework. It provides fast and efficient text-to-speech (TTS), speech-to-text (STT), and speech-to-speech (STS) on Apple Silicon.

This skill runs it as an OpenAI-compatible API server in the background on macOS, and provides scripts and examples that AI agents can use to call the API.

Default Models:

  • Speech-To-Text: mlx-community/glm-asr-nano-2512-8bit
  • Text-To-Speech: mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-bf16

The server downloads these models on demand, so the first request that uses each model will be slow.

More choices here: https://github.com/Blaizzy/mlx-audio?tab=readme-ov-file#supported-models

Requirements

  • mlx: requires macOS on Apple Silicon
  • brew: used to install any missing dependencies

Installation

bash ${baseDir}/install.sh

This script will:

  • clone the (forked) mlx-audio repo into ~/opt/mlx-audio
  • use uv to create a venv at ~/opt/mlx-audio/.venv and install the dependencies into it
  • create a plist file that runs the mlx-audio server as a background launchd service in the user domain
  • start the service as an OpenAI-compatible API server, on port 8899 by default
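
Once the launchd service is loaded, an agent may want to confirm that the server is reachable before sending audio. The sketch below polls the default port 8899 mentioned above. The function names `mlx_audio_is_up` and `mlx_audio_wait` and the `MLX_AUDIO_BASE_URL` variable are hypothetical, and the `/v1/models` path is assumed from OpenAI API conventions; it is not confirmed by this skill.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the skill's scripts.
# Port 8899 is the default stated above; override via MLX_AUDIO_BASE_URL.
MLX_AUDIO_BASE_URL="${MLX_AUDIO_BASE_URL:-http://localhost:8899}"

# Return 0 if the server answers on the assumed /v1/models endpoint.
# $1: per-request timeout in seconds (default 3).
mlx_audio_is_up() {
  curl --silent --fail --max-time "${1:-3}" \
    "${MLX_AUDIO_BASE_URL}/v1/models" >/dev/null 2>&1
}

# Poll until the server responds or the retry budget runs out.
# $1: number of attempts (default 30), $2: delay between attempts in seconds (default 2).
mlx_audio_wait() {
  local retries="${1:-30}" delay="${2:-2}" i=0
  while [ "$i" -lt "$retries" ]; do
    if mlx_audio_is_up; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Usage (commented out so that sourcing this file has no side effects):
#   mlx_audio_wait 30 2 || echo "mlx-audio server is not reachable" >&2
```

A generous retry budget is useful on first start, since the models are downloaded on demand.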

Usage

STT/Speech-To-Text:

# The input is converted to WAV with ffmpeg if it is not WAV already.
# Only the transcript text is printed.
bash ${baseDir}/run_stt.sh <audio_or_video_path>
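
For agents that prefer to call the API directly, the same flow can be sketched as below. This is not the content of run_stt.sh: `to_wav` and `transcribe` are hypothetical names, and the `/v1/audio/transcriptions` path with `file` and `model` form fields is assumed from the OpenAI API that the server is described as compatible with.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an STT request; not the skill's actual run_stt.sh.
MLX_AUDIO_BASE_URL="${MLX_AUDIO_BASE_URL:-http://localhost:8899}"
STT_MODEL="${STT_MODEL:-mlx-community/glm-asr-nano-2512-8bit}"

# Print a path to a WAV version of the input.
# Files already ending in .wav are passed through unchanged.
to_wav() {
  local out
  case "$1" in
    *.wav | *.WAV)
      printf '%s\n' "$1"
      ;;
    *)
      out="$(mktemp -d)/input.wav"
      ffmpeg -nostdin -loglevel error -y -i "$1" "$out" && printf '%s\n' "$out"
      ;;
  esac
}

# POST the audio as multipart form data and print the raw response body.
# The response shape is an assumption; an OpenAI-style server returns JSON
# with a "text" field, which a caller could extract with a JSON tool.
transcribe() {
  local wav
  wav="$(to_wav "$1")" || return 1
  curl --silent --fail "${MLX_AUDIO_BASE_URL}/v1/audio/transcriptions" \
    -F "file=@${wav}" \
    -F "model=${STT_MODEL}"
}

# Usage (commented out so that sourcing this file sends no request):
#   transcribe ./meeting.mp4
```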

TTS/Text-To-Speech:

# The audio is saved to a temp dir as `speech.wav` by default, and its path is printed to stdout.
bash ${baseDir}/run_tts.sh "Hello, Human!"
# Or specify an output dir:
bash ${baseDir}/run_tts.sh "Hello, Human!" ./output
# Only the audio path is printed.
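
A direct TTS request might look like the sketch below. It is not the content of run_tts.sh: `json_escape`, `tts_payload`, and `speak` are hypothetical names, the `/v1/audio/speech` path and the `model`, `input`, and `response_format` fields are taken from the OpenAI API, and whether this server honors `response_format` is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a TTS request; not the skill's actual run_tts.sh.
MLX_AUDIO_BASE_URL="${MLX_AUDIO_BASE_URL:-http://localhost:8899}"
TTS_MODEL="${TTS_MODEL:-mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-bf16}"

# Escape backslashes and double quotes for use inside a JSON string.
# Newlines and other control characters are NOT handled by this sketch.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the JSON request body for the given text.
tts_payload() {
  printf '{"model":"%s","input":"%s","response_format":"wav"}' \
    "$(json_escape "$TTS_MODEL")" "$(json_escape "$1")"
}

# Synthesize $1 into <dir>/speech.wav and print the path.
# $2: output dir (defaults to a fresh temp dir).
speak() {
  local text="$1" out_dir="${2:-$(mktemp -d)}"
  mkdir -p "$out_dir" || return 1
  curl --silent --fail "${MLX_AUDIO_BASE_URL}/v1/audio/speech" \
    -H "Content-Type: application/json" \
    -d "$(tts_payload "$text")" \
    --output "${out_dir}/speech.wav" && printf '%s\n' "${out_dir}/speech.wav"
}

# Usage (commented out so that sourcing this file sends no request):
#   speak "Hello, Human!" ./output
```

Building the payload in a separate function keeps the quoting logic easy to check without a running server.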

You can run both scripts directly, or use them as a reference for your own.

Permissions & Security

Security level L1: Low-risk skills with minimal permissions. Review inputs and outputs before running in production.

FAQ

How do I install mlx-audio-server?

Run openclaw add @guoqiao/mlx-audio-server in your terminal. This installs mlx-audio-server into your OpenClaw Skills catalog.

Does this skill run locally or in the cloud?

OpenClaw Skills execute locally by default. Review the SKILL.md and permissions before running any skill.

Where can I verify the source code?

The source repository is available at https://github.com/openclaw/skills/tree/main/skills/guoqiao/mlx-audio-server. Review commits and README documentation before installing.