by matusvojtek

tubescribe – OpenClaw Skill

tubescribe is an OpenClaw Skills integration for writing workflows: a YouTube video summarizer with speaker detection, formatted documents, and audio output. Use it when the user sends a YouTube URL or asks to summarize or transcribe a YouTube video.

2.0k stars · 5.8k forks · Security L1
Updated Feb 7, 2026 · Created Feb 7, 2026 · writing

Skill Snapshot

name: tubescribe
description: YouTube video summarizer with speaker detection, formatted documents, and audio output. Use when user sends a YouTube URL or asks to summarize/transcribe a YouTube video. OpenClaw Skills integration.
owner: matusvojtek
repository: matusvojtek/tubescribe
language: Markdown
license: MIT
topics:
security: L1
install: openclaw add @matusvojtek/tubescribe
last updated: Feb 7, 2026

Maintainer

matusvojtek

Maintains tubescribe in the OpenClaw Skills directory.

File Explorer
7 files
.
├── scripts
│   ├── config.py (5.4 KB)
│   ├── html_writer.py (6.1 KB)
│   ├── setup.py (10.8 KB)
│   └── tubescribe.py (13.4 KB)
├── _meta.json (631 B)
└── SKILL.md (5.3 KB)

SKILL.md

name: tubescribe
description: "YouTube video summarizer with speaker detection, formatted documents, and audio output. Use when user sends a YouTube URL or asks to summarize/transcribe a YouTube video."

TubeScribe 🎬

Turn any YouTube video into a polished document + audio summary in seconds.

Drop a YouTube link → get a beautiful transcript with speaker labels, key quotes, timestamps that link back to the video, and an audio summary you can listen to on the go.

💸 100% Free & Local

  • No subscription — runs entirely on your machine
  • No API keys required — works out of the box
  • No data leaves your computer — your content stays private
  • No usage limits — summarize as many videos as you want

✨ Features

  • 🎯 Smart Speaker Detection: Automatically identifies who's talking in interviews, podcasts, and conversations
  • 📍 Clickable Timestamps: Every quote links directly to that moment in the video
  • 📄 Clean Documents: Export as HTML, DOCX, or Markdown
  • 🔊 Audio Summaries: Listen to the key points (MP3/WAV)
  • 🚀 Zero Config: Works out of the box, with optional upgrades for power users

🎬 Works With Any Video

  • Interviews & podcasts (multi-speaker detection)
  • Lectures & tutorials (single speaker)
  • Music videos (lyrics extraction)
  • News & documentaries
  • Any YouTube content with captions

Quick Start

When the user sends a YouTube URL, run the full pipeline automatically:

# 1. Extract transcript
python skills/tubescribe/scripts/tubescribe.py "YOUTUBE_URL"

This creates:

  • /tmp/tubescribe_{video_id}_source.json: metadata + transcript
  • /tmp/tubescribe_{video_id}_output.md: path reserved for the formatted output, which the sub-agent writes in Step 2

Then process the transcript with a sub-agent (see Full Workflow below).
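
The temp file names above are keyed by the video ID. As a rough sketch of how a caller might derive that ID and the two paths from a URL (the helper names `extract_video_id` and `temp_paths` are hypothetical and are not taken from tubescribe.py, which may parse URLs differently):

```python
from pathlib import Path
from typing import Tuple
from urllib.parse import parse_qs, urlparse

# Hosts handled by this sketch; other YouTube URL shapes are out of scope here.
WATCH_HOSTS = ("youtube.com", "www.youtube.com", "m.youtube.com")


def extract_video_id(url: str) -> str:
    """Return the video ID from a watch URL or a youtu.be short link."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
    elif host in WATCH_HOSTS:
        # Watch URLs carry the ID in the "v" query parameter.
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    else:
        video_id = ""
    if not video_id:
        raise ValueError(f"Could not find a video ID in {url!r}")
    return video_id


def temp_paths(video_id: str) -> Tuple[Path, Path]:
    """Return the (source JSON, output Markdown) paths named in Quick Start."""
    tmp = Path("/tmp")
    return (
        tmp / f"tubescribe_{video_id}_source.json",
        tmp / f"tubescribe_{video_id}_output.md",
    )
```

For example, `temp_paths(extract_video_id("https://youtube.com/watch?v=VIDEO_ID"))` yields the two `/tmp/tubescribe_VIDEO_ID_*` paths listed above.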

First-Time Setup

Run setup to check dependencies and configure defaults:

python skills/tubescribe/scripts/setup.py

This checks for the summarize CLI, pandoc/python-docx, ffmpeg, and Kokoro TTS.
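
To illustrate what such a check could look like, here is a minimal sketch. It is not the logic in setup.py, which is not shown here. It assumes the CLI tools are discoverable on `PATH` under the names `summarize`, `pandoc`, and `ffmpeg`, and that Kokoro is importable as a module named `kokoro` (an assumption). The function name `check_dependencies` is hypothetical.

```python
import importlib.util
import shutil
from typing import Callable, Dict, Optional


def check_dependencies(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, bool]:
    """Report which of the tools listed above appear to be available."""

    def has_module(name: str) -> bool:
        # find_spec returns None when a top-level module is not installed.
        return importlib.util.find_spec(name) is not None

    return {
        "summarize": which("summarize") is not None,
        "pandoc": which("pandoc") is not None,
        "python-docx": has_module("docx"),  # python-docx is imported as "docx"
        "ffmpeg": which("ffmpeg") is not None,
        "kokoro": has_module("kokoro"),  # assumed module name
    }
```

The `which` parameter exists only so the lookup can be swapped out in tests; calling `check_dependencies()` with no arguments uses the real `PATH`.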

Full Workflow

Step 1: Extract Transcript

python skills/tubescribe/scripts/tubescribe.py "https://youtube.com/watch?v=VIDEO_ID"

Step 2: Process with Sub-Agent

Spawn a sub-agent to analyze and format:

sessions_spawn(
    task="""Read /tmp/tubescribe_{video_id}_source.json and create formatted output.

**Output to:** /tmp/tubescribe_{video_id}_output.md

**Format:**
1. # Title (from metadata)
2. ## Participants — identify speakers from context
3. ## Summary — 3-5 paragraphs covering main topics
4. ## Key Quotes — 5 best quotes with timestamps [[MM:SS]](https://youtu.be/{video_id}?t=SECONDS)
5. ## Full Transcript — ALL segments with:
   - Speaker labels (**Name:** ) when identifiable
   - Clickable timestamps: [[0:42]](https://youtu.be/{video_id}?t=42)
   - Convert MM:SS to seconds for links

**Speaker Detection:**
- Use context clues (questions vs answers, explicit names, speaking patterns)
- For single-speaker videos, use narrator label or skip speaker labels
- For interviews: host asks questions, guest gives longer answers
""",
    label="tubescribe",
    runTimeoutSeconds=600,
    cleanup="delete"
)
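
The prompt asks the sub-agent to convert MM:SS timestamps to seconds for the links. The arithmetic can be sketched as follows; the function names are hypothetical, and support for H:MM:SS is an assumption added for videos longer than an hour.

```python
def timestamp_to_seconds(stamp: str) -> int:
    """Convert "MM:SS" (or "H:MM:SS") to a number of seconds."""
    seconds = 0
    for part in stamp.strip().split(":"):
        # Each colon-separated field is worth 60x the field to its right.
        seconds = seconds * 60 + int(part)
    return seconds


def timestamp_link(stamp: str, video_id: str) -> str:
    """Build a Markdown link in the [[MM:SS]](https://youtu.be/ID?t=SECONDS) form."""
    return f"[[{stamp}]](https://youtu.be/{video_id}?t={timestamp_to_seconds(stamp)})"
```

For instance, `timestamp_link("0:42", "VIDEO_ID")` produces `[[0:42]](https://youtu.be/VIDEO_ID?t=42)`, matching the format shown in the prompt.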

Step 3: Create Document

Convert markdown to final format:

# HTML (no dependencies beyond Python)
python skills/tubescribe/scripts/html_writer.py /tmp/tubescribe_{video_id}_output.md output.html

# DOCX with pandoc (best formatting)
pandoc /tmp/tubescribe_{video_id}_output.md -o output.docx

# Markdown (just copy the file)
cp /tmp/tubescribe_{video_id}_output.md output.md
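
If the format is chosen at runtime (for example from `document.format` in the config), the three commands above can be selected with a small dispatcher. This is only a sketch: `conversion_command` is a hypothetical helper, and it simply returns the same command lines shown above as argument lists.

```python
from typing import List


def conversion_command(fmt: str, source_md: str, target: str) -> List[str]:
    """Return the argv for converting the Markdown output to the given format."""
    if fmt == "html":
        return ["python", "skills/tubescribe/scripts/html_writer.py", source_md, target]
    if fmt == "docx":
        return ["pandoc", source_md, "-o", target]
    if fmt == "md":
        # Markdown needs no conversion, only a copy.
        return ["cp", source_md, target]
    raise ValueError(f"Unsupported format: {fmt!r}")
```

The returned list can be passed to `subprocess.run(cmd, check=True)` so that a failed conversion raises instead of passing silently.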

Step 4: Generate Audio Summary (Optional)

Extract the Summary section and generate TTS audio:

# Read summary from output markdown
# Generate audio using Kokoro (preferred) or built-in TTS
# Save to {output_dir}/{title}_summary.wav or .mp3
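
The first of these steps, reading the summary out of the output Markdown, could look roughly like the sketch below. It assumes the sub-agent followed the format from Step 2, so the summary sits under a `## Summary` heading and ends at the next `## ` heading. `extract_summary` is a hypothetical name, and the TTS call itself is left out because it depends on the engine.

```python
import re


def extract_summary(markdown: str) -> str:
    """Return the text under "## Summary", up to the next "## " heading."""
    match = re.search(
        r"^## Summary[ \t]*\n(.*?)(?=^## |\Z)",
        markdown,
        flags=re.MULTILINE | re.DOTALL,
    )
    # An empty string signals that no Summary section was found.
    return match.group(1).strip() if match else ""
```

The returned text is what would be handed to the TTS engine. An empty result is a reasonable signal to skip audio generation.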

Step 5: Open Results

open output.html  # or .docx or .md
open -a "QuickTime Player" output_summary.wav

Configuration

Config file: ~/.tubescribe/config.json

{
  "output": {
    "folder": "~/Documents/TubeScribe",
    "open_folder_after": true
  },
  "document": {
    "format": "docx"
  },
  "audio": {
    "enabled": true,
    "format": "mp3",
    "tts_engine": "kokoro"
  }
}

Options:

  • output.folder: Where to save files (default: ~/Documents/TubeScribe)
  • document.format: html (default, no deps), docx (with pandoc/python-docx), md (raw markdown)
  • audio.format: mp3 (with ffmpeg), wav (default without ffmpeg)
  • audio.tts_engine: builtin (macOS say), kokoro (high quality)
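
A loader that applies these defaults when a key is missing might look like the sketch below. It is an illustration, not the code in config.py. The function names are hypothetical, and only the defaults stated in the list above are filled in; no default is assumed for `audio.tts_engine` because none is stated.

```python
import copy
import json
import shutil
from pathlib import Path
from typing import Any, Dict

CONFIG_PATH = Path("~/.tubescribe/config.json").expanduser()


def default_config() -> Dict[str, Any]:
    """Defaults as described in the options list."""
    return {
        "output": {"folder": "~/Documents/TubeScribe"},
        "document": {"format": "html"},
        # mp3 needs ffmpeg; wav is the stated default without it.
        "audio": {"format": "mp3" if shutil.which("ffmpeg") else "wav"},
    }


def merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively overlay override onto a copy of base."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_config(path: Path = CONFIG_PATH) -> Dict[str, Any]:
    """Read the user's config if present and fill gaps with defaults."""
    user = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    return merge(default_config(), user)
```

Merging recursively means a user file that sets only `document.format` still picks up the default `output.folder`.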

Output Structure

~/Documents/TubeScribe/
├── {Video Title}.html         # Formatted document (or .docx / .md)
└── {Video Title}_summary.mp3  # Audio summary (or .wav)

After generation, TubeScribe opens the output folder (not individual files) so you can access everything.

Dependencies

Required:

  • summarize CLI — brew install steipete/tap/summarize
  • Python 3.8+

Optional (better quality):

  • pandoc — DOCX output: brew install pandoc
  • ffmpeg — MP3 audio: brew install ffmpeg
  • Kokoro TTS — High-quality voices: see https://github.com/hexgrad/kokoro

Tips

  • For long videos (>30 min), increase sub-agent timeout to 900s
  • Speaker detection works best with clear interview/podcast formats
  • Single-speaker videos (tutorials, lectures) skip speaker labels automatically
  • Timestamps link directly to YouTube at that moment
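
The timeout tip can be expressed as a one-line rule. In this sketch, 600 is the `runTimeoutSeconds` value used in Step 2 and 900 is the value suggested above. The function name is hypothetical, and it assumes the video duration in seconds is available to the caller.

```python
def subagent_timeout_seconds(duration_seconds: float) -> int:
    """Pick runTimeoutSeconds: 900 for videos over 30 minutes, otherwise 600."""
    return 900 if duration_seconds > 30 * 60 else 600
```
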
README.md

No README available.

Permissions & Security

Security level L1: Low-risk skills with minimal permissions. Review inputs and outputs before running in production.

FAQ

How do I install tubescribe?

Run openclaw add @matusvojtek/tubescribe in your terminal. This installs tubescribe into your OpenClaw Skills catalog.

Does this skill run locally or in the cloud?

OpenClaw Skills execute locally by default. Review the SKILL.md and permissions before running any skill.

Where can I verify the source code?

The source repository is available at https://github.com/openclaw/skills/tree/main/skills/matusvojtek/tubescribe. Review commits and README documentation before installing.