# extract – OpenClaw Skill

5.2k★ · by barneyjm
extract is an OpenClaw Skills integration for coding workflows. It extracts content from specific URLs using Tavily's extraction API and returns clean markdown or text from web pages. Use it when you have specific URLs and need their content without writing code.
## Skill Snapshot

| Field | Value |
|---|---|
| name | extract |
| description | Extract content from specific URLs using Tavily's extraction API. Returns clean markdown/text from web pages. Use when you have specific URLs and need their content without writing code. OpenClaw Skills integration. |
| owner | barneyjm |
| repository | barneyjm/extract |
| language | Markdown |
| license | MIT |
| topics | (none) |
| security | L1 |
| install | `openclaw add @barneyjm/extract` |
| last updated | Feb 7, 2026 |
```yaml
name: extract
description: "Extract content from specific URLs using Tavily's extraction API. Returns clean markdown/text from web pages. Use when you have specific URLs and need their content without writing code."
```
## Extract Skill

Extract clean content from specific URLs. Ideal when you know which pages you want content from.
### Prerequisites

**Tavily API Key Required** - get your key at https://tavily.com

Add to `~/.claude/settings.json`:

```json
{
  "env": {
    "TAVILY_API_KEY": "tvly-your-api-key-here"
  }
}
```
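Before calling the skill, it can help to confirm the key is actually visible to your shell. A minimal sanity check (not part of the skill itself; the variable name comes from the settings snippet above):

```bash
# Sanity check: confirm TAVILY_API_KEY is set before calling the API.
if [ -z "${TAVILY_API_KEY:-}" ]; then
  echo "TAVILY_API_KEY is not set; add it to ~/.claude/settings.json" >&2
else
  echo "TAVILY_API_KEY is set"
fi
```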
### Quick Start

#### Using the Script

```bash
./scripts/extract.sh '<json>'
```

Examples:

```bash
# Single URL
./scripts/extract.sh '{"urls": ["https://example.com/article"]}'

# Multiple URLs
./scripts/extract.sh '{"urls": ["https://example.com/page1", "https://example.com/page2"]}'

# With query focus and chunks
./scripts/extract.sh '{"urls": ["https://example.com/docs"], "query": "authentication API", "chunks_per_source": 3}'

# Advanced extraction for JS pages
./scripts/extract.sh '{"urls": ["https://app.example.com"], "extract_depth": "advanced", "timeout": 60}'
```
#### Basic Extraction

```bash
curl --request POST \
  --url https://api.tavily.com/extract \
  --header "Authorization: Bearer $TAVILY_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "urls": ["https://example.com/article"]
  }'
```
#### Multiple URLs with Query Focus

```bash
curl --request POST \
  --url https://api.tavily.com/extract \
  --header "Authorization: Bearer $TAVILY_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "urls": [
      "https://example.com/ml-healthcare",
      "https://example.com/ai-diagnostics"
    ],
    "query": "AI diagnostic tools accuracy",
    "chunks_per_source": 3
  }'
```
### API Reference

#### Endpoint

```
POST https://api.tavily.com/extract
```

#### Headers

| Header | Value |
|---|---|
| Authorization | `Bearer <TAVILY_API_KEY>` |
| Content-Type | `application/json` |
#### Request Body

| Field | Type | Default | Description |
|---|---|---|---|
| urls | array | Required | URLs to extract (max 20) |
| query | string | null | Reranks chunks by relevance |
| chunks_per_source | integer | 3 | Chunks per URL (1-5, requires query) |
| extract_depth | string | "basic" | basic or advanced (for JS pages) |
| format | string | "markdown" | markdown or text |
| include_images | boolean | false | Include image URLs |
| timeout | float | varies | Max wait (1-60 seconds) |
#### Response Format

```json
{
  "results": [
    {
      "url": "https://example.com/article",
      "raw_content": "# Article Title\n\nContent..."
    }
  ],
  "failed_results": [],
  "response_time": 2.3
}
```
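A response in this shape is easy to post-process with standard tools. As a sketch (assuming `jq` is installed; the sample response is stored in a variable so no network call is needed):

```bash
# Illustrative response, matching the documented shape.
response='{"results":[{"url":"https://example.com/article","raw_content":"# Article Title"}],"failed_results":[],"response_time":2.3}'

# Print the extracted markdown for each successful result
echo "$response" | jq -r '.results[].raw_content'

# List any URLs that could not be extracted, for retrying
echo "$response" | jq -r '.failed_results[].url'
```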
#### Extract Depth

| Depth | When to Use |
|---|---|
| basic | Simple text extraction; faster |
| advanced | Dynamic/JS-rendered pages, tables, structured data |
### Examples

#### Single URL Extraction

```bash
curl --request POST \
  --url https://api.tavily.com/extract \
  --header "Authorization: Bearer $TAVILY_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "urls": ["https://docs.python.org/3/tutorial/classes.html"],
    "extract_depth": "basic"
  }'
```
#### Targeted Extraction with Query

```bash
curl --request POST \
  --url https://api.tavily.com/extract \
  --header "Authorization: Bearer $TAVILY_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "urls": [
      "https://example.com/react-hooks",
      "https://example.com/react-state"
    ],
    "query": "useState and useEffect patterns",
    "chunks_per_source": 2
  }'
```
#### JavaScript-Heavy Pages

```bash
curl --request POST \
  --url https://api.tavily.com/extract \
  --header "Authorization: Bearer $TAVILY_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "urls": ["https://app.example.com/dashboard"],
    "extract_depth": "advanced",
    "timeout": 60
  }'
```
#### Batch Extraction

```bash
curl --request POST \
  --url https://api.tavily.com/extract \
  --header "Authorization: Bearer $TAVILY_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "urls": [
      "https://example.com/page1",
      "https://example.com/page2",
      "https://example.com/page3",
      "https://example.com/page4",
      "https://example.com/page5"
    ],
    "extract_depth": "basic"
  }'
```
### Tips

- Max 20 URLs per request; batch larger lists.
- Use `query` + `chunks_per_source` to get only relevant content.
- Try `basic` first; fall back to `advanced` if content is missing.
- Set a longer `timeout` for slow pages (up to 60s).
- Check `failed_results` for URLs that couldn't be extracted.
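The 20-URL cap can be handled with a small wrapper that splits a long list into batches. A sketch, not part of the skill itself (assumes `jq` is installed; the URL list is illustrative):

```bash
# Split a long URL list into batches of 20 (the per-request maximum)
# and build one JSON request body per batch.
urls=(
  https://example.com/page1
  https://example.com/page2
  https://example.com/page3
)
batch_size=20

for ((i = 0; i < ${#urls[@]}; i += batch_size)); do
  batch=("${urls[@]:i:batch_size}")
  # jq -R quotes each URL safely; jq -s collects them into {"urls": [...]}
  body=$(printf '%s\n' "${batch[@]}" | jq -R . | jq -s '{urls: .}')
  echo "$body"
  # Send each batch with the script or curl call shown above, e.g.:
  # ./scripts/extract.sh "$body"
done
```

Building the body with `jq` rather than string interpolation keeps URLs containing quotes or special characters valid JSON.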
## Permissions & Security

Security level L1: low-risk skills with minimal permissions. Review inputs and outputs before running in production.
## Requirements

**Tavily API Key Required** - get your key at https://tavily.com

Add to `~/.claude/settings.json`:

```json
{
  "env": {
    "TAVILY_API_KEY": "tvly-your-api-key-here"
  }
}
```
## FAQ

**How do I install extract?**

Run `openclaw add @barneyjm/extract` in your terminal. This installs extract into your OpenClaw Skills catalog.

**Does this skill run locally or in the cloud?**

OpenClaw Skills execute locally by default. Review the SKILL.md and permissions before running any skill.

**Where can I verify the source code?**

The source repository is available at https://github.com/openclaw/skills/tree/main/skills/barneyjm/extract. Review the commits and README before installing.
