skills$openclaw/parallel-extract

5.2k★

parallel-extract – OpenClaw Skill

Name: parallel-extract
Author: normallygaussian

parallel-extract is an OpenClaw Skills integration for coding workflows. URL content extraction via Parallel API. Extracts clean markdown from webpages, articles, PDFs, and JS-heavy sites. Use for reading specific URLs with LLM-ready output.

5.2k stars2.8k forksSecurity L1

Updated Feb 7, 2026Created Feb 7, 2026coding

Skill Snapshot

name	parallel-extract
description	URL content extraction via Parallel API. Extracts clean markdown from webpages, articles, PDFs, and JS-heavy sites. Use for reading specific URLs with LLM-ready output. OpenClaw Skills integration.
owner	normallygaussian
repository	normallygaussian/parallel-extract
language	Markdown
license	MIT
topics
security	L1
install	openclaw add @normallygaussian/parallel-extract
last updated	Feb 7, 2026

Maintainer

normallygaussian

Maintains parallel-extract in the OpenClaw Skills directory.

View GitHub profile

File Explorer

2 files

_meta.json

294 B

SKILL.md

3.8 KB

SKILL.md

name: parallel-extract description: "URL content extraction via Parallel API. Extracts clean markdown from webpages, articles, PDFs, and JS-heavy sites. Use for reading specific URLs with LLM-ready output." homepage: https://parallel.ai

Parallel Extract

Extract clean, LLM-ready content from URLs. Handles webpages, articles, PDFs, and JavaScript-heavy sites that need rendering.

When to Use

Trigger this skill when the user asks for:

"read this URL", "fetch this page", "extract from..."
"get the content from [URL]"
"what does this article say?"
Reading PDFs, JS-heavy pages, or paywalled content
Getting clean markdown from messy web pages

Use Search to discover; use Extract to read.

Quick Start

parallel-cli extract "https://example.com/article" --json

CLI Reference

Basic Usage

parallel-cli extract "<url>" [options]

Common Flags

Flag	Description
`--url "<url>"`	URL to extract (repeatable, max 10)
`--objective "<focus>"`	Focus extraction on specific content
`--json`	Output as JSON
`--excerpts` / `--no-excerpts`	Include relevant excerpts (default: on)
`--full-content` / `--no-full-content`	Include full page content
`--excerpts-max-chars N`	Max chars per excerpt
`--excerpts-max-total-chars N`	Max total excerpt chars
`--full-max-chars N`	Max full content chars
`-o <file>`	Save output to file

Examples

Basic extraction:

parallel-cli extract "https://example.com/article" --json

Focused extraction:

parallel-cli extract "https://example.com/pricing" \
  --objective "pricing tiers and features" \
  --json

Full content for PDFs:

parallel-cli extract "https://example.com/whitepaper.pdf" \
  --full-content \
  --json

Multiple URLs:

parallel-cli extract \
  --url "https://example.com/page1" \
  --url "https://example.com/page2" \
  --json

Default Workflow

Search with an objective + keyword queries
Inspect titles/URLs/dates; choose the best sources
Extract the specific pages you need (top N URLs)
Answer using the extracted excerpts/content

Best-Practice Prompting

Objective

When extracting, provide context:

What specific information you're looking for
Why you need it (helps focus extraction)

Good: --objective "Find the installation steps and system requirements"

Poor: --objective "Read the page"

Response Format

Returns structured JSON with:

url — source URL
title — page title
excerpts[] — relevant text excerpts (if enabled)
full_content — complete page content (if enabled)
publish_date — when available

Output Handling

When turning extracted content into a user-facing answer:

Keep content verbatim — do not paraphrase unnecessarily
Extract ALL list items exhaustively
Strip noise: nav menus, footers, ads, "click here" links
Preserve all facts, names, numbers, dates, quotes
Include URL + publish_date for transparency

Running Out of Context?

For long conversations, save results and use sessions_spawn:

parallel-cli extract "<url>" --json -o /tmp/extract-<topic>.json

Then spawn a sub-agent:

{
  "tool": "sessions_spawn",
  "task": "Read /tmp/extract-<topic>.json and summarize the key content.",
  "label": "extract-summary"
}

Error Handling

Exit Code	Meaning
0	Success
1	Unexpected error (network, parse)
2	Invalid arguments
3	API error (non-2xx)

Prerequisites

Get an API key at parallel.ai
Install the CLI:

curl -fsSL https://parallel.ai/install.sh | bash
export PARALLEL_API_KEY=your-key

References

README.md

No README available.

Permissions & Security

Security level L1: Low-risk skills with minimal permissions. Review inputs and outputs before running in production.

Requirements

1. Get an API key at [parallel.ai](https://parallel.ai) 2. Install the CLI: ```bash curl -fsSL https://parallel.ai/install.sh | bash export PARALLEL_API_KEY=your-key ```

FAQ

How do I install parallel-extract?

Run openclaw add @normallygaussian/parallel-extract in your terminal. This installs parallel-extract into your OpenClaw Skills catalog.

Does this skill run locally or in the cloud?

OpenClaw Skills execute locally by default. Review the SKILL.md and permissions before running any skill.

Where can I verify the source code?

The source repository is available at https://github.com/openclaw/skills/tree/main/skills/normallygaussian/parallel-extract. Review commits and README documentation before installing.