skills$openclaw/parallel-extract
normallygaussian5.2k

by normallygaussian

parallel-extract – OpenClaw Skill

parallel-extract is an OpenClaw Skills integration for coding workflows. URL content extraction via Parallel API. Extracts clean markdown from webpages, articles, PDFs, and JS-heavy sites. Use for reading specific URLs with LLM-ready output.

5.2k stars2.8k forksSecurity L1
Updated Feb 7, 2026Created Feb 7, 2026coding

Skill Snapshot

nameparallel-extract
descriptionURL content extraction via Parallel API. Extracts clean markdown from webpages, articles, PDFs, and JS-heavy sites. Use for reading specific URLs with LLM-ready output. OpenClaw Skills integration.
ownernormallygaussian
repositorynormallygaussian/parallel-extract
languageMarkdown
licenseMIT
topics
securityL1
installopenclaw add @normallygaussian/parallel-extract
last updatedFeb 7, 2026

Maintainer

normallygaussian

normallygaussian

Maintains parallel-extract in the OpenClaw Skills directory.

View GitHub profile
File Explorer
2 files
.
_meta.json
294 B
SKILL.md
3.8 KB
SKILL.md

name: parallel-extract description: "URL content extraction via Parallel API. Extracts clean markdown from webpages, articles, PDFs, and JS-heavy sites. Use for reading specific URLs with LLM-ready output." homepage: https://parallel.ai

Parallel Extract

Extract clean, LLM-ready content from URLs. Handles webpages, articles, PDFs, and JavaScript-heavy sites that need rendering.

When to Use

Trigger this skill when the user asks for:

  • "read this URL", "fetch this page", "extract from..."
  • "get the content from [URL]"
  • "what does this article say?"
  • Reading PDFs, JS-heavy pages, or paywalled content
  • Getting clean markdown from messy web pages

Use Search to discover; use Extract to read.

Quick Start

parallel-cli extract "https://example.com/article" --json

CLI Reference

Basic Usage

parallel-cli extract "<url>" [options]

Common Flags

FlagDescription
--url "<url>"URL to extract (repeatable, max 10)
--objective "<focus>"Focus extraction on specific content
--jsonOutput as JSON
--excerpts / --no-excerptsInclude relevant excerpts (default: on)
--full-content / --no-full-contentInclude full page content
--excerpts-max-chars NMax chars per excerpt
--excerpts-max-total-chars NMax total excerpt chars
--full-max-chars NMax full content chars
-o <file>Save output to file

Examples

Basic extraction:

parallel-cli extract "https://example.com/article" --json

Focused extraction:

parallel-cli extract "https://example.com/pricing" \
  --objective "pricing tiers and features" \
  --json

Full content for PDFs:

parallel-cli extract "https://example.com/whitepaper.pdf" \
  --full-content \
  --json

Multiple URLs:

parallel-cli extract \
  --url "https://example.com/page1" \
  --url "https://example.com/page2" \
  --json

Default Workflow

  1. Search with an objective + keyword queries
  2. Inspect titles/URLs/dates; choose the best sources
  3. Extract the specific pages you need (top N URLs)
  4. Answer using the extracted excerpts/content

Best-Practice Prompting

Objective

When extracting, provide context:

  • What specific information you're looking for
  • Why you need it (helps focus extraction)

Good: --objective "Find the installation steps and system requirements"

Poor: --objective "Read the page"

Response Format

Returns structured JSON with:

  • url — source URL
  • title — page title
  • excerpts[] — relevant text excerpts (if enabled)
  • full_content — complete page content (if enabled)
  • publish_date — when available

Output Handling

When turning extracted content into a user-facing answer:

  • Keep content verbatim — do not paraphrase unnecessarily
  • Extract ALL list items exhaustively
  • Strip noise: nav menus, footers, ads, "click here" links
  • Preserve all facts, names, numbers, dates, quotes
  • Include URL + publish_date for transparency

Running Out of Context?

For long conversations, save results and use sessions_spawn:

parallel-cli extract "<url>" --json -o /tmp/extract-<topic>.json

Then spawn a sub-agent:

{
  "tool": "sessions_spawn",
  "task": "Read /tmp/extract-<topic>.json and summarize the key content.",
  "label": "extract-summary"
}

Error Handling

Exit CodeMeaning
0Success
1Unexpected error (network, parse)
2Invalid arguments
3API error (non-2xx)

Prerequisites

  1. Get an API key at parallel.ai
  2. Install the CLI:
curl -fsSL https://parallel.ai/install.sh | bash
export PARALLEL_API_KEY=your-key

References

README.md

No README available.

Permissions & Security

Security level L1: Low-risk skills with minimal permissions. Review inputs and outputs before running in production.

Requirements

1. Get an API key at [parallel.ai](https://parallel.ai) 2. Install the CLI: ```bash curl -fsSL https://parallel.ai/install.sh | bash export PARALLEL_API_KEY=your-key ```

FAQ

How do I install parallel-extract?

Run openclaw add @normallygaussian/parallel-extract in your terminal. This installs parallel-extract into your OpenClaw Skills catalog.

Does this skill run locally or in the cloud?

OpenClaw Skills execute locally by default. Review the SKILL.md and permissions before running any skill.

Where can I verify the source code?

The source repository is available at https://github.com/openclaw/skills/tree/main/skills/normallygaussian/parallel-extract. Review commits and README documentation before installing.