by sanvibyfish

doc-search – OpenClaw Skill

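doc-search is an OpenClaw Skills integration for coding workflows. It is a document retrieval skill for searching and locating information in project files. Use it when you need to find keywords in local documents, search the contents of project files, locate specific information, or get a quick overview of a project's documentation structure. It supports multi-strategy search (filename, title, content), incremental indexing, and context return, and works with Markdown, text files, code files, and similar formats.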

3.1k stars | 9.4k forks | Security L1
Updated Feb 7, 2026 | Created Feb 7, 2026 | coding

Skill Snapshot

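name: doc-search
description: Document retrieval skill for searching and locating information in project files. Use it when you need to find keywords in local documents, search the contents of project files, locate specific information, or get a quick overview of a project's documentation structure. Supports multi-strategy search (filename, title, content), incremental indexing, and context return. Works with Markdown, text files, code files, and similar formats. OpenClaw Skills integration.
owner: sanvibyfish
repository: sanvibyfish/doc-search-skill
language: Markdown
license: MIT
topics:
security: L1
install: openclaw add @sanvibyfish/doc-search-skill
last updated: Feb 7, 2026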

Maintainer

sanvibyfish

Maintains doc-search in the OpenClaw Skills directory.

File Explorer
10 files
.
  references/
    search-strategies.md (2.2 KB)
  scripts/
    indexer.py (8.1 KB)
    quick_search.sh (1.2 KB)
    search.py (10.8 KB)
  _meta.json (283 B)
  README.md (1.7 KB)
  README.zh.md (1.7 KB)
  SKILL.md (3.3 KB)

SKILL.md

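name: doc-search
description: Document retrieval skill for searching and locating information in project files. Use it when you need to find keywords in local documents, search the contents of project files, locate specific information, or get a quick overview of a project's documentation structure. Supports multi-strategy search (filename, title, content), incremental indexing, and context return. Works with Markdown, text files, code files, and similar formats.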

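Doc Search: Document Retrieval

Lightweight local document retrieval with no vector database required.

Core capabilities

  1. Multi-strategy search - filename > title/frontmatter > body content, ranked by relevance
  2. Incremental indexing - based on file modification time, so only changed files are updated
  3. Context return - returns each matching line plus the N lines before and after it
  4. TF-IDF ranking - optional lightweight relevance ranking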
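
The incremental indexing idea in item 2 can be illustrated with a minimal sketch. The index layout here (a dict mapping each relative path to its `mtime` and `text`) and the function name `update_index` are assumptions made for illustration; they are not the format or API of scripts/indexer.py.

```python
import os
from pathlib import Path


def update_index(root, index):
    """Refresh `index` in place and return the paths that were re-read.

    `index` maps a relative path to {"mtime": float, "text": str}.
    This layout is hypothetical, chosen only to show the mtime check.
    """
    root = Path(root)
    seen, changed = set(), []
    for path in sorted(root.rglob("*.md")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        seen.add(rel)
        mtime = path.stat().st_mtime
        entry = index.get(rel)
        if entry is not None and entry["mtime"] == mtime:
            continue  # unchanged since the last run, so skip re-reading
        text = path.read_text(encoding="utf-8", errors="replace")
        index[rel] = {"mtime": mtime, "text": text}
        changed.append(rel)
    # Drop entries whose files no longer exist on disk.
    for rel in list(index):
        if rel not in seen:
            del index[rel]
    return changed
```

Calling `update_index` twice in a row on an untouched directory should return an empty list the second time, which is the whole point of keying on modification time.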

Quick start

Direct search (no index required)

# Simple keyword search
python scripts/search.py "keyword" /path/to/docs

# With context lines
python scripts/search.py "keyword" /path/to/docs --context 3

# Restrict file types
python scripts/search.py "keyword" /path/to/docs --types md,txt

Using an index (recommended for large projects)

# Build the index
python scripts/indexer.py /path/to/docs --output index.json

# Search against the index (faster)
python scripts/search.py "keyword" --index index.json

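Search strategy

Results are returned in the following priority order:

| Priority | Match type | Weight |
|---|---|---|
| 1 | Exact filename match | 100 |
| 2 | Filename contains query | 80 |
| 3 | Title/H1 match | 70 |
| 4 | Frontmatter match | 60 |
| 5 | Body content match | 40 |

Weights accumulate when a file matches more than once.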
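
As a rough illustration of how the weights in the table might accumulate, here is a minimal sketch. The function name `score_document` is hypothetical, and two details are assumptions rather than documented behavior: an exact filename match and a "contains" match are treated as mutually exclusive, and every matching line adds its weight once.

```python
from pathlib import PurePosixPath

# Weights copied from the table above.
WEIGHTS = {
    "filename_exact": 100,
    "filename_contains": 80,
    "title": 70,
    "frontmatter": 60,
    "content": 40,
}


def score_document(query, path, text):
    """Return an accumulated score for one file (illustrative only)."""
    q = query.lower()
    score = 0
    stem = PurePosixPath(path).stem.lower()
    if stem == q:
        score += WEIGHTS["filename_exact"]
    elif q in stem:
        score += WEIGHTS["filename_contains"]

    lines = text.splitlines()
    # Find where a leading '---' ... '---' frontmatter block ends, if any.
    fm_end = 0
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                fm_end = i + 1
                break

    for i, line in enumerate(lines):
        if q not in line.lower():
            continue
        if i < fm_end:
            score += WEIGHTS["frontmatter"]
        elif line.startswith("# "):
            score += WEIGHTS["title"]
        else:
            score += WEIGHTS["content"]
    return score
```

Under these assumptions, a file named guide.md whose H1 and one body line both contain "guide" would score 100 + 70 + 40.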

Output format

{
  "query": "search term",
  "total": 5,
  "results": [
    {
      "file": "docs/guide.md",
      "score": 150,
      "matches": [
        {
          "type": "title",
          "line": 1,
          "content": "# User Guide",
          "context": ["---", "title: User Guide", "---"]
        },
        {
          "type": "content",
          "line": 42,
          "content": "A note about the search term",
          "context": ["previous line", "A note about the search term", "next line"]
        }
      ]
    }
  ]
}
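
The `line`, `content`, and `context` fields of a content match could be produced along the following lines. This is a sketch under stated assumptions: `match_with_context` is a hypothetical helper, matching is a plain case-sensitive substring test, and it is not taken from scripts/search.py.

```python
def match_with_context(lines, query, context_lines=1):
    """Return one match record per line containing `query`.

    Each record mirrors the shape of a "content" match in the output
    format: a 1-based line number, the line itself, and a window of
    surrounding lines clipped at the file boundaries.
    """
    results = []
    for i, line in enumerate(lines):
        if query not in line:
            continue
        start = max(0, i - context_lines)
        end = min(len(lines), i + context_lines + 1)
        results.append(
            {
                "type": "content",
                "line": i + 1,
                "content": line,
                "context": lines[start:end],
            }
        )
    return results
```

Clipping with `max` and `min` keeps the window valid when a match sits on the first or last line of a file.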

Configuration file (optional)

Create .docsearch.yaml in the project root:

# Directories and files to index
include:
  - docs/
  - README.md
  - src/**/*.md

# Patterns to exclude
exclude:
  - node_modules/
  - "*.min.js"
  - .git/

# File types
types:
  - md
  - txt
  - py
  - js

# Index options
index:
  max_file_size: 1mb
  extract_frontmatter: true
  extract_headings: true
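
To make the effect of `exclude`, `types`, and `max_file_size` concrete, here is a minimal sketch that applies them to a single path. It assumes the YAML has already been loaded into a dict (for example with PyYAML's `yaml.safe_load`), that a trailing slash marks a directory name, and that `1mb` means 1024 * 1024 bytes. The helper names are hypothetical, and `include` handling is left out.

```python
import fnmatch
from pathlib import PurePosixPath

# The relevant parts of the example config, as an already-parsed dict.
CONFIG = {
    "exclude": ["node_modules/", "*.min.js", ".git/"],
    "types": ["md", "txt", "py", "js"],
    "index": {"max_file_size": "1mb"},
}


def parse_size(text):
    """Convert '1mb' style strings to bytes (binary units assumed)."""
    units = {"kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}
    text = text.strip().lower()
    for suffix, factor in units.items():
        if text.endswith(suffix):
            return int(float(text[: -len(suffix)]) * factor)
    return int(text)


def is_excluded(rel_path, patterns):
    """True if a directory pattern or filename glob matches the path."""
    path = PurePosixPath(rel_path)
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: match any parent directory name.
            if pattern.rstrip("/") in path.parts[:-1]:
                return True
        elif fnmatch.fnmatchcase(path.name, pattern):
            return True
    return False


def should_index(rel_path, size_bytes, config=CONFIG):
    """Combine the exclude, type, and size checks for one file."""
    if is_excluded(rel_path, config["exclude"]):
        return False
    suffix = PurePosixPath(rel_path).suffix.lstrip(".")
    if suffix not in config["types"]:
        return False
    return size_bytes <= parse_size(config["index"]["max_file_size"])
```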

Advanced usage

Combining with Bash tools

When an index is not needed, use ripgrep directly:

# Quick search
rg "keyword" --type md -C 2 --json

# Search headings
rg "^#.*keyword" --type md

# Search frontmatter
rg -U "^---[\s\S]*?keyword[\s\S]*?^---" --type md
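
The `--json` flag makes ripgrep emit one JSON object per line, with message types such as `begin`, `match`, `context`, and `end`. A minimal sketch of turning the `match` messages into (path, line number, text) tuples follows; `parse_rg_json` is a hypothetical helper, not part of this skill.

```python
import json


def parse_rg_json(output):
    """Extract (path, line_number, text) from `rg --json` output."""
    hits = []
    for raw in output.split("\n"):
        if not raw.strip():
            continue
        message = json.loads(raw)
        if message.get("type") != "match":
            continue  # skip begin/context/end/summary messages
        data = message["data"]
        path = data["path"].get("text")
        text = data["lines"].get("text")
        if path is None or text is None:
            # Non-UTF-8 data arrives base64-encoded under "bytes"; skipped here.
            continue
        hits.append((path, data["line_number"], text.rstrip("\n")))
    return hits
```

One way to feed it is `subprocess.run(["rg", "keyword", "--type", "md", "--json"], capture_output=True, text=True).stdout`, assuming ripgrep is installed and on the PATH.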

Python API

from scripts.search import DocSearch

searcher = DocSearch("/path/to/docs")
results = searcher.search("keyword", context_lines=2)

for r in results:
    print(f"{r['file']} (score: {r['score']})")
    for m in r['matches']:
        print(f"  L{m['line']}: {m['content']}")

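Use cases

  • Quickly locating project documentation
  • Finding comments and docs in a codebase
  • Searching the contents of a notes system
  • Searching configuration files

Limitations

  • No semantic search (that would require a vector database)
  • Chinese word segmentation relies on simple splitting
  • Large files (>1MB) are best skipped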
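
To illustrate what "simple splitting" can mean for mixed Chinese and English text, here is one possible approach: runs of ASCII letters, digits, and underscores become lowercase word tokens, and each CJK character becomes its own token. This is an assumption about the general technique, not a description of what scripts/search.py does, and `tokenize` is a hypothetical name.

```python
import re

# ASCII word runs, or a single character from the CJK Unified Ideographs block.
_TOKEN = re.compile(r"[a-z0-9_]+|[\u4e00-\u9fff]")


def tokenize(text):
    """Split text into lowercase ASCII words and single CJK characters."""
    return _TOKEN.findall(text.lower())
```

Per-character splitting needs no dictionary, but it cannot tell multi-character words apart, which is why it is listed as a limitation.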
README.md

Doc Search Skill

Lightweight local document search without a vector database. This repo includes a Codex skill definition plus simple Python tools for indexing and searching docs.

What it does

  • Multi-strategy search: filename, title/headings, frontmatter, content
  • Incremental indexing based on file mtime
  • Context lines around matches
  • Simple scoring and optional TF-IDF hooks
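
For readers unfamiliar with TF-IDF, the following sketch shows one common smoothed variant applied to pre-tokenized documents. The function name `tfidf_scores` and the exact formula are assumptions for illustration; the README does not specify how the optional hooks compute their scores.

```python
import math
from collections import Counter


def tfidf_scores(query_terms, docs):
    """Score each document against the query terms.

    `docs` maps a document name to its list of tokens. Term frequency
    is the term count divided by document length, and inverse document
    frequency uses the smoothed form log((1 + N) / (1 + df)) + 1.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))  # count each term once per document

    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        length = len(tokens) or 1
        score = 0.0
        for term in query_terms:
            if counts[term] == 0:
                continue
            tf = counts[term] / length
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1
            score += tf * idf
        scores[name] = score
    return scores
```

Documents that never mention a query term score 0.0, and a term concentrated in a short document outranks the same term diluted in a longer one.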

Structure

  • SKILL.md: Codex skill description and usage
  • scripts/search.py: search tool (direct search or index-based)
  • scripts/indexer.py: builds an index (JSON)
  • scripts/quick_search.sh: fast rg-based search shortcut

Requirements

  • Python 3.8+ (for search.py and indexer.py)
  • ripgrep (optional, for quick_search.sh)

Quick start

Direct search (no index):

python scripts/search.py "query" /path/to/docs

With context lines:

python scripts/search.py "query" /path/to/docs --context 3

Limit file types:

python scripts/search.py "query" /path/to/docs --types md,txt

Index-based search (recommended for large repos)

Build the index:

python scripts/indexer.py /path/to/docs --output index.json

Search using the index:

python scripts/search.py "query" --index index.json

Output formats

search.py supports:

  • --format json
  • --format simple
  • --format files

Example:

python scripts/search.py "query" /path/to/docs --format json

Notes

  • Default file types: md, txt, rst, py, js, ts, yaml, yml, json
  • Default excludes: .git, node_modules, __pycache__, .venv, venv
  • Large files (>1MB) are skipped by the indexer

License

No license file is included yet. Add one if you plan to distribute.

Permissions & Security

Security level L1: Low-risk skills with minimal permissions. Review inputs and outputs before running in production.

Requirements

  • OpenClaw CLI installed and configured.
  • Language: Markdown
  • License: MIT
  • Topics:

FAQ

How do I install doc-search?

Run openclaw add @sanvibyfish/doc-search-skill in your terminal. This installs doc-search into your OpenClaw Skills catalog.

Does this skill run locally or in the cloud?

OpenClaw Skills execute locally by default. Review the SKILL.md and permissions before running any skill.

Where can I verify the source code?

The source repository is available at https://github.com/openclaw/skills/tree/main/skills/sanvibyfish/doc-search-skill. Review commits and README documentation before installing.