Natural-language harness directory
Agent Workspace
Browse verified harness packs with source links, evaluation guidance, and starter instructions for Claude Code, Codex, Cursor, and Convex.
Directory
3 packs
Injection Surface Audit
Every agent product ships injection surfaces. Audit them before an attacker does.
A systematic prompt-injection surface audit for agent-backed products. Stanford's March 2026 meta-harness study found roughly 1-in-4 community-authored skills contained a practical injection vector. This pack translates that finding into an actionable checklist: tool allow-lists, URL validation and SSRF guards, sanitization of content fetched from external sources, signed skill manifests, scoped permissions, and a review cadence. Target auditor: a senior engineer who owns an agent in production and has one afternoon to harden it.
Not yet measured
~0 installs·~0 tokens saved
Seven Safety Layers
Defense-in-depth for tool execution. Deny > ask > allow. All 7 layers, in order, with honest failure modes.
Security reference for Claude Code's 7-layer permission architecture, derived from the VILA-Lab Dive-into-Claude-Code paper (arXiv 2604.14228). The paper's anchor finding is that Claude Code is ~1.6% AI decision logic and ~98.4% deterministic infrastructure; the safety layers are a large slice of that infrastructure. The pack enumerates the 7 independent layers in order — (1) tool pre-filtering, (2) deny-first rule evaluation, (3) permission mode constraints across 7 modes (plan, default, acceptEdits, auto, dontAsk, bypassPermissions, bubble), (4) auto-mode ML classifier (yoloClassifier.ts two-stage fast-filter + chain-of-thought with a timeout race against a pre-computed classification), (5) shell sandboxing for filesystem and network isolation, (6) non-restoration on resume (permissions re-established each session; trust is not persistent), (7) PreToolUse hooks with permissionDecision return values. It also names the 4-stage authorization pipeline (pre-filter → hooks → rules → handler with 4 branches). The pack carries the paper's three recurring design commitments — graduated layering, append-only auditability, model judgment inside a deterministic harness — and documents the honest failure modes the paper flags: ~50 subcommands bypass shell-layer analysis, the classifier's timeout race can leave decisions ambiguous, and the pre-trust window allowed 4 published CVEs where extensions executed before the trust dialog appeared. Target: a staff engineer or security lead reviewing an agent's tool-execution surface before shipping.
Not yet measured
~0 installs·~0 tokens saved
CVE: The Pre-Trust Execution Window
4 CVEs share one root cause: extensions execute before the trust dialog renders.
A security pack targeting the pre-trust execution window documented in VILA-Lab's Dive into Claude Code (arXiv 2604.14228). Extensions — plugins, MCP servers, hooks, skill manifests — can execute code during CLI initialization *before* the trust dialog appears, creating a structurally privileged attack window that lives outside the deny-first permission pipeline. The paper's README and Safety section tie four patched CVEs to this root cause. This is NOT prompt injection at runtime — it is code execution at load time, with the user's full OS privileges, before the user has been shown what they are trusting. Pair with `injection-surface-audit` for runtime content attacks; use this pack when you are building or auditing a harness that loads third-party extensions. CC-BY-NC-SA-4.0 source terms apply.
Not yet measured
~0 installs·~0 tokens saved
Recent change traces · Cross-linked to packs
What changed, and why
Every coding session captured as rows — Scenario / Files / Changes / Why. Ctrl+F your own history.