Agent Workspace

Production-ready · evaluator-optimizer

Claude Advisor Pattern v2

Sonnet executes, Opus advises. Route intelligence by confidence, pay Opus only when it matters.

Evaluator-optimizer harness built on the Anthropic Advisor Pattern. Claude Sonnet 4.6 runs the task; on low-confidence decisions (threshold = 0.7), after two consecutive tool failures, or when the executor explicitly escalates, it calls Claude Opus 4.6 as an advisor. The advisor returns a structured recommendation; Sonnet applies it and continues. Cost split is measured — typical deployments spend 8-12% of tokens on Opus while retaining ~93% of an Opus-only pass-rate.
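The three escalation triggers named above can be sketched as a single predicate. This is a minimal illustration, not the pack's actual code: `ExecutorState`, its field names, and `should_call_advisor` are hypothetical, and treating "low-confidence" as strictly below 0.7 is an assumption.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.7          # threshold quoted in the description
MAX_CONSECUTIVE_TOOL_FAILURES = 2   # "two consecutive tool failures"


@dataclass
class ExecutorState:
    """Hypothetical per-step state reported by the executor loop."""
    confidence: float                   # self-reported confidence in [0.0, 1.0]
    consecutive_tool_failures: int = 0  # reset to 0 after any successful tool call
    explicit_escalation: bool = False   # executor asked for the advisor directly


def should_call_advisor(state: ExecutorState) -> bool:
    """Return True when any one of the three escalation triggers fires."""
    return (
        state.confidence < CONFIDENCE_THRESHOLD  # assumption: strict comparison
        or state.consecutive_tool_failures >= MAX_CONSECUTIVE_TOOL_FAILURES
        or state.explicit_escalation
    )
```

Keeping the decision in one pure function makes the routing policy easy to unit-test and to tune separately from the model calls themselves.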

Not yet measured

~0 installs·~0 tokens saved

harness · evaluator-optimizer · advisor

Agent Workspace · 2026-04-17

Recommended · orchestrator-workers

Turn Execution Pipeline

The 9-step loop, 5 context shapers, 3 recovery paths. The 98.4% infrastructure under every turn.

Canonical harness reference for Claude Code's turn-execution pipeline, derived from the VILA-Lab Dive-into-Claude-Code paper (arXiv 2604.14228). The paper's anchor finding is that Claude Code is ~1.6% AI decision logic and ~98.4% deterministic infrastructure; this pack documents the infrastructure. It names the 9 steps (settings resolution, state initialization, context assembly, five pre-model shapers, model call, tool dispatch, permission gate, tool execution, stop-condition check), the 5 pre-model context shapers in cheapest-first order (Budget Reduction, Snip, Microcompact, Context Collapse, Auto-Compact) with their triggers, and the 3 recovery mechanisms (max output token escalation up to 3 retries, reactive compaction firing at most once per turn, prompt-too-long overflow chain: context-collapse → reactive compaction → terminate). The pack maps directly onto the paper's three recurring design commitments: graduated layering over monolithic mechanisms, append-only designs favoring auditability over query power, and model judgment within a deterministic harness. Use it to implement, clone, or critique any production agent loop. Target: a staff engineer rebuilding the loop in a different language (Rust, Python, Bun) or auditing an existing one for compaction-race and stop-condition bugs.
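The prompt-too-long overflow chain described above (context-collapse, then reactive compaction, then terminate) could be modeled as a small per-turn state machine. This is a sketch under stated assumptions: `TurnState` and `next_overflow_action` are hypothetical names, and trying context-collapse only once per turn is an assumption. The description states the once-per-turn limit for reactive compaction only.

```python
from dataclasses import dataclass


@dataclass
class TurnState:
    """Hypothetical per-turn flags; a fresh instance is created for each turn."""
    context_collapse_tried: bool = False    # assumption: tried once per turn
    reactive_compaction_used: bool = False  # fires at most once per turn


def next_overflow_action(state: TurnState) -> str:
    """Pick the next step of the overflow chain and record that it was used."""
    if not state.context_collapse_tried:
        state.context_collapse_tried = True
        return "context-collapse"
    if not state.reactive_compaction_used:
        state.reactive_compaction_used = True
        return "reactive-compaction"
    # Both cheaper recoveries are exhausted for this turn.
    return "terminate"
```

Calling the function on each successive prompt-too-long error within one turn walks the chain in order and then keeps returning `"terminate"`.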

Not yet measured

~0 installs·~0 tokens saved

harness · turn-loop · context-compaction

Recommended · parallelization

Subagent Delegation — Three Isolation Modes

SkillTool vs AgentTool, 6 built-in types, 3 isolation modes. The knob every harness engineer gets wrong first.

Harness pack covering Claude Code's subagent architecture, derived from the VILA-Lab architectural analysis (arXiv 2604.14228). Frames the SkillTool vs AgentTool trade-off as the central decision: SkillTool injects instructions into the current context (cheap, same window); AgentTool spawns a fresh isolated conversation (~7x tokens, context-safe). Documents the 6 built-in subagent types (Explore, Plan, General-purpose, Claude Code Guide, Verification, Statusline-setup), custom `.claude/agents/*.md` definitions with YAML frontmatter (tools, model, permissionMode, hooks, skills, memory scope), the 3 isolation modes (worktree, remote, and in-process, which is the default), sidechain transcripts as separate JSONL files, and multi-instance coordination via POSIX flock() with zero external dependencies.
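Lock-file coordination of the kind mentioned above can be sketched with Python's standard `fcntl.flock`, which wraps the same system call on Unix-like systems. The helper names `instance_lock` and `is_locked` and the lock-file layout are illustrative assumptions, not Claude Code's actual implementation.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def instance_lock(path: str):
    """Hold an exclusive advisory lock on `path` for the duration of the block."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until the lock is available
        yield fd
    finally:
        os.close(fd)  # closing the descriptor releases the lock


def is_locked(path: str) -> bool:
    """Probe, without blocking, whether some other holder has the lock."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        return True   # another open file description holds the lock
    else:
        fcntl.flock(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)
```

Because the lock lives in the kernel and is tied to the open file, it needs no daemon or external service, and it is released automatically if the holding process exits.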

Not yet measured

~0 installs·~0 tokens saved

harness · parallelization · subagents

harness · persistence · session

Recommended

Session Persistence — Three Channels

Append-only JSONL across 3 channels. Permissions never restore on resume — the friction IS the safety.

Harness pack covering Claude Code's session persistence design derived from the VILA-Lab architectural analysis (arXiv 2604.14228). Documents the three persistence channels: append-only session JSONL transcripts (full conversation with chain-patched compaction boundaries), global `history.jsonl` for cross-session prompt recall (reverse-read for Up-arrow), and subagent sidechains as separate JSONL per subagent. Frames the critical deliberate non-feature: permissions are never restored on resume — trust is always re-established in the current session. The paper presents this as a design choice, not a UX bug: the user friction is the cost of maintaining the safety invariant. The pack also captures the append-only / chain-patching trade-off — auditability and simplicity over query power.
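The append-only history channel with reverse-read recall can be illustrated in a few lines. This is a simplified sketch: the `prompt` field name and both function names are hypothetical, and reading the whole file before reversing is a shortcut that a real implementation would likely replace with a seek-from-end reader.

```python
import json
from pathlib import Path


def append_entry(path: Path, entry: dict) -> None:
    """Append one JSON object as a single line; existing lines are never rewritten."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")


def recent_prompts(path: Path, limit: int = 10) -> list[str]:
    """Return up to `limit` prompts, newest first, by walking the file backwards."""
    if not path.exists():
        return []
    prompts: list[str] = []
    # Simplification: loads the whole file into memory before reversing.
    for line in reversed(path.read_text(encoding="utf-8").splitlines()):
        if not line.strip():
            continue
        prompts.append(json.loads(line)["prompt"])  # hypothetical field name
        if len(prompts) == limit:
            break
    return prompts
```

Since entries are only ever appended, the newest prompt is always the last line, which is why a reverse read suits Up-arrow style recall.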

Not yet measured

~0 installs·~0 tokens saved

reference · architecture · design-space

Community · reference

Recommended

Agent Design Space — Six Decisions

The architect's framework: answer these six before picking a pattern.

A reference pack that walks builders through the six recurring design decisions every production coding agent must answer: reasoning placement, safety posture, context management, extensibility, subagent architecture, and session persistence. Each decision pairs Claude Code's answer with realistic alternatives from LangGraph, Devin, SWE-Agent, OpenHands, and Aider, then closes with a Meta-Pattern of three commitments (graduated layering, append-only, model-judgment + deterministic harness). Frame this as 'the pack about picking packs' — answer all six first, then let the catalog serve specific patterns that match your posture. Sourced from VILA-Lab/Dive-into-Claude-Code build-your-own-agent.md (arXiv 2604.14228, CC-BY-NC-SA-4.0).

Not yet measured

~0 installs·~0 tokens saved