Answer Review and Quality Checks
Persist the answer as a reviewable artifact, not just a message string.
Install
One-line install
npx attrition-sh pack install answer-review-and-quality-checks
AGENTS.md snippet (Claude Code / Cursor)
# Answer Review and Quality Checks
# See: /packs/answer-review-and-quality-checks
Telemetry
Not yet measured
Summary
A runtime pattern that persists final answers, quality checks, evaluation metadata, and downstream review state as first-class records instead of burying them in message text.
Fit and expected payoff
When this pack earns its extra structure, when to skip it, and what it should improve.
Use when
Situations where this pack earns its extra structure.
- Answers may need post-run review, escalation, or auditing.
- You want live evaluation to exercise the same runtime as the product.
- The app needs a deploy gate or quality dashboard.
Avoid when
Keeps the pack from becoming a default hammer.
- The output is disposable and does not need later inspection.
- You are still in the earliest sketch phase and do not yet know the domain rubric.
What it improves
Expected outcomes if implemented well.
- Quality becomes measurable and replayable.
- The answer contract is separated from the chat transcript.
- Runtime checks and eval checks can share a common packet shape.
Minimal instructions
Smallest useful starting point.
Persist a final answer packet for every meaningful run. The packet should include:
- final answer
- scope and references
- quality checks
- trace pointers
- evaluation linkage
Do not rely on chat text alone as the system of record.
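One way to picture the packet is as a small typed record that serializes to JSON. The sketch below is only an illustration: the class names `AnswerPacket` and `QualityCheck`, every field name, and the sample values are assumptions made for this example, not a schema defined by the pack.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class QualityCheck:
    name: str          # hypothetical check identifier, e.g. "has_references"
    passed: bool
    detail: str = ""


@dataclass
class AnswerPacket:
    # Field names are assumptions that mirror the five items listed above.
    final_answer: str
    scope: str
    references: List[str] = field(default_factory=list)
    quality_checks: List[QualityCheck] = field(default_factory=list)
    trace_ids: List[str] = field(default_factory=list)      # trace pointers
    eval_run_ids: List[str] = field(default_factory=list)   # evaluation linkage
    packet_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        # asdict() recurses into the nested QualityCheck dataclasses.
        return json.dumps(asdict(self), sort_keys=True)


# Illustrative values only.
packet = AnswerPacket(
    final_answer="Example answer text.",
    scope="example scope",
    references=["doc:example"],
    quality_checks=[QualityCheck("has_references", True)],
    trace_ids=["trace-001"],
)
record = json.loads(packet.to_json())
```

Because the record is plain JSON, it can be stored in whatever the application already uses for data, separate from the chat transcript.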
Full instructions
Complete natural-language instruction set.
Treat the answer as a durable application artifact. For every non-trivial run:
1. Persist the final answer to an answer packet.
2. Attach references, quality checks, and trace metadata.
3. Link eval runs and live scoring back to the same packet shape.
4. Surface packet status in the UI so operators can see whether the answer passed or failed checks.
This should support:
- replay
- review
- later analytics
- quality gating before deployment
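The four steps above could be sketched against a small SQLite table, as below. This is a minimal sketch under stated assumptions, not the pack's implementation: the table name `answer_packets`, the status values `passed` / `failed` / `unchecked`, and all function names are hypothetical. The `status` column stands in for what a UI would surface to operators.

```python
import json
import sqlite3
from typing import List, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS answer_packets ("
        "packet_id TEXT PRIMARY KEY, body TEXT NOT NULL, status TEXT NOT NULL)"
    )
    return conn


def packet_status(checks: List[dict]) -> str:
    # Assumed rule: a packet passes only if it has checks and none failed.
    if not checks:
        return "unchecked"
    return "passed" if all(c["passed"] for c in checks) else "failed"


def save_packet(conn: sqlite3.Connection, packet: dict) -> str:
    # Steps 1, 2 and 4: persist the packet with its checks and a queryable status.
    status = packet_status(packet["quality_checks"])
    conn.execute(
        "INSERT OR REPLACE INTO answer_packets VALUES (?, ?, ?)",
        (packet["packet_id"], json.dumps(packet), status),
    )
    conn.commit()
    return status


def load_packet(conn: sqlite3.Connection, packet_id: str) -> Tuple[dict, str]:
    row = conn.execute(
        "SELECT body, status FROM answer_packets WHERE packet_id = ?", (packet_id,)
    ).fetchone()
    if row is None:
        raise KeyError(packet_id)
    return json.loads(row[0]), row[1]


def link_eval_run(conn: sqlite3.Connection, packet_id: str, eval_run_id: str) -> None:
    # Step 3: record the eval linkage on the same packet instead of a side log.
    packet, _ = load_packet(conn, packet_id)
    packet.setdefault("eval_run_ids", []).append(eval_run_id)
    save_packet(conn, packet)


conn = open_store()
status = save_packet(conn, {
    "packet_id": "pkt-1",
    "final_answer": "Example answer text.",
    "references": ["doc:example"],
    "quality_checks": [{"name": "has_references", "passed": True}],
    "trace_ids": ["trace-001"],
    "eval_run_ids": [],
})
link_eval_run(conn, "pkt-1", "eval-run-1")
stored, stored_status = load_packet(conn, "pkt-1")
```

Keeping `status` as its own column means replay, review, analytics, and a deploy gate can all query packets without parsing message text.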
Evaluation checklist
These checks should pass before you consider the pattern production-ready.
- Is there a persisted answer packet for each completed assistant run?
- Do runtime checks and eval checks share a visible schema?
- Can a later reviewer inspect packet quality without reading the raw message transcript?
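The checklist can be turned into a few automated assertions. The sketch below assumes packets are dictionaries keyed by run ID and that the shared schema is the hypothetical key set `REQUIRED_KEYS`; none of these names come from the pack itself.

```python
from typing import Dict, Iterable, List

# Assumed shared schema for both runtime and eval packets.
REQUIRED_KEYS = {
    "final_answer", "scope", "references",
    "quality_checks", "trace_ids", "eval_run_ids",
}


def runs_missing_packets(
    completed_run_ids: Iterable[str], packets_by_run: Dict[str, dict]
) -> List[str]:
    # Check 1: every completed run should have a persisted packet.
    return sorted(r for r in completed_run_ids if r not in packets_by_run)


def schema_gaps(packet: dict) -> List[str]:
    # Check 2: runtime and eval packets should expose the same keys.
    return sorted(REQUIRED_KEYS - set(packet))


def review_summary(packet: dict) -> dict:
    # Check 3: what a reviewer can see without opening the transcript.
    failed = [c["name"] for c in packet["quality_checks"] if not c["passed"]]
    return {"reference_count": len(packet["references"]), "failed_checks": failed}


example_packet = {
    "final_answer": "Example answer text.",
    "scope": "example scope",
    "references": ["doc:example"],
    "quality_checks": [
        {"name": "has_references", "passed": True},
        {"name": "within_scope", "passed": False},
    ],
    "trace_ids": [],
    "eval_run_ids": [],
}
missing = runs_missing_packets(["run-1", "run-2"], {"run-1": example_packet})
summary = review_summary(example_packet)
```

Here `missing` lists the runs that still need a packet, and `summary` carries enough for a reviewer to triage the answer.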
Common failure modes
Every check below traces back to a specific production failure. Read as: "I would think about X because in production Y can happen."
- Mid: Quality checks exist only in logs, not in app data.
  - Trigger: (legacy entry, trigger not separated)
  - Prevention: (legacy entry, no explicit prevention)
- Mid: The final answer is stored as unstructured chat text only.
  - Trigger: (legacy entry, trigger not separated)
  - Prevention: (legacy entry, no explicit prevention)
- Mid: Evaluation runs test a different runtime than the product uses.
  - Trigger: (legacy entry, trigger not separated)
  - Prevention: (legacy entry, no explicit prevention)
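The source records no prevention for these failure modes, so the following is only one possible approach, offered as an assumption. It routes live traffic and eval runs through the same entry point, so both produce packets of the same shape, and it derives a deploy gate from the eval packets. `answer_question` is a hypothetical stand-in for the product runtime, and the check names and the `min_pass_rate` threshold are illustrative.

```python
from typing import List


def answer_question(question: str) -> dict:
    # Hypothetical stand-in for the product runtime; a real one would call a model.
    return {"final_answer": f"Echo: {question}", "references": ["doc:placeholder"]}


def run_checks(packet: dict) -> List[dict]:
    # The same checks run for live and eval packets, stored as data rather than logs.
    return [
        {"name": "non_empty_answer", "passed": bool(packet["final_answer"].strip())},
        {"name": "has_references", "passed": len(packet["references"]) > 0},
    ]


def build_packet(question: str, source: str) -> dict:
    # Single entry point: evals exercise exactly the code path the product uses.
    packet = answer_question(question)
    packet["source"] = source  # "live" or "eval"
    packet["quality_checks"] = run_checks(packet)
    return packet


def deploy_gate(eval_packets: List[dict], min_pass_rate: float = 1.0) -> bool:
    # Block deployment when there is no evidence or too many packets fail.
    if not eval_packets:
        return False
    passed = sum(
        all(c["passed"] for c in p["quality_checks"]) for p in eval_packets
    )
    return passed / len(eval_packets) >= min_pass_rate


live_packet = build_packet("example live question", source="live")
eval_packets = [build_packet(q, source="eval") for q in ["eval q1", "eval q2"]]
can_deploy = deploy_gate(eval_packets)
```

Because `build_packet` is the only way a packet gets made, a divergence between the eval harness and the product would require deliberately bypassing it.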