Capture & curate a trace

You'll end up with a clean, honest reasoning record of an agent session — curated steps, real artifact lineage, and the gaps it left open — ready to grade, govern, and review.

0. The one-command path

If you just want the whole flow run for you, point the agent at:

ponens agent

It prints the full workflow (emit → curate → enrich → grade → govern) and the agent drives it. The rest of this guide is what that command does, step by step.

1. Emit — capture the session

An agent emits the trace from its own session transcript; you write nothing. Emission captures the actions, file lineage, decisions, and reasoning as ground truth.

ponens emit -o trace.json                      # newest session for this project
ponens emit transcript.jsonl -o trace.json --from claude-code

The atomic actions it records are never rewritten — they're the ungameable layer a reviewer can trust. Everything below curates a readable narrative on top of them.
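If you have several saved transcripts, the explicit form of the emit command can be looped. The sketch below is only an illustration: the `emit_all` function, the `PONENS` override variable, and the `.trace.json` output naming are hypothetical choices, not part of ponens. The only flags it uses are the ones shown above (`-o` and `--from claude-code`).

```shell
# Hypothetical helper: emit one trace per transcript in a directory.
# PONENS can be overridden (e.g. PONENS="echo ponens") to preview the commands.
PONENS="${PONENS:-ponens}"

emit_all() {
  dir="$1"
  for transcript in "$dir"/*.jsonl; do
    # With no matches the glob stays literal, so skip paths that do not exist.
    [ -e "$transcript" ] || continue
    # Output name is an assumption of this sketch: foo.jsonl -> foo.trace.json
    $PONENS emit "$transcript" -o "${transcript%.jsonl}.trace.json" --from claude-code
  done
}
```

Calling `PONENS="echo ponens"` and then `emit_all ./transcripts` prints the commands without running them, which is a cheap way to check the file names before emitting for real.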

2. Curate the narrative

Emission seeds each step's title from the raw directives ("yes", "ok now fix it"). Those are drafts — rewrite them into a clean account of what was built and why.

ponens trace meta ls trace.json                # the steps, and how curated each is
ponens trace meta set trace.json m3 --title "Add idempotency to capture" --status completed
ponens trace meta merge trace.json m7 m8       # fold a dead-end into one step
ponens trace retitle trace.json --title "…" --outcome "…"
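When a trace has many steps, it can be easier to draft the titles in a file and apply them in one pass. This is a minimal sketch, assuming a tab-separated file of step id and title; the `apply_titles` function, the file format, and the `PONENS` override variable are hypothetical. It only calls `ponens trace meta set … --title`, as shown above.

```shell
# Hypothetical helper: apply curated titles from a tab-separated file.
# Each line: <step-id><TAB><title>, e.g. "m3<TAB>Add idempotency to capture".
# PONENS can be overridden (e.g. PONENS="echo ponens") to preview the commands.
PONENS="${PONENS:-ponens}"

apply_titles() {
  trace="$1"
  titles="$2"
  tab=$(printf '\t')
  # The "|| [ -n "$id" ]" clause handles a final line with no trailing newline.
  while IFS="$tab" read -r id title || [ -n "$id" ]; do
    # Skip blank lines and lines that carry no title.
    [ -n "$id" ] && [ -n "$title" ] || continue
    # Redirect stdin so the command cannot consume the rest of the titles file.
    $PONENS trace meta set "$trace" "$id" --title "$title" < /dev/null
  done < "$titles"
}
```

Run `ponens trace meta ls trace.json` afterwards to confirm which steps still carry draft titles.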

3. Enrich — declare what emission can't derive

Bind the artifacts you produced (so the lineage DAG is real), and declare the residual surface — the negative space only you can name.

ponens trace artifact trace.json --type VerificationResult \
    --name "no double-charge" --producer-action-id 12

ponens trace residual add trace.json --kind assumption --severity high \
    --statement "Assumes the gateway sends a stable idempotency key" \
    --suggested-check "confirm the retry contract"

The residual surface is the single most useful thing you hand a reviewer — it tells them where to look instead of making them guess.

4. Grade it

Grading scores the trace against a quality rubric (structure, rationale coverage, negative space, reproducibility, verification evidence, lineage) and reports policy compliance on a separate line.

ponens trace grade trace.json

Treat the grade as a hygiene floor to clear, not a number to game. The real bars are reproduction and review.

5. Check it reproduces

ponens trace reproduce trace.json

Reproducibility is what makes the curated narrative credible instead of vibes — it ties the record to the actual commit.
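Once a trace is curated and enriched, the read-only checks from steps 2, 4, and 5 can be chained into one pre-review pass. The sketch below assumes that `ponens` exits with a non-zero status when a check fails, which this guide does not confirm; the `review_ready` function and the `PONENS` override variable are hypothetical. It deliberately does not re-run `ponens emit`, so it leaves an already curated `trace.json` untouched.

```shell
# Hypothetical helper: run the read-only checks on an existing trace.
# Assumption: each ponens command exits non-zero on failure.
# PONENS can be overridden (e.g. PONENS="echo ponens") to preview the commands.
PONENS="${PONENS:-ponens}"

review_ready() {
  trace="${1:-trace.json}"
  # "&&" stops the chain at the first command that fails.
  $PONENS trace meta ls "$trace" &&
    $PONENS trace grade "$trace" &&
    $PONENS trace reproduce "$trace"
}
```

A typical call would be `review_ready trace.json && echo "ready for review"`.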

Next: govern it with policies → or post it on the PR →