Ponens Trace Specification
Version
Version: 1.6
Status: Draft
Format: Canonical typed specification with JSON/Pydantic projection notes
Positioning: Reasoner-agnostic trace specification, with IML / ImandraX as one concrete instantiation
Changes in 1.6 (additive, backward-compatible). Adds Meta-actions (§8.4) — an optional, typed overlay that groups the atomic
actionsinto units of intent (a goal → steps → tool-calls hierarchy), so a trace can be read, reviewed, and graded at the level of what was being attempted rather than only what tool ran. The atomic actions remain the ground-truth record; meta-actions are a producer’s claim about their structure, carrying their own intent, outcome, residuals, and produced artifacts. Existing 1.5 traces remain valid;meta_actionscanonicalizes to the empty list andmeta_action_idis optional.
Changes in 1.5 (additive, backward-compatible). Adds the Residual Surface (§13) — a first-class, typed record of a trace’s negative space: the assumptions it relied on, the claims it left unverified, what it deliberately left out of scope, its known limitations, and the questions it defers to review. This exists to support review, and in particular agent-to-agent handoff, where the consuming agent needs to know where to point rather than re-deriving the whole trace. Existing 1.4 traces remain valid;
residualscanonicalizes to the empty list.
1. Purpose
A trace is the complete formal record of an AI agent’s work session.
It captures not only what the agent did, but also:
- which artifacts existed at each step
- how those artifacts were transformed
- what reasoning justified each consequential action
- which policies were checked over the execution
- how every output can be traced back through its full dependency chain
- how key reasoning outcomes can be independently reviewed and, where supported, reproduced
The trace format is designed to support:
- artifact lineage
- formal reasoning provenance
- policy-checkable execution
- audit and replay
- system-level reasoning over agent workflows
- reasoner-agnostic integration, with concrete engines represented via metadata
This specification treats a trace as more than an activity log. A trace is a typed execution record that can be analyzed, validated, and, in future, formally verified.
2. Canonical Model vs Interchange Model
This specification distinguishes between two layers:
2.1 Canonical model
The canonical model is the semantic source of truth.
It is:
- strongly typed
- algebraic where appropriate
- designed for formal reasoning
- the model against which invariants and semantics are defined
The canonical model should use:
- algebraic variants for closed enums and tagged unions
- typed records for structured objects
- typed constructors for actions and artifacts
- explicit payload types for structured reasoning objects
2.2 Interchange model
The interchange model is a JSON-friendly representation derived from the canonical model.
It exists to support:
- file persistence
- API transport
- UI integration
- Pydantic model generation
- cross-language interoperability
2.3 Design rule
Specify the strongest semantic model first; derive the wire format from it, not the other way around.
The canonical model is authoritative.
JSON and Pydantic schemas are projections of that model.
3. Design Principles
The trace model is built around eight core principles:
-
Typed artifacts
All meaningful objects in the trace are represented explicitly as typed artifacts. -
Producer-consumer lineage
Actions consume input artifacts and produce output artifacts, forming a directed acyclic graph (DAG). -
Explicit reasoning steps
Formalization, reasoning goals, reasoning results, state-space analyses, conformance checks, simulations, and generated tests are first-class trace objects. -
Policy evaluation
Engineering, safety, and governance requirements are represented explicitly and evaluated over the trace. -
Execution semantics
The trace records a sequence of state-changing actions, not just a flat activity history. -
Collaborative and iterative review
Traces may accumulate structured human commentary and may be linked into explicit chains of reruns, fixes, and re-evaluations over time. -
Reasoner-agnostic semantics
Core action and artifact types describe the semantic reasoning operation. Specific engines, model languages, and methods are recorded through metadata. -
Strict internal semantics
The specification prefers typed semantic constructors over loosely typed payload objects, even when the interchange format remains JSON-shaped.
4. Semantic Layer vs Implementation Layer
This specification is intentionally reasoner-agnostic.
4.1 Semantic layer
The core schema uses semantic action and artifact types such as:
FormalizeDefineVerificationGoalVerifyStateSpaceAnalysisConformanceCheckCoSimulateGenerateTests
and artifacts such as:
FormalizationFormalModelVerificationGoalVerificationResultStateSpaceAnalysisResultConformanceResultCoSimulationResultGeneratedTests
These identify what kind of reasoning operation occurred.
4.2 Implementation layer
Concrete engines and methods identify how that operation was realized.
Examples:
tool = "imandrax"format = "iml"model_language = "iml"method = "region_decomposition"
4.3 Example
A trace may record:
- semantic action:
StateSpaceAnalysis - concrete implementation:
tool = "imandrax",method = "region_decomposition"
This means the trace remains portable while preserving the specific capabilities of a concrete reasoner.
5. Canonical Top-Level Structure
The canonical trace is a typed record.
type trace =
{ trace_id : string
; spec_version : string
; assistant : string
; model : string
; timestamp : string
; trigger : event
; actions : action list
; meta_actions : meta_action list
; outcome : event
; artifacts : artifact list
; reference_artifacts : reference_artifact list
; policies : policy list
; policy_evaluations : policy_evaluation list
; execution_environments : execution_environment list
; reproducibility : trace_reproducibility option
; comments : comment list
; review_items : review_item list
; residuals : residual list
; trace_links : trace_link list
; trace_lineage : trace_lineage option
; files_modified : string list
; metrics : metrics option
}
residuals is the trace’s residual surface — its declared negative space (§13).
5.1 Metrics
type metrics =
{ total_actions : int option
; decision_points : int option
; parallel_blocks : int option
; loops : int option
; max_loop_iterations : int option
}
Lists are canonicalized as empty lists rather than omitted values.
6. Events
Events mark the start and end of the process.
6.1 Canonical model
type event_type =
| TaskReceived
| TriggeredByEvent
| ProcessCompleted
| ProcessAborted
| ProcessInterrupted
type event =
{ typ : event_type
; description : string option
; summary : string option
; from_user : string option
; reason : string option
}
6.2 Event semantics
Events are not generic log lines. They delimit the lifecycle of a trace:
- the
triggerevent identifies how the process began - the
outcomeevent identifies how the process ended
7. Artifacts
Artifacts are the core typed objects of the trace.
Actions do not exchange free-form names. They exchange artifact identities.
7.1 Canonical artifact model
The canonical artifact model is strictly typed.
Common artifact fields
type artifact_common =
{ artifact_id : string
; artifact_role : artifact_role option
; name : string option
; format : string option
; revision : int option
; producer_action_id : int option
; derived_from : string list
; supersedes : string option
; content_ref : string option
; summary : string option
; metadata : artifact_metadata option
}
Artifact roles
Artifact roles are semantic tags used by policy evaluation and reasoning.
type artifact_role =
| FormalModelRole
| ApprovedReferenceRole
| ReasoningGoalRole
| ReasoningResultRole
| ProofRole
| CounterexampleRole
| StateSpaceAnalysisRole
| GeneratedTestRole
| AuditEvidenceRole
| CustomArtifactRole of string
Strict artifact type
type artifact =
| UserInstructionArtifact of artifact_common
| SourceCodeArtifact of artifact_common
| DocumentationArtifact of artifact_common
| SearchResultsArtifact of artifact_common
| AnalysisNoteArtifact of artifact_common
| PlanArtifact of artifact_common
| FormalizationArtifact of artifact_common * formalization_payload
| FormalModelArtifact of artifact_common * formal_model_payload
| VerificationGoalArtifact of artifact_common * verification_goal_payload
| VerificationResultArtifact of artifact_common * verification_result_payload
| StateSpaceAnalysisResultArtifact of artifact_common * state_space_analysis_result_payload
| ConformanceResultArtifact of artifact_common * conformance_result_payload
| CoSimulationResultArtifact of artifact_common * cosimulation_result_payload
| GeneratedTestsArtifact of artifact_common * generated_tests_payload
| CommandResultArtifact of artifact_common * command_result_payload
| DiffArtifact of artifact_common
| UserApprovalArtifact of artifact_common
| CommitArtifact of artifact_common
| ReproductionBundleArtifact of artifact_common * reproduction_bundle_payload
7.2 Why strict artifacts
This form is preferred because it ensures:
- payload presence is aligned with artifact kind
- impossible combinations are unrepresentable
- internal reasoning and IML modeling remain precise
7.3 Artifact revisioning
Artifacts are immutable once produced.
To represent evolution:
revisionidentifies the version numberderived_fromidentifies immediate inputssupersedesidentifies a replaced predecessor
8. Actions
Actions are the ordered steps taken by the agent.
8.1 Canonical action model
The canonical action model is also strictly typed.
Common action fields
type action_common =
{ id : int
; label : string
; rationale : string
; detail : string option
; inputs : string list
; outputs : string list
; evidence : evidence list
; observations : observation list
; execution : execution_metadata option
; reproducibility : action_reproducibility option
; meta_action_id : string option (* enclosing meta-action, §8.4 *)
}
Activity actions
type activity_action_type =
| ReadFile
| SearchCode
| SearchWeb
| AnalyzeCode
| ExploreDirectory
| ReadDocumentation
| EditFile
| CreateFile
| DeleteFile
| RenameFile
| RunCommand
| RunTests
| TypeCheck
| Lint
| ManualVerification
| GitStatus
| GitDiff
| GitCommit
| AskUser
| ReportProgress
| Explain
| FormulatePlan
| DecomposeTask
| EstimateImpact
Gateway actions
type gateway_action_type =
| ExclusiveDecision
| ParallelSplit
| EventBasedDecision
| LoopGateway
type decision_option =
{ label : string
; chosen : bool
; rejected_because : string option
; next_action_id : int option
}
type gateway_payload =
{ decision_basis : string option
; supporting_inputs : string list
; options : decision_option list
}
Reasoning actions
type reasoning_action_type =
| Formalize
| DefineVerificationGoal
| Verify
| StateSpaceAnalysis
| ConformanceCheck
| CoSimulate
| GenerateTests
Governance actions
type governance_action_type =
| EvaluatePolicy
| CreateReviewItem
| AcknowledgeReviewItem
| ResolveReviewItem
| AddComment
| RequestApproval
| Approve
| Reject
| CreateSnapshot
| RecordAudit
| LinkTrace
Strict action type
type action =
| ActivityAction of action_common * activity_action_type * action_payload option
| GatewayAction of action_common * gateway_action_type * gateway_payload
| ReasoningAction of action_common * reasoning_action_type * action_payload option
| GovernanceAction of action_common * governance_action_type * action_payload option
8.2 Open request/result fields
Some actions carry implementation-specific request/result structures.
To preserve a strict semantic model, these should remain abstract at the canonical layer:
type action_payload
Implementations may refine action_payload further, or keep it open if needed.
8.3 Execution metadata
type determinism =
| Deterministic
| Mixed
| Nondeterministic
type execution_metadata =
{ tool : string option
; version : string option
; method_ : string option
; determinism : determinism option
; duration_ms : int option
; cost : float option
}
A reasoning action without at least tool should be considered underspecified.
8.4 Meta-actions
The actions list is the trace’s atomic, ground-truth record — one entry per tool call. But work has structure: a session pursues a goal, broken into steps, each carried out by several tool calls. A meta-action captures that structure — a unit of intent that groups the atomic actions which carried it out — so a trace can be read, reviewed, and graded at the level of what was being attempted rather than only what tool ran.
A meta-action is an interpretive overlay, not a replacement. The atomic actions remain the evidence — the unit that reproduce, lineage, and data_flow_integrity operate over; meta-actions are a producer’s claim about how those actions group into intent. This mirrors the positive/negative-space split of §13: the atomic layer is what happened; the meta layer is the structure asserted over it.
Canonical model
type meta_action_status =
| MetaCompleted (* the intent was achieved *)
| MetaPartial (* attempted, not fully achieved *)
| MetaAbandoned (* started, then dropped or superseded *)
type meta_action_source =
| PlanDeclared (* from the agent's own plan / todo list — highest fidelity *)
| TurnSegmented (* inferred from directive (turn) boundaries *)
| IntentInferred (* inferred from a contiguous run of shared intent / rationale *)
type meta_action =
{ id : string
; title : string (* the unit of intent, in plain language *)
; intent : string option (* why — the goal of this step *)
; action_ids : int list (* member atomic actions, in order *)
; outcome : string option (* what resulted *)
; status : meta_action_status option
; source : meta_action_source option (* how the grouping was determined *)
; parent_id : string option (* enclosing meta-action, for multi-level zoom *)
; produced_artifact_ids : string list (* artifacts this step produced *)
; residual_ids : string list (* gaps declared at this level (§13) *)
; tags : string list
}
The trace record (§5) carries meta_actions : meta_action list, canonicalized as the empty list. Each atomic action carries a back-reference meta_action_id : string option (§8.1) to its enclosing meta-action, so navigation works in both directions.
Semantics
- Overlay, ordered, non-overlapping. A meta-action references its members by id; the
actionslist is unchanged. An atomic action belongs to at most one meta-action, and a meta-action’saction_idsare a contiguous, ordered slice of the timeline. Coverage need not be total — incidental actions may remain ungrouped. - Hierarchy via
parent_id. Meta-actions may nest (a goal contains steps contains atomic actions), giving the consumer discrete zoom levels. One level (steps over actions) is the common case;parent_idenables more without changing the model. - Intent + outcome make it reviewable as a unit.
title/intentstate what was being attempted;outcome/statusstate whether it was achieved — so a reviewer can triage a handful of meta-actions (“did each do what it claims?”) before drilling into the atomic actions of the suspect ones. - Gaps and artifacts attach at the level they belong.
residual_idslets an assumption or unverified claim be located at the step (not only at an atomic action viaintroduced_by_action_id, §13.1);produced_artifact_idsmakes the meta-action a coarse node in the lineage DAG (§7).
Source and fidelity
source records how the grouping was determined, because not all groupings are equally trustworthy — it is a fidelity ladder:
PlanDeclared— the boundaries come from the agent’s own declared plan (aFormulatePlan/DecomposeTaskaction, §8.1, or an external todo list). Most authentic: the agent’s stated intent, with its own start/finish markers.TurnSegmented— inferred from directive boundaries (a new instruction begins a new unit).IntentInferred— inferred from a contiguous run of shared rationale/intent, the weakest signal.
A producer should use the highest-fidelity signal available and record which via source, so a consumer knows whether the structure is declared (the agent said so) or inferred (tooling guessed). As with the residual surface, a declared grouping carries more weight across a trust boundary than an inferred one.
Relationship to existing constructs
FormulatePlan/DecomposeTaskactivity actions (§8.1) record the act of planning; a meta-action records the resulting plan unit spanning the actions that executed it. The planning action is typically the producer of aPlanDeclaredmeta-action.- Gateway actions (§8.1) — decisions, splits, loops — sit inside the meta-action whose intent they served, at the point they occurred.
metrics(§5.1) may summarize ameta_action_count;decision_pointsis unchanged — it counts gateways, which meta-actions contain, not replace.
Interchange projection
"meta_actions": [
{
"id": "m2",
"title": "Make the embedded-trace viewer robust to any content",
"intent": "Traces of HTML/JS broke the script parse; embed so no content can corrupt it",
"action_ids": [180, 181, 182, 183],
"outcome": "application/json block + unicode-escaped '<'; renders 389 actions cleanly",
"status": "completed",
"source": "plan_declared",
"produced_artifact_ids": [],
"residual_ids": ["r3"],
"tags": ["viewer"]
}
]
An action that belongs to it back-references it:
{ "id": 181, "label": "Switch the embed to an application/json block", "meta_action_id": "m2" }
9. Evidence and Observations
9.1 Evidence
type evidence_type =
| FileRef
| UrlRef
| CommandOutput
| SearchResult
type evidence =
{ typ : evidence_type
; ref_ : string
; exit_code : int option
}
9.2 Observation
type confidence =
| High
| Medium
| Low
type observation =
{ observation_id : string option
; derived_from : string list
; statement : string
; confidence : confidence option
}
10. Artifact Payload Types
The following structured payloads are canonical.
10.1 Formalization
type formalization_status =
| Transparent
| Opaque
| Failed
type formalization_payload =
{ status : formalization_status
; src_lang : string
; src_code : string
; formal_code : string
; model_language : string option
; symbols : string list
}
10.2 Formal model
type formal_model_payload =
{ model_language : string
; formal_code : string
; symbols : string list
; scope : string option
}
10.3 Verification goals
type verification_goal_kind =
| VerifyGoal
| Instance
| Theorem
| Lemma
| Axiom
type property_status =
| Pending
| Proved
| Refuted
| UnknownProperty
type property_item =
{ name : string
; status : property_status
; src : string
; note : string option
}
type verification_goal_payload =
{ goal_id : int
; goal_revision : int option
; kind : verification_goal_kind
; description : string
; src : string
; target_artifact_id : string
; target_symbol : string option
; properties : property_item list
}
10.4 Verification results
type verification_result_status =
| VrProved
| VrRefuted
| VrSat
| VrUnknown
type sat_model_type =
| InstanceModel
| CounterexampleModel
type sat_model =
{ m_type : sat_model_type
; src : string
}
type verification_result_variant =
| ProvedResult of
{ proof_pp : string
; properties : property_item list
}
| RefutedResult of
{ counterexample : string
}
| SatResult of
{ model : sat_model
}
| UnknownResult of
{ note : string option
}
type verification_result_payload =
{ goal_id : int
; goal_artifact_id : string
; status : verification_result_status
; engine : string option
; completed_at : string option
; result : verification_result_variant
}
10.5 State-space analysis
type witness_model
type region =
{ constraints : string list
; invariant : string option
; model : witness_model option
; model_eval : string option
}
type state_space_analysis_result_payload =
{ target_artifact_id : string
; target_symbol : string option
; analysis_kind : string
; analysis_revision : int option
; description : string option
; complete : bool option
; regions : region list
; coverage_summary : string option
; notes : string option
}
StateSpaceAnalysisResult is generic.
ImandraX region decomposition is one valid instantiation via:
analysis_kind = "region_decomposition"execution.tool = "imandrax"execution.method = "region_decomposition"
10.6 Conformance
type conformance_status =
| ConformancePassed
| ConformanceFailed
| ConformancePartial
| ConformanceUnknown
type conformance_result_payload =
{ reference_artifact_id : string
; target_artifact_id : string
; status : conformance_status
; engine : string option
; findings : string list
; note : string option
}
10.7 Co-simulation
type divergence_point
type cosimulation_status =
| Matched
| Mismatched
| Partial
| ErrorStatus
type cosimulation_result_payload =
{ target_artifact_id : string
; input_artifact_ids : string list
; status : cosimulation_status
; engine : string option
; replayed_steps : int option
; divergence_points : divergence_point list
; summary : string option
; observations : string list
}
10.8 Generated tests
type generated_test =
{ name : string
; region_index : int option
; constraints : string list
; inputs : witness_model option
; expected : witness_model option
; code : string
}
type generated_tests_payload =
{ function_ : string
; language : string
; source_analysis_artifact_id : string
; tests : generated_test list
}
10.9 Command results
type command_result_payload
This may be refined by implementations.
10.10 Reproduction bundle
type reproduction_bundle_payload =
{ entry_action_ids : int list
; artifact_ids : string list
; environment_ids : string list
; notes : string option
}
11. Reference Artifacts
Reference artifacts are approved models, specifications, interfaces, contracts, or domain references used as governance ground truth.
11.1 Canonical model
type reference_artifact_type =
| RefFormalModel
| RefDocumentation
| RefContractModel
| RefProtocolSpec
| RefOther of string
type reference_vg_kind =
| Conformance
| Invariant
| Refinement
type reference_vg =
{ vg_id : string
; description : string
; kind : reference_vg_kind
; src : string
}
type reference_payload
type reference_artifact =
{ reference_artifact_id : string
; name : string
; description : string option
; domain : string
; version : string option
; source : string option
; artifact_type : reference_artifact_type
; format : string option
; content_ref : string option
; payload : reference_payload option
; verification_goals : reference_vg list
}
12. Reproducibility
12.1 Trace-level reproducibility
type reproducibility_status =
| NotReproducible
| PartiallyReproducible
| Reproducible
type trace_reproducibility =
{ status : reproducibility_status
; entrypoints : string list
; required_artifact_ids : string list
; required_environment_ids : string list
; limitations : string list
; notes : string option
}
12.2 Action-level reproducibility
type reproduction_kind =
| DeterministicReplay
| ToolReexecution
| ProceduralReplay
| ManualReproduction
| NotReproducibleKind
type reproduction_procedure_kind =
| CommandProcedure
| WorkflowProcedure
| ReferenceProcedure
| ManualProcedure
type reproduction_procedure =
{ kind : reproduction_procedure_kind
; command : string option
; working_directory : string option
; arguments : string list
; steps : string list
; reference : string option
}
type expected_output =
{ artifact_ids : string list
; result_summary : string option
}
type action_reproducibility =
{ status : reproducibility_status
; reproduction_kind : reproduction_kind
; input_artifact_ids : string list
; environment_id : string option
; procedure : reproduction_procedure
; expected_output : expected_output option
; limitations : string list
; notes : string option
}
12.3 Execution environments
type execution_environment_kind =
| Toolchain
| Container
| Service
| ExternalSystem
type environment_component =
{ name : string
; version : string option
}
type environment_configuration
type execution_environment =
{ environment_id : string
; kind : execution_environment_kind
; name : string
; components : environment_component list
; configuration : environment_configuration option
; notes : string option
}
Consequential reasoning outcomes should either be reproducible directly or linked to a reproducible downstream validation step.
13. Residual Surface
A trace records what the agent established — its actions, artifacts, proofs, and checks. This is the positive space. For review, and especially for agent-to-agent handoff, the consumer also needs the negative space: what the producing agent did not establish.
The residual surface is the explicit, uniform, queryable record of that negative space — the assumptions relied upon, the claims left unverified, the parts left out of scope, the known limitations, and the questions deferred to review.
A positive claim can be checked against the artifacts that back it. The residual surface is what cannot be taken for granted — it tells a reviewer (human or agent) where to point, instead of forcing them to re-derive the whole trace to discover what is missing.
Declaring the residual surface honestly is what makes a trace trustworthy across a trust boundary: a reviewing agent need not assume the trace is complete, because the trace states its own gaps.
The protocol by which a reviewing agent consumes the residual surface — triage by severity, follow
target, runsuggested_check, and hunt undeclared gaps — is defined in the companion Trace Review Handoff Specification (REVIEW_HANDOFF_v0_1.md).
13.1 Canonical model
type residual_kind =
| Assumption (* a premise relied upon but not established within the trace *)
| Unverified (* an action taken or output produced, but not checked or proved *)
| OutOfScope (* deliberately not addressed in this trace *)
| Limitation (* a known constraint under which the established results hold *)
| OpenQuestion (* a decision deferred to a reviewer or human *)
type residual_severity =
| InfoResidual
| LowResidual
| MediumResidual
| HighResidual
| CriticalResidual
type residual_source =
| AgentDeclared (* self-reported by the producing agent *)
| PolicyDerived (* surfaced by a policy evaluation over the trace *)
| ToolInferred (* inferred by analysis tooling *)
| ReviewerAdded (* added during review *)
type residual_status =
| ResidualOpen (* outstanding *)
| ResidualAcknowledged (* seen and accepted as a known gap by a reviewer *)
| ResidualAddressed (* closed, typically by a successor trace *)
| ResidualWaived (* accepted as permanent / not to be addressed *)
type residual =
{ residual_id : string
; kind : residual_kind
; statement : string (* the gap, in plain language *)
; severity : residual_severity option (* impact if wrong or left unaddressed *)
; target : target_ref option (* where it bites (see §14.1) *)
; related_artifact_ids : string list (* affected or supporting artifacts *)
; rationale : string option (* why assumed / not verified / out of scope *)
; suggested_check : string option (* how a reviewer could close it *)
; source : residual_source option
; status : residual_status option
; introduced_by_action_id : int option (* the action that gave rise to it, if any *)
; tags : string list
}
The trace record (§5) carries residuals : residual list, canonicalized as an empty list when there is no declared negative space.
13.2 Semantics
Kinds partition the negative space by why something is unestablished:
Assumption— the work is correct only if this premise holds, and the premise was not checked (e.g. “the upstream API returns results already sorted”).Unverified— something was done but not validated (e.g. a transition that was implemented but has no verification goal).OutOfScope— a deliberate exclusion (e.g. “idempotency under retries was not addressed”).Limitation— a boundary on the established results (e.g. “invariants hold under single-threaded application only”).OpenQuestion— a genuine decision punted to review (e.g. “should a refund reset the approval count?”).
Severity is impact, not probability. severity records how much it matters if the residual is wrong or left unaddressed, independent of how likely that is, so a reviewer can triage by consequence.
target routes attention. Reusing target_ref (§14.1), a residual points at the action, artifact, policy, or trace region where it bites. A reviewing agent navigates by following targets, not by re-reading everything.
suggested_check makes it actionable. Where possible a residual states how it could be closed — the verification goal to add, the test to write, the question to answer — turning the negative space from a warning into a work-list.
Source and trust. source records who surfaced the residual: AgentDeclared is the producing agent’s honest self-report, PolicyDerived is surfaced mechanically (§13.5), ReviewerAdded accrues during review. A trace with no declared residuals is not assumed complete — the absence of residuals is itself reviewable.
13.3 Lifecycle and relationship to review items
A residual is the producer-side declaration of a gap; a review item (§14.3) is the reviewer-side action taken about it. The two are linked but distinct:
- a residual may be promoted to a review item when a reviewer decides it must be tracked or that it blocks approval;
- a residual is typically addressed not by mutating the trace (traces are immutable) but by a successor trace linked via
Supersedes(§15.1), in which the gap no longer appears or is markedResidualAddressed; ResidualWaivedrecords an explicit, auditable decision to accept a gap permanently.
The residual surface therefore shrinks across a chain as successor traces close gaps — mirroring the trace-immutability / append-only-chain model.
13.4 Relationship to existing constructs
The residual surface consolidates and elevates negative-space signals that otherwise remain implicit and scattered:
reproducibility.limitations(§12) are replay-specific; aLimitationresidual is broader (constraints on the result itself). One may surface the other, but they need not coincide.observation.confidence = Low(§9.2) marks a positive statement held with low confidence; such an observation may be promoted to anAssumptionorUnverifiedresidual when a reviewer needs to act on it.property_status = Pending(§10.3),coverage_summary, andcomplete = false(§10.5) are verification-internal. The residual surface should summarize the resulting coverage gap at the trace level — typically anUnverifiedresidual whosetargetis the uncovered symbol and whoserelated_artifact_idsreference the relevant goal or analysis — so a reviewer need not walk every goal to discover what was left unproved.policy_evaluationsexpress requirements checked; afailedornot_applicableevaluation may derive aPolicyDerivedresidual (§13.5).
The rule of thumb: anything a reviewer would otherwise have to infer about what is missing should be stated explicitly in the residual surface.
13.5 Policy hooks
Because the residual surface is typed and queryable, policies (see the Policy Specification) can govern it directly. Illustrative policies for the review handoff:
- no unaddressed critical gaps at approval —
G(Approve → ¬∃ r ∈ residuals . r.severity = Critical ∧ r.status = ResidualOpen) - commits acknowledge their residual surface —
G(GitCommit → ∀ r ∈ residuals . r.severity ≥ High → r.status ≠ ResidualOpen) - high-stakes gaps must say how they close — every
Unverifiedresidual whosetargetlies on ahigh_stakes_pathmust carry a non-emptysuggested_check - assumptions must be locatable — every
Assumptionresidual must carry atargetor non-emptyrelated_artifact_ids
These let an organization require not that traces be gap-free, but that their gaps be declared, located, and triaged — a far more realistic and reviewable bar.
13.6 Interchange projection
A payments trace declaring its negative space:
"residuals": [
{
"residual_id": "r1",
"kind": "limitation",
"statement": "Amount invariants are proved for single-threaded application of transitions only; under concurrent capture/refund the invariant is not established.",
"severity": "high",
"target": { "target_type": "artifact", "target_id": "a8" },
"related_artifact_ids": ["a10"],
"rationale": "The formal model applies one transition at a time; interleavings are not modeled.",
"suggested_check": "Add a concurrency model (or a DB-level lock) and re-verify the amount invariant under interleaved capture/refund.",
"source": "agent_declared",
"status": "open",
"introduced_by_action_id": 22,
"tags": ["concurrency", "payments"]
},
{
"residual_id": "r2",
"kind": "unverified",
"statement": "Dispute and chargeback transitions were not formalized; only 7 of the documented transitions are covered by verification goals.",
"severity": "medium",
"target": { "target_type": "artifact", "target_id": "a17" },
"related_artifact_ids": ["a9"],
"suggested_check": "Add verification goals for the dispute and chargeback transitions.",
"source": "agent_declared",
"status": "open",
"tags": ["coverage"]
},
{
"residual_id": "r3",
"kind": "open_question",
"statement": "Should a refund reset approval_count for a subsequent re-capture? The model currently leaves prior approvals intact.",
"severity": "low",
"target": { "target_type": "artifact", "target_id": "a8" },
"source": "agent_declared",
"status": "open",
"tags": ["product-decision"]
}
]
Following §16.1, the discriminators kind, severity, source, and status serialize as lowercase snake_case strings, and target reuses the target_ref projection.
14. Comments and Review Items
14.1 Common target reference
type target_type =
| TraceTarget
| ActionTarget
| ArtifactTarget
| PolicyTarget
| PolicyEvaluationTarget
| ReferenceArtifactTarget
type target_ref =
{ target_type : target_type
; target_id : string option
}
14.2 Comments
type comment_status =
| Open
| Resolved
type comment =
{ comment_id : string
; author : string
; created_at : string
; body : string
; target : target_ref
; thread_parent_id : string option
; status : comment_status option
; resolved_at : string option
; tags : string list
}
14.3 Review items
type review_item_status =
| ReviewOpen
| Acknowledged
| ReviewResolved
| Waived
type review_item =
{ review_item_id : string
; author : string
; created_at : string
; title : string
; body : string option
; target : target_ref
; assignee : string option
; status : review_item_status
; blocking : bool option
; acknowledged_at : string option
; acknowledged_by : string option
; resolved_at : string option
; resolved_by : string option
; resolution_note : string option
; tags : string list
}
15. Trace Links and Lineage
15.1 Trace links
type trace_relationship =
| Supersedes
| Reruns
| DerivedFrom
| SameTask
| SamePr
| PolicyRecheckOf
| ConformanceRecheckOf
| ForkedFrom
| RelatedTo
type trace_link =
{ link_id : string
; from_trace_id : string
; to_trace_id : string
; relationship : trace_relationship
; created_at : string option
; created_by : string option
; note : string option
}
15.2 Trace lineage summary
type chain_status =
| Active
| Superseded
| Archived
type trace_lineage =
{ parent_trace_id : string option
; root_trace_id : string option
; chain_position : int option
; chain_status : chain_status option
; latest_descendant_trace_id : string option
}
Traces are immutable; iteration is represented by linked successor traces.
16. Interchange Projection Notes
The canonical model above is authoritative.
For interchange, a JSON/Pydantic projection may be derived as follows.
16.1 General rule
Each algebraic variant must be serialized using an explicit discriminator.
Examples:
- artifacts use
artifact_type - actions use
categoryandtype - result variants use explicit tags such as
proved,refuted,sat,unknown
16.2 Artifact projection
A strict artifact constructor such as:
FormalModelArtifact (common, payload)
may be projected to JSON as:
{
"artifact_id": "...",
"artifact_type": "FormalModel",
"artifact_role": "formal_model",
"name": "...",
"format": "iml",
"revision": 1,
"producer_action_id": 2,
"derived_from": ["a1"],
"supersedes": null,
"content_ref": null,
"summary": "...",
"payload": {
"model_language": "iml",
"formal_code": "...",
"symbols": ["..."],
"scope": null
},
"metadata": null
}
16.3 Action projection
A strict action such as:
ReasoningAction (common, Verify, payload)
may be projected to JSON as:
{
"id": 4,
"category": "reasoning",
"type": "Verify",
"label": "...",
"rationale": "...",
"detail": null,
"inputs": ["a3"],
"outputs": ["a4"],
"request": { "...": "..." },
"result": { "...": "..." },
"result_summary": "...",
"evidence": [],
"observations": [],
"execution": {
"tool": "imandrax",
"version": "2026.04",
"method": "model_check",
"determinism": "deterministic",
"duration_ms": 1840,
"cost": null
},
"reproducibility": null
}
16.4 Pydantic generation notes
Reference Pydantic models should be generated as discriminated unions.
Recommended approach:
- one
BaseModelper strict constructor family - discriminator fields such as
artifact_type,category, andtype - decoding must reconstruct the strict canonical model
- invalid discriminator/payload combinations must be rejected during deserialization
16.5 Serialization boundary rule
JSON and Pydantic are interchange formats. The strict typed model remains the semantic source of truth.
17. Recommended Implementation Strategy
17.1 Internal model
Use the strict canonical types from this specification in:
- OCaml
- IML
- reasoning-oriented internal tooling
17.2 Serialization layer
Generate:
- JSON schema
- Pydantic models
- encoders/decoders
from the strict model, not the reverse.
17.3 Validation strategy
Validation should occur at two levels:
-
wire validation
ensure JSON/Pydantic payloads are structurally well-formed -
semantic decoding
ensure the payload can inhabit the canonical strict model