
Agentic AI Runtime Governance — ponens Policy Pack

This pack maps the FIX AI Working Group’s proposed runtime-governance scheme onto computable ponens policies. It turns the proposal’s traffic-light (Green / Amber / Red) testing scheme into a set of formulas that ponens trace check evaluates deterministically over an agent’s execution trace — which is exactly the property the proposal requires of any binding control.

Sources

Why this maps onto ponens

The proposal states the design requirement directly (§10.3):

“a governance decision that cannot be expressed as a deterministic function over the governed state and measurable signals, executing in bounded time and independently of the language model, is advisory guidance, not binding enforcement.”

A ponens policy is a deterministic function over a trace that returns a verdict. So the correspondence is structural, not analogical:

| FIX proposal | ponens |
| --- | --- |
| Agent execution record (identity, intent, tool calls, approvals, telemetry) | the trace |
| Traffic-light condition (per domain) | a policy (temporal / structural formula) |
| GREEN / AMBER / RED | verdict pass / warning-fail / error-fail |
| GovernanceState field | the aggregate of the pack over a trace |
| Decision priority ordering (§6.3) | severity + exit-code aggregation |
| Four governance tiers (Assistive → Critical Autonomous) | pack tier profiles |
| Szpruch capability failure modes (C1–C4) | individual safety policies |

ponens trace check already produces the aggregation: PASS = Green, WARN (a warning-severity fail) = Amber, FAIL (an error-severity fail) = Red, and a non-zero exit code = “this trace is not Green” — i.e. the GovernanceState the proposal wants carried in a new FIX field.
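As a minimal sketch of what "a deterministic function over a trace that returns a verdict" means, the Python below evaluates two of the pack's G(action → …) formulas over a toy trace. It is not the ponens implementation: the trace shape (a list of dicts with a `kind` field and boolean facts), the helper names `globally` and `check`, and the reading that every trace step counts as an `action` are all assumptions made for illustration.

```python
from enum import Enum
from typing import Callable

class Verdict(Enum):
    PASS = "Green"
    WARN = "Amber"   # a warning-severity fail
    FAIL = "Red"     # an error-severity fail

# Hypothetical trace shape: one dict of governance facts per step.
Trace = list[dict]

def globally(trace: Trace, pred: Callable[[dict], bool]) -> bool:
    """G(phi): phi holds at every position of the trace."""
    return all(pred(step) for step in trace)

def check(trace: Trace, pred: Callable[[dict], bool], severity: str) -> Verdict:
    """Pure function of the trace: same trace in, same verdict out."""
    if globally(trace, pred):
        return Verdict.PASS
    return Verdict.FAIL if severity == "error" else Verdict.WARN

# G(action -> agent_id_resolved AND kya_valid); every step is treated as an action.
def agent_identity_resolved(step: dict) -> bool:
    return bool(step.get("agent_id_resolved") and step.get("kya_valid"))

# G(action -> NOT credential_expiring)
def credential_not_expiring(step: dict) -> bool:
    return not step.get("credential_expiring", False)

trace = [
    {"kind": "ToolCall", "agent_id_resolved": True, "kya_valid": True},
    {"kind": "Release", "agent_id_resolved": True, "kya_valid": True,
     "credential_expiring": True},
]

print(check(trace, agent_identity_resolved, "error").value)    # Green
print(check(trace, credential_not_expiring, "warning").value)  # Amber
```

Nothing in `check` consults a model or any state outside the trace, which is the property the §10.3 requirement asks for.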

Trace model

Governance facts appear in the trace as:

Worked traces: examples/agentic_governance/governed.json (all 21 Green) and violating.json (8 Red + 2 Amber). Run ponens trace check <file>.

The pack

error severity ⇒ Red (halt / containment); warning severity ⇒ Amber (flag / refer). Bounds (Lmax, tool budget) are shown at illustrative values and are set per deployment / DCE.

1. Identity & Authorisation (security)

| Policy | Formula | RAG | Tier |
| --- | --- | --- | --- |
| agent_identity_resolved | G(action → agent_id_resolved ∧ kya_valid) | R | 1–4 |
| legal_entity_vlei_present | G(action → vlei_present) | R | 1–4 |
| dce_current_for_consequential | G(ToolCall ∨ Release ∨ Deploy → dce_current) | R | 2–4 |
| credential_not_expiring | G(action → ¬credential_expiring) | A | 1–4 |

2. Intent & Constraint (conformance)

| Policy | Formula | RAG | Tier |
| --- | --- | --- | --- |
| execution_linked_to_intent | G(ToolCall ∨ Release ∨ Deploy → intent_resolved) | R | 1–4 |
| within_constraint_scope | G(action → within_constraint_scope) | R | 2–4 |
| policy_reference_current | G(action → policy_current) | R | 2–4 |

3. Capability & DCE (safety)

| Policy | Formula | RAG | Tier |
| --- | --- | --- | --- |
| tool_calls_allowlisted | G(ToolCall → in_allowlist) | R | 2–4 |
| consequential_action_approved | G(Release ∨ Deploy → P(UserApproval ∧ authenticated)) | R | 2–4 |
| dual_approval_critical | G(Release → P(UserApproval ∧ approver_1) ∧ P(UserApproval ∧ approver_2)) | R | 4 |
| default_deny_confirmed | G(ToolCall → P(default_deny_confirmed)) | R | 4 |
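The approval policies combine G with the past operator P. The sketch below shows one plausible reading of consequential_action_approved, G(Release ∨ Deploy → P(UserApproval ∧ authenticated)), in Python. It assumes P(φ) means "φ held at the current or some earlier trace position"; the exact semantics ponens gives P, the dict-based trace shape, and the helper name `once_before` are assumptions, not taken from the pack.

```python
def once_before(trace, i, pred):
    """P(phi) at position i: phi held at some position j <= i (assumed reading)."""
    return any(pred(trace[j]) for j in range(i + 1))

def consequential_action_approved(trace):
    """G(Release or Deploy -> P(UserApproval and authenticated))."""
    def authenticated_approval(step):
        return step.get("kind") == "UserApproval" and bool(step.get("authenticated"))

    return all(
        once_before(trace, i, authenticated_approval)
        for i, step in enumerate(trace)
        if step.get("kind") in ("Release", "Deploy")
    )

# Approval precedes the release: the policy holds.
governed = [
    {"kind": "ToolCall"},
    {"kind": "UserApproval", "authenticated": True},
    {"kind": "Release"},
]

# Approval arrives only after the release: the policy fails.
violating = [
    {"kind": "Release"},
    {"kind": "UserApproval", "authenticated": True},
]

print(consequential_action_approved(governed))   # True
print(consequential_action_approved(violating))  # False
```

The second trace illustrates why the formula uses a past operator: an approval recorded after the release does not satisfy it.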

4. Runtime Telemetry & Trajectory (auditability)

| Policy | Formula | RAG | Tier |
| --- | --- | --- | --- |
| telemetry_spans_complete | ∀ s ∈ telemetry . s.status = recorded | R | 2–4 |
| no_guard_violation | G(¬guard_violation) | R | 2–4 |
| trajectory_within_bound | count(action) ≤ 50 | R | 2–4 |
| tool_call_budget | count(ToolCall) ≤ 20 | A | 2–4 |
| no_prohibited_transition | G(¬prohibited_transition) | R | 3–4 |

5. Approval & Release Gating (workflow)

| Policy | Formula | RAG | Tier |
| --- | --- | --- | --- |
| no_release_without_authenticated_approval | G(Release ∨ Deploy → P(UserApproval ∧ authenticated ∧ approval_scope_covers)) | R | 2–4 |
| decision_path_reconstructable | G(Release ∨ Deploy → decision_path_present) | A | 2–4 |

Capability failure modes — Szpruch C1–C3 (reasoning)

| Policy | Capability | Formula | RAG |
| --- | --- | --- | --- |
| retrieved_data_attributed | C1 Retrieval & Attribution | G(Retrieve → provenance_checked ∧ recency_checked) | R |
| numeric_recomputed_deterministically | C2 Deterministic Numeric Computation | G(Compute → deterministic_recompute) | R |
| outputs_policy_constrained | C3 Policy-Constrained Drafting | G(Draft → template_compliant) | R |

(C4 Gated Release & Dispatch is no_release_without_authenticated_approval.)

GovernanceState aggregation

The proposal’s decision priority ordering (§6.3) collapses to severity + first-match over the pack:

Red and Amber are never collapsed: an Amber trace still carries a complete pass set for human resolution; a Red trace names the failed error policies (the GovernanceFlags the FIX field would carry).
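The severity-first aggregation described above can be sketched in Python as follows. This is an illustration only: the `GovernanceState` class, the `aggregate` function, the input shape, and the specific exit-code value 1 are assumptions. The pack states only that a non-zero exit code means "this trace is not Green".

```python
from dataclasses import dataclass, field

@dataclass
class GovernanceState:
    colour: str                                      # "GREEN", "AMBER" or "RED"
    flags: list[str] = field(default_factory=list)   # names of failed policies
    exit_code: int = 0                               # non-zero means "not Green"

def aggregate(results: dict[str, tuple[bool, str]]) -> GovernanceState:
    """results maps policy name -> (passed, severity).

    Error-severity failures are matched first, so Red always outranks Amber.
    """
    red = [name for name, (ok, sev) in results.items() if not ok and sev == "error"]
    amber = [name for name, (ok, sev) in results.items() if not ok and sev == "warning"]
    if red:
        return GovernanceState("RED", red, 1)
    if amber:
        return GovernanceState("AMBER", amber, 1)
    return GovernanceState("GREEN", [], 0)

results = {
    "agent_identity_resolved": (True, "error"),
    "tool_calls_allowlisted": (False, "error"),
    "tool_call_budget": (False, "warning"),
}

state = aggregate(results)
print(state.colour, state.flags)  # RED ['tool_calls_allowlisted']
```

In this sketch a Red state lists only the failed error-severity policies, matching the description of what the GovernanceFlags would carry.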

Tier profiles

Each policy is tagged tier-<range>. A deployment selects the subset for its governance tier (Szpruch four-tier model):

| Tier | Profile |
| --- | --- |
| 1 Assistive | identity + intent + C1/C3 (Amber-tolerant) |
| 2 Bounded Workflow | + capability allowlist, approval gate, telemetry, release gating |
| 3 High-Impact Governed | + prohibited-transition, policy-as-code |
| 4 Critical Autonomous | full set incl. dual_approval_critical, default_deny_confirmed (Red enforced as a hard block) |
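Selecting a tier profile amounts to filtering the pack by each policy's tier range. A minimal Python sketch follows, assuming tags are written like `tier-2-4` or `tier-4`. The exact tag syntax ponens uses is not shown in this pack, and `tiers_of`, `select_profile` and `PACK_TAGS` are hypothetical names.

```python
def tiers_of(tag: str) -> range:
    """'tier-2-4' -> range(2, 5); 'tier-4' -> range(4, 5). Tag syntax is assumed."""
    body = tag.removeprefix("tier-").replace("–", "-")  # tolerate an en dash
    parts = [int(p) for p in body.split("-")]
    return range(parts[0], parts[-1] + 1)

def select_profile(pack: dict[str, str], tier: int) -> list[str]:
    """Return the policies whose tier range includes the deployment's tier."""
    return [name for name, tag in pack.items() if tier in tiers_of(tag)]

# A few policies from the tables above, with tier ranges copied from the Tier column.
PACK_TAGS = {
    "agent_identity_resolved": "tier-1-4",
    "tool_calls_allowlisted": "tier-2-4",
    "no_prohibited_transition": "tier-3-4",
    "dual_approval_critical": "tier-4",
}

print(select_profile(PACK_TAGS, 2))
# ['agent_identity_resolved', 'tool_calls_allowlisted']
```

At tier 4 every policy in this subset is selected, consistent with the "full set" profile.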

Language extension

This pack motivated one new operator in the ponens policy language, the aggregate count(φ) <op> N — the number of trace positions at which φ holds — used by trajectory_within_bound (Lmax guard) and tool_call_budget. It is implemented in both evaluators (CLI + browser playground) and covered by the cross-evaluator parity harness.
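The semantics of count(φ) <op> N can be sketched in a few lines of Python: count the trace positions where φ holds, then compare against the bound. This is not the ponens evaluator. The set of comparison operators, the `count_check` name, and the dict-based trace shape are assumptions for illustration.

```python
import operator

# Assumed operator set; the pack itself only shows "≤".
OPS = {
    "<=": operator.le,
    "<": operator.lt,
    ">=": operator.ge,
    ">": operator.gt,
    "==": operator.eq,
}

def count_check(trace, pred, op, bound):
    """count(phi) <op> N: number of positions where phi holds, compared to N."""
    n = sum(1 for step in trace if pred(step))
    return OPS[op](n, bound)

def is_tool_call(step):
    return step.get("kind") == "ToolCall"

# 21 tool calls followed by one release: 22 positions in total.
trace = [{"kind": "ToolCall"} for _ in range(21)] + [{"kind": "Release"}]

print(count_check(trace, is_tool_call, "<=", 20))    # False: tool_call_budget exceeded
print(count_check(trace, lambda s: True, "<=", 50))  # True: trajectory_within_bound holds
```

Because the count is taken over the whole trace, the check does not depend on where in the trace the matching positions occur.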

Out of scope (proposal Gap 5)

Per-trace policies cannot characterise population-level / emergent behaviour. The proposal itself flags these as Gap 5 / future work, and they are deliberately excluded here: