Attestation Mode Rollout
Audience: operators turning on the runtime attestation gate (#725 PR6) on a fleet
that has been running with AUTONOMY_ATTESTATION_MODE=off. The gate denies tool
execution when a node’s enrollment or rollout state is not coherent with the bundle’s
declared provenance. Flipping straight to enforce without soak time means any
operator misalignment surfaces as production 403s. This runbook is the safe
sequence.
Prerequisites
- All nodes in the target fleet have been enrolled via autonomy node enroll --node-id <id> --enrollment-ref <ref>.
- Bundle authors have started declaring bundle.Manifest.Provenance.EnrollmentRef on signed bundles (schema v1.2+, see Security Model — Attestation join key).
- Each runtime has AUTONOMY_NODE_ID (or identity.node_id in config) set to the same string used at enrollment.
- The operator has access to the WAL via autonomy wal inspect and to the orchestrator read APIs via autonomy attestation status.
Procedure
1. Verify enrollment substrate fleet-wide
For each node, confirm the enrollment row matches the bundle’s declared binding:
autonomy attestation status --node-id <id>
Expected output: ENROLLMENT_REF populated, STATUS=active, and
ROLLOUT_STATE either <none> or one of pending / acknowledged
/ active. An ENROLLMENT_REF of <none> means the node was missed;
re-enroll it before proceeding.
For CI / scripted verification, drive autonomy attestation eval
against every node + candidate bundle pair:
autonomy bundle pull <ref> /tmp/bundle.tar
autonomy attestation eval --bundle /tmp/bundle.tar --node-id <id>
Non-zero exit on deny means the fleet is not ready for enforce.
Fix the misalignment before flipping the mode.
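The scripted check above can be wrapped in a loop over the fleet. The following is a minimal sketch, not a supported tool: check_fleet and the node inventory file (one node ID per line) are hypothetical, and it relies only on the documented behaviour that autonomy attestation eval exits non-zero on deny. Any other non-zero exit (for example a missing bundle) is also counted as a failure.

```shell
# Sketch: evaluate one candidate bundle against every node in an inventory file.
# check_fleet and the inventory file layout are assumptions for illustration.
check_fleet() {
    bundle=$1
    nodes_file=$2
    failed=0
    while IFS= read -r node_id; do
        # Skip blank lines in the inventory.
        [ -n "$node_id" ] || continue
        # </dev/null keeps the CLI from consuming the rest of the inventory.
        if autonomy attestation eval --bundle "$bundle" --node-id "$node_id" \
            >/dev/null 2>&1 </dev/null; then
            echo "ok   $node_id"
        else
            echo "fail $node_id"
            failed=$((failed + 1))
        fi
    done < "$nodes_file"
    echo "nodes failing eval: $failed"
    # Succeed only when every node passed.
    [ "$failed" -eq 0 ]
}
```

Calling check_fleet /tmp/bundle.tar nodes.txt after autonomy bundle pull returns non-zero if any node fails, which makes it usable as a CI gate before the mode is changed.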
2. Flip to advisory
On every runtime:
AUTONOMY_ATTESTATION_MODE=advisory <restart runtime process>
Advisory mode runs the gate but never returns 403. Every would-be
deny is written to the WAL as a second autonomy.decision frame
under the same audit_id as the policy-layer allow.
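Because the mode is set through a plain environment variable, a typo or a skipped step is easy to make. One possible guard, run by whatever wrapper restarts the runtime, is sketched below. check_mode_transition is a hypothetical helper, not part of the autonomy CLI, and it assumes the only valid values are the three named in this runbook (off, advisory, enforce).

```shell
# Sketch of a pre-restart guard for AUTONOMY_ATTESTATION_MODE.
# Both helpers are hypothetical; they are not shipped with the autonomy CLI.
is_known_mode() {
    case "$1" in
        off|advisory|enforce) return 0 ;;
        *) return 1 ;;
    esac
}

check_mode_transition() {
    current=$1
    target=$2
    if ! is_known_mode "$current" || ! is_known_mode "$target"; then
        echo "unknown mode: '$current' -> '$target'" >&2
        return 2
    fi
    # Jumping from off straight to enforce skips the advisory soak.
    if [ "$current" = "off" ] && [ "$target" = "enforce" ]; then
        echo "refusing off -> enforce; go through advisory first" >&2
        return 1
    fi
    return 0
}
```

For example, check_mode_transition off advisory succeeds, while check_mode_transition off enforce returns 1 and leaves the decision to override with the operator.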
3. Soak
Let traffic run for at least one full directive cycle (typically 24h). Scrape advisory denies from the WAL:
autonomy wal inspect --kind autonomy.decision \
  | jq 'select((.attrs.reason // "") | startswith("attestation:"))'
Expected results, in order of severity:
| Reason prefix | Action |
|---|---|
| | Investigate why the node was revoked; re-enroll if intentional |
| | The bundle’s binding doesn’t match the enrollment row: verify the operator typed the same string at both ends, re-enroll if needed |
| | Node is in a rollback or failed state; resolve the rollout incident before flipping enforce |
Do not flip to enforce while any advisory denies are still appearing, unless you intentionally want those denies to start returning 403.
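To turn the soak result into a go/no-go signal, the advisory denies can be counted per sub-check. The sketch below is illustrative only: summarize_attestation_denies is a hypothetical helper, and it assumes each attestation reason has the form "attestation: <sub-check>" optionally followed by more text, with the sub-check as a single token.

```shell
# Sketch: count advisory denies per sub-check; return non-zero if any exist.
# Reads one reason string per line on stdin. The reason layout is an assumption.
summarize_attestation_denies() {
    summary=$(awk '
        /^attestation:/ {
            s = $0
            sub(/^attestation:[ \t]*/, "", s)   # drop the prefix
            split(s, parts, /[ \t]+/)           # first token is the sub-check
            count[parts[1]]++
        }
        END { for (name in count) printf "%d %s\n", count[name], name }
    ' | sort -rn)
    if [ -n "$summary" ]; then
        printf '%s\n' "$summary"
        return 1
    fi
    echo "no advisory attestation denies"
    return 0
}

# Example feed, assuming wal inspect emits JSON frames as the jq filter implies:
#   autonomy wal inspect --kind autonomy.decision \
#     | jq -r '.attrs.reason // empty' | summarize_attestation_denies
```

A non-zero return is not automatically a blocker: denies on nodes that were revoked on purpose are expected, so the per-sub-check counts still need a human read before the flip.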
4. Flip to enforce
When the advisory denies are zero (or only on nodes you’ve intentionally revoked), flip:
AUTONOMY_ATTESTATION_MODE=enforce <restart runtime process>
The runtime now returns 403 on any sub-check failure. The wire
response carries the canonical reason prefix
(attestation: <sub-check>) so client code can branch on it.
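Client-side branching on the reason prefix might look like the sketch below. classify_denial is a hypothetical helper, and it assumes the client has already extracted the HTTP status and the reason text as plain strings; how the reason is carried in the response body is not specified here.

```shell
# Sketch: classify a response by HTTP status and reason string.
# classify_denial is hypothetical; the reason is assumed to start with
# "attestation: <sub-check>" when the attestation gate denied the call.
classify_denial() {
    status=$1
    reason=$2
    if [ "$status" != "403" ]; then
        echo "not-a-403"
        return 0
    fi
    case "$reason" in
        attestation:*)
            # Strip the prefix and leading spaces, keep the first token.
            sub_check=${reason#attestation:}
            sub_check=${sub_check#"${sub_check%%[! ]*}"}
            sub_check=${sub_check%% *}
            echo "attestation-deny:$sub_check"
            ;;
        *)
            # A 403 from some other layer, such as policy.
            echo "other-deny"
            ;;
    esac
}
```

Separating attestation denies from other 403s lets a client surface "node not attested" to operators instead of retrying a call that cannot succeed until enrollment or rollout state is fixed.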
5. Verify the layered WAL
Pick a known-good audit_id (any tool call that succeeded post-enforce)
and confirm the dual-emit shape:
autonomy audit get <audit_id>
Expected: outcome=allow, reason from the policy layer. No
attestation frame means the gate didn’t object — the happy path.
For a deny path, force a known mismatch (revoke a test node, attempt a request) and confirm:
autonomy audit get <audit_id>
Expected: outcome=deny, reason starts with attestation:.
Recovery
If enforce surfaces unexpected denies in production:
- Immediate: flip back to advisory and restart. Traffic resumes; the WAL still records the would-be denies for investigation.
- Diagnose: run autonomy attestation status per affected node; confirm enrollment + rollout state match expectations.
- Re-test: re-run autonomy attestation eval --bundle <new> against each node before flipping enforce again.