VAL 10 — Rollback Reliability Validation

Purpose

This plan validates that the autonomy rollback command surface (preview + execute) is reliable across the supported rollout strategies, surfaces correct operator diagnostics, and emits the expected audit events. It establishes a measurable rollback success rate against the workplan target of ≥99%.


Claims Under Test

| ID | Claim |
| --- | --- |
| VAL10-C1 | rollback preview exits 0 for all four target kinds and produces safety class, trigger conditions, and known limitations |
| VAL10-C2 | rollback execute strategy=retry on real rollout plans succeeds with 100% success rate across a batch of 5 executions |
| VAL10-C3 | rollback execute strategy=rollback on real rollout plans succeeds with 100% success rate across a batch of 5 executions |
| VAL10-C4 | Aggregate rollback success rate across all 10 executions is ≥ 99%; this slice’s own retained rollback.executed success events are captured |


Branch-Specific Rule

| Question | Answer |
| --- | --- |
| Covered by existing lab? | Partially. run_rollback_orchestrator_lab() covers preview and error paths (nonexistent plan, relay_deadletter not-orchestrated). It does not have success-path tests for rollout_plan execute or a structured pass/fail report. |
| Lab to extend | scripts/labs/run_cli_audit_lab.sh, with a new function run_rollback_reliability_val10_lab() |
| Why new function? | The existing rollback lab at port 18091 is torn down mid-lab; its rollback-db has unknown accumulated state. A fresh isolated CP at port 18995 is required for deterministic success-path tests. |
| New runner required? | No. Extending run_cli_audit_lab.sh. |


Scenario Matrix

VAL10-01 — Preview All Targets Exit 0

Action: Run autonomy rollback preview --target <kind> for all four kinds (rollout_plan, rollout_stage, ha_leader_resign, relay_deadletter).

Evidence: val10-preview-rollout_plan.txt, val10-preview-rollout_stage.txt, val10-preview-ha_leader_resign.txt, val10-preview-relay_deadletter.txt

Pass criterion: All 4 commands exit 0. preview_errors=0.

Note: Preview is read-only. No control-plane connection required.


VAL10-02 — Preview rollout_plan JSON Schema

Action: autonomy rollback preview --target rollout_plan --output json.

Evidence: val10-preview-rollout_plan.json, val10-preview-rollout_plan-check.txt

Pass criterion: safety_class=terminal, orchestrated=true, valid_strategies contains both retry and rollback.
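The pass criterion above reduces to three field checks on the preview JSON. A minimal Python sketch of such a check, in the spirit of the python3 -c extraction the harness uses, could look like the following. It assumes the three fields are top-level keys named exactly as in the pass criterion and that valid_strategies is a list of strings; the sample payload is hypothetical, not captured CLI output.

```python
import json


def check_rollout_plan_preview(raw: str) -> dict:
    """Evaluate the VAL10-02 pass criterion against preview JSON text."""
    doc = json.loads(raw)
    # Assumption: valid_strategies is a top-level list of strings.
    strategies = doc.get("valid_strategies") or []
    result = {
        "safety_class": doc.get("safety_class"),
        "orchestrated": doc.get("orchestrated"),
        "has_retry": "retry" in strategies,
        "has_rollback": "rollback" in strategies,
    }
    result["pass"] = (
        result["safety_class"] == "terminal"
        and result["orchestrated"] is True
        and result["has_retry"]
        and result["has_rollback"]
    )
    return result


# Hypothetical payload shaped after the pass criterion (not real CLI output).
sample = json.dumps({
    "safety_class": "terminal",
    "orchestrated": True,
    "valid_strategies": ["retry", "rollback"],
})
print(check_rollout_plan_preview(sample)["pass"])
```

If the real preview output nests these fields or names them differently, only the key lookups need to change.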


VAL10-03 — Preview relay_deadletter JSON Schema

Action: autonomy rollback preview --target relay_deadletter --output json.

Evidence: val10-preview-relay_deadletter.json, val10-preview-relay-check.txt

Pass criterion: orchestrated=false, manual_path contains edgectl.


VAL10-04 — Execute retry: Batch Success Rate

Action: Create 5 plans (val10-retry-1 through val10-retry-5); run autonomy rollback execute --target rollout_plan --strategy retry for each.

Evidence: val10-retry-plans-created.txt, retry/execute-retry-{1..5}.txt, val10-retry-rate.txt

Pass criterion: ok=5, fail=0, success_rate=1.000 (100% ≥ 99%).

Expected output: executed  target=rollout_plan  resource=val10-retry-N outcome=success  previous=published  new=active


VAL10-05 — Execute rollback: Batch Success Rate

Action: Create 5 plans (val10-rollback-1 through val10-rollback-5); run autonomy rollback execute --target rollout_plan --strategy rollback for each.

Evidence: val10-rollback-plans-created.txt, rollback/execute-rollback-{1..5}.txt, val10-rollback-rate.txt

Pass criterion: ok=5, fail=0, success_rate=1.000 (100% ≥ 99%).

Expected output: executed  target=rollout_plan  resource=val10-rollback-N outcome=success  previous=published  new=rolled_back
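Both batch scenarios (VAL10-04 and VAL10-05) compare per-plan output against an expected line of key=value tokens. The Python sketch below shows one way such lines could be tallied into an ok/fail/success_rate summary. It assumes the text output keeps the whitespace-separated key=value shape shown in the expected output; the helper names are hypothetical, and the actual lab may count successes by exit code instead.

```python
def parse_execute_line(line: str) -> dict:
    """Split an execute output line into its key=value fields."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # bare words such as "executed" carry no '=' and are skipped
            fields[key] = value
    return fields


def tally(lines, expected_new: str) -> str:
    """Count lines whose outcome is success and whose new phase matches."""
    ok = fail = 0
    for line in lines:
        fields = parse_execute_line(line)
        if fields.get("outcome") == "success" and fields.get("new") == expected_new:
            ok += 1
        else:
            fail += 1
    total = ok + fail
    rate = ok / total if total else 0.0
    return f"ok={ok} fail={fail} success_rate={rate:.3f}"


# Lines built from the expected-output template above, with N = 1..5.
lines = [
    f"executed  target=rollout_plan  resource=val10-rollback-{n} "
    "outcome=success  previous=published  new=rolled_back"
    for n in range(1, 6)
]
print(tally(lines, "rolled_back"))  # ok=5 fail=0 success_rate=1.000
```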


VAL10-06 — Execute JSON Output Shape

Action: Re-execute retry on val10-retry-1 (already in active phase, so retry is idempotent) with --output json.

Evidence: val10-execute-json.json, val10-execute-json-check.txt

Pass criterion: Response JSON has non-empty Outcome (or outcome), NewState (or new_state), and Kind (or kind) fields.

Note: Field names are Go struct exported names rendered by json.MarshalIndent; the check handles both CamelCase and snake_case variants.
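A check that tolerates both naming variants can try each candidate key in turn and accept the first non-empty value. The following is a minimal Python sketch of that idea, assuming the three fields sit at the top level of the response object; the helper names and the sample payloads are hypothetical.

```python
import json


def first_non_empty(doc: dict, *names: str):
    """Return the first non-empty value found under any of the given keys."""
    for name in names:
        value = doc.get(name)
        if value not in (None, "", [], {}):
            return value
    return None


def check_execute_json(raw: str) -> dict:
    """Evaluate the VAL10-06 pass criterion for either naming variant."""
    doc = json.loads(raw)
    found = {
        "outcome": first_non_empty(doc, "Outcome", "outcome"),
        "new_state": first_non_empty(doc, "NewState", "new_state"),
        "kind": first_non_empty(doc, "Kind", "kind"),
    }
    found["pass"] = all(
        found[key] is not None for key in ("outcome", "new_state", "kind")
    )
    return found


# Hypothetical payloads in each naming style (values are illustrative only).
camel = json.dumps({"Outcome": "success", "NewState": "active", "Kind": "rollout_plan"})
snake = json.dumps({"outcome": "success", "new_state": "active", "kind": "rollout_plan"})
print(check_execute_json(camel)["pass"], check_execute_json(snake)["pass"])
```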


VAL10-07 — Execute Error: Nonexistent Plan

Action: rollback execute --target rollout_plan --strategy retry --resource val10-nonexistent-plan.

Evidence: val10-execute-nonexistent.txt, val10-nonexistent-check.txt

Pass criterion: Command exits non-zero.

Expected message: rollback execute: ...not found (HTTP 404 from CP).


VAL10-08 — Execute Error: relay_deadletter Not Orchestrated

Action: rollback execute --target relay_deadletter --resource seg-1/peer-1.

Evidence: val10-execute-relay-not-orchestrated.txt, val10-relay-not-orchestrated-check.txt

Pass criterion: Command exits non-zero and output contains edgectl instructions.

Expected message: includes edgectl relay deadletter retry|purge instructions.


VAL10-09 — Audit: rollback.preview.requested Events

Action: autonomy audit query --event-type rollback.preview.requested against the retained audit store, scoped to this slice’s actor and start time.

Evidence: val10-audit-preview-events.json, val10-audit-preview-check.txt

Pass criterion: count ≥ 4 for actor val10-preview-op with timestamp >= val10_start_time.
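Scoping by actor and start time keeps events left in the retained store by earlier runs or other operators out of the count. The Python sketch below illustrates that filter. The event field names (event_type, actor, timestamp) and the RFC 3339 timestamp format are assumptions made for illustration; the real audit query output may be shaped differently.

```python
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    """Parse an RFC 3339 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def count_scoped(events, event_type: str, actor: str, start_time: datetime) -> int:
    """Count events of one type by one actor at or after start_time."""
    return sum(
        1
        for event in events
        if event.get("event_type") == event_type
        and event.get("actor") == actor
        and parse_ts(event["timestamp"]) >= start_time
    )


# Hypothetical events: four in scope, one stale, one from another actor.
start = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
events = [
    {"event_type": "rollback.preview.requested", "actor": "val10-preview-op",
     "timestamp": f"2024-01-01T12:00:0{i}Z"}
    for i in range(1, 5)
] + [
    {"event_type": "rollback.preview.requested", "actor": "val10-preview-op",
     "timestamp": "2024-01-01T11:59:59Z"},
    {"event_type": "rollback.preview.requested", "actor": "someone-else",
     "timestamp": "2024-01-01T12:00:05Z"},
]
count = count_scoped(events, "rollback.preview.requested", "val10-preview-op", start)
print(f"count={count} pass={count >= 4}")  # count=4 pass=True
```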


VAL10-10 — Audit: rollback.executed Success Events + Aggregate Rate

Action: autonomy audit query --event-type rollback.executed --outcome success scoped to this slice’s actor and start time; compute aggregate rate from VAL10-04 + VAL10-05 results.

Evidence: val10-audit-execute-events.json, val10-aggregate-rate.txt

Pass criterion:

  • agg_success_rate ≥ 0.990 (workplan target: ≥99%)

  • At least 10 retained rollback.executed success events from this slice’s batch executes


Harness Plan

Tools

| Tool | Purpose |
| --- | --- |
| autonomy rollback preview | Read-only safety profile verification |
| autonomy rollback execute | Dispatches to POST /v1/rollouts/{id}/recover |
| curl (via _val10_create helper) | Create test plans via raw API |
| python3 -c | JSON field extraction from preview/execute JSON output |
| autonomy audit query --output json | Verify actor-scoped, time-scoped audit events from retained store |

Control-Plane Setup

| Resource | Value |
| --- | --- |
| Port | 127.0.0.1:18995 |
| Data dir | $WORK_DIR/val10 (removed and recreated before each run) |
| RBAC | AUTONOMY_RBAC_ENFORCEMENT=0 |
| Operator identity | AUTONOMY_OPERATOR=val10-preview-op for preview, val10-test-op for execute (audit attribution) |

Plan Lifecycle

| Plans | Phase at creation | Strategy | Expected new_phase |
| --- | --- | --- | --- |
| val10-retry-1..5 | published | retry | active |
| val10-rollback-1..5 | published | rollback | rolled_back |

Both strategies operate on published plans:

  • retry: recoverRetry — checks not-terminal, not-paused → calls UpdatePhase(active) + refreshes updated_at

  • rollback: recoverRollback — checks not-terminal → calls RollbackPlan() → terminal phase
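The two precondition paths above can be pictured with a toy Python model. This is not the control-plane code: the set of terminal phases and the representation of "paused" as a phase name are assumptions made for illustration, and the function names only mirror recoverRetry and recoverRollback.

```python
class PreconditionError(Exception):
    """Raised when a plan's phase does not allow the requested strategy."""


# Assumption: rolled_back is terminal; the real terminal set may be larger.
TERMINAL_PHASES = {"rolled_back"}
# Assumption: a paused plan is represented by a phase named "paused".
PAUSED_PHASE = "paused"


def recover_retry(phase: str) -> str:
    """Model of the retry path: not-terminal, not-paused, then active."""
    if phase in TERMINAL_PHASES:
        raise PreconditionError(f"retry not allowed from terminal phase {phase}")
    if phase == PAUSED_PHASE:
        raise PreconditionError("retry not allowed while paused")
    return "active"


def recover_rollback(phase: str) -> str:
    """Model of the rollback path: not-terminal, then rolled_back."""
    if phase in TERMINAL_PHASES:
        raise PreconditionError(f"rollback not allowed from terminal phase {phase}")
    return "rolled_back"


print(recover_retry("published"))     # active
print(recover_retry("active"))        # active (retry on an active plan is a no-op)
print(recover_rollback("published"))  # rolled_back
```

In this model a second rollback on the same plan raises PreconditionError, which is why each batch execution gets its own freshly created plan.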

Success Rate Measurement

retry_total    = retry_ok    + retry_fail
rollback_total = rollback_ok + rollback_fail
retry_rate     = retry_ok    / retry_total
rollback_rate  = rollback_ok / rollback_total
aggregate_rate = (retry_ok + rollback_ok) / (retry_total + rollback_total)
target: aggregate_rate ≥ 0.990

With 10 clean plan creates and no external interference, the expected result is aggregate_rate = 1.000.
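The arithmetic is small enough to sketch directly. The Python snippet below computes the three rates and applies the 0.990 target; the function and key names are illustrative, not taken from the lab script. It also makes one consequence of the sample size visible: with only 10 executions, a single failure yields 0.900, so the ≥99% target is met only when all 10 succeed.

```python
def rate(ok: int, fail: int) -> float:
    """Success rate for one strategy; 0.0 when nothing was executed."""
    total = ok + fail
    return ok / total if total else 0.0


def aggregate(retry_ok: int, retry_fail: int,
              rollback_ok: int, rollback_fail: int,
              target: float = 0.990) -> dict:
    """Compute per-strategy and aggregate rates and compare to the target."""
    agg_ok = retry_ok + rollback_ok
    agg_total = agg_ok + retry_fail + rollback_fail
    agg_rate = agg_ok / agg_total if agg_total else 0.0
    return {
        "retry_rate": f"{rate(retry_ok, retry_fail):.3f}",
        "rollback_rate": f"{rate(rollback_ok, rollback_fail):.3f}",
        "agg_ok": agg_ok,
        "agg_total": agg_total,
        "agg_success_rate": f"{agg_rate:.3f}",
        "pass": agg_rate >= target,
    }


clean = aggregate(5, 0, 5, 0)
one_failure = aggregate(5, 0, 4, 1)
print(clean["agg_success_rate"], clean["pass"])              # 1.000 True
print(one_failure["agg_success_rate"], one_failure["pass"])  # 0.900 False
```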


Known Failure Modes

| Mode | Description | Detectable by |
| --- | --- | --- |
| CP start failure | Port conflict or binary missing | val10-health.txt status=unreachable |
| Plan create failure | RBAC blocking or duplicate ID | val10-retry-plans-created.txt non-201 codes |
| Strategy=retry on terminal | Plan already in terminal phase | execute-retry-N.txt contains precondition error |
| Missing --strategy flag | CLI validation rejects before dispatch | Exit code check; error message in output |
| Audit query empty | Audit store not populated for this slice | val10-audit-preview-check.txt count < 4 or val10-aggregate-rate.txt success_events < 10 |

Out-of-Scope Items

| Item | Reason |
| --- | --- |
| rollout_stage skip_failed | Requires stage_in_progress phase with open stage; not reachable by simple plan creation without running the batch promoter |
| ha_leader_resign via VAL10 | Covered by existing run_rollback_orchestrator_lab() against HA helper server |
| Automatic trigger rollback | TriggerAutomatic path not yet implemented in control-plane |
| 30-day soak (≥100 plans) | Scope of workplan GA gate; not covered by CLI lab |
| PostgreSQL backend rollback | Requires live PG instance |


Evidence Files

| File | Description |
| --- | --- |
| val10-cp.log | CP startup and per-request logs |
| val10-health.txt | CP health check result |
| val10-preview-rollout_plan.txt | Text preview output |
| val10-preview-rollout_stage.txt | Text preview output |
| val10-preview-ha_leader_resign.txt | Text preview output |
| val10-preview-relay_deadletter.txt | Text preview output |
| val10-preview-check.txt | preview_errors=0, pass |
| val10-preview-rollout_plan.json | JSON preview for rollout_plan |
| val10-preview-rollout_plan-check.txt | safety_class, orchestrated, pass |
| val10-preview-relay_deadletter.json | JSON preview for relay_deadletter |
| val10-preview-relay-check.txt | orchestrated=false, manual_path_has_edgectl, pass |
| val10-retry-plans-created.txt | 5 plan create results (HTTP 201) |
| retry/execute-retry-{1..5}.txt | Per-plan retry execute stdout/stderr |
| val10-retry-rate.txt | strategy=retry ok=5 fail=0 rate=1.000 pass=true |
| val10-rollback-plans-created.txt | 5 plan create results (HTTP 201) |
| rollback/execute-rollback-{1..5}.txt | Per-plan rollback execute stdout/stderr |
| val10-rollback-rate.txt | strategy=rollback ok=5 fail=0 rate=1.000 pass=true |
| val10-execute-json.json | JSON output for retry execute (VAL10-06) |
| val10-execute-json-check.txt | outcome, new_state, kind, pass |
| val10-execute-nonexistent.txt | Error output for nonexistent plan |
| val10-nonexistent-check.txt | exit_code non-zero, pass |
| val10-execute-relay-not-orchestrated.txt | Error + edgectl instructions |
| val10-relay-not-orchestrated-check.txt | exit_code non-zero, has_edgectl_instructions=1, pass |
| val10-audit-preview-events.json | rollback.preview.requested events from audit store |
| val10-audit-preview-check.txt | count ≥ 4 scoped by actor + start_time, pass |
| val10-audit-execute-events.json | rollback.executed events from audit store |
| val10-aggregate-rate.txt | agg_ok, agg_total, agg_success_rate, retained success-event count, actor/start_time scope, pass |
| val10-report.txt | Human-readable composite report (10 checks + rate table) |
| val10-report.json | Machine-readable JSON with success_rate object and per-check statuses |


Pass/Fail Criteria

Full pass: All 10 checks report PASS.

Minimum acceptable: VAL10-04, VAL10-05, VAL10-10 pass — success path for both strategies at ≥99% rate.

Key thresholds:

| Check | Threshold |
| --- | --- |
| VAL10-04 (retry rate) | ok=5 of 5, rate=1.000 |
| VAL10-05 (rollback rate) | ok=5 of 5, rate=1.000 |
| VAL10-07 (nonexistent error) | exit code ≠ 0 |
| VAL10-08 (relay error) | exit code ≠ 0 + edgectl in output |
| VAL10-09 (preview audit) | count ≥ 4 |
| VAL10-10 (aggregate rate) | rate ≥ 0.990 + success_events ≥ 10 |
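The two pass levels defined in this section can be derived mechanically from per-check statuses. A minimal Python sketch follows, assuming each check reports the string PASS or FAIL keyed by its scenario ID; the verdict labels full_pass, minimum_acceptable, and fail are hypothetical names chosen for this example.

```python
ALL_CHECKS = tuple(f"VAL10-{n:02d}" for n in range(1, 11))
# The success-path checks named under "Minimum acceptable".
REQUIRED_CHECKS = ("VAL10-04", "VAL10-05", "VAL10-10")


def classify(statuses: dict) -> str:
    """Map per-check PASS/FAIL statuses to an overall verdict."""
    if all(statuses.get(check) == "PASS" for check in ALL_CHECKS):
        return "full_pass"
    if all(statuses.get(check) == "PASS" for check in REQUIRED_CHECKS):
        return "minimum_acceptable"
    return "fail"


all_pass = {check: "PASS" for check in ALL_CHECKS}
audit_gap = dict(all_pass, **{"VAL10-09": "FAIL"})
retry_broken = dict(all_pass, **{"VAL10-04": "FAIL"})

print(classify(all_pass))      # full_pass
print(classify(audit_gap))     # minimum_acceptable
print(classify(retry_broken))  # fail
```

A missing status is treated the same as FAIL, so a check that never ran cannot contribute to a pass.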