VAL29 — AutonomyOps v1 Public-Claim Evidence Matrix

Audience: founders, engineering leads, product managers, and external reviewers making ship/no-ship or claim-level decisions for AutonomyOps v1.

VAL29 is a meta-aggregator, not a test runner. It reads the four proof-report JSON artifacts produced by VAL25–VAL28 and produces a single capability-level evidence matrix with a per-claim readiness assessment. Nothing is soft-pedalled: BETA claims are labelled explicitly and gaps are stated precisely.

1. Scope

VAL29 reads from:

| Source | Contents |
| --- | --- |
| val25/val25-proof-report.json | Fleet rollout proof (VAL07–VAL11) |
| val26/val26-proof-report.json | HA control-plane proof (VAL13–VAL17) |
| val27/val27-proof-report.json | Edge relay proof (VAL19–VAL23, optional VAL24) |
| val28/val28-proof-report.json | Cross-cutting proof (VAL01–VAL06) |

VAL29 does not re-run any tests. It is idempotent and read-only with respect to its inputs; its own outputs are written under val29/ (see section 6).

Before VAL29 may emit DESIGN PARTNER READY, the four proof reports must also fall within a single 7-day evidence campaign window and a disclosure artifact must exist at val29/design-partner-disclosures.json inside the cli-audit-lab evidence directory.
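The 7-day window rule can be illustrated with a small shell function. This is a minimal sketch, not VAL29's actual logic: it assumes each report contributes one timestamp in epoch seconds, that the window is measured from the earliest to the latest timestamp, and that exactly 7 days still counts as inside the window. The names `CAMPAIGN_WINDOW_SECONDS` and `within_campaign_window` are hypothetical.

```shell
# Hypothetical helper: do all given timestamps (epoch seconds) fit in one
# 7-day window? How VAL29 derives timestamps from the reports is not shown.
CAMPAIGN_WINDOW_SECONDS=$((7 * 24 * 60 * 60))

within_campaign_window() (
  min=$1
  max=$1
  for ts in "$@"; do
    # Track the earliest and latest timestamp seen so far.
    [ "$ts" -lt "$min" ] && min=$ts
    [ "$ts" -gt "$max" ] && max=$ts
  done
  # Succeeds when the spread is at most 7 days (inclusive, by assumption).
  [ $((max - min)) -le "$CAMPAIGN_WINDOW_SECONDS" ]
)

# Example: four report timestamps spread over roughly 5.8 days.
if within_campaign_window 1700000000 1700086400 1700172800 1700500000; then
  echo "single campaign window: yes"
else
  echo "single campaign window: no"
fi
```

A report generated more than 7 days before or after the others would make the function fail, which mirrors the requirement that all four reports belong to one campaign.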

What VAL29 covers

The full v1 claim set across four capability groups:

| Group | Claims |
| --- | --- |
| Fleet Rollouts | Latency, throughput, stuck detection, rollback, chaos, soak, PG backend |
| HA Control Plane | Failover, zero data loss, replication lag, backup/restore, split-brain, quorum, soak |
| Edge Relay | Outage, impairment, throughput, queue/overflow, deadletter, bandwidth, soak, multi-peer |
| Cross-Cutting | Cert rotation, trust-chain, RBAC, audit, OTel, support bundle, PG audit, external audit |

What VAL29 does NOT cover

  • OS Reconstruction (hardware-gated, out of scope for current suite)

  • Multi-architecture container validation (Gate E, hardware-gated)

  • Native riscv64 hardware CI (Gate E, hardware-gated)

  • Any item requiring satellite/cellular connectivity hardware

2. Evidence State Definitions

| State | Meaning |
| --- | --- |
| VALIDATED | Claim fully supported by completed VAL runs with all checks passing |
| BETA | Claim supported but with documented limitations; must be disclosed to users |
| NOT_STARTED | Framework / tooling exists; Gate D not yet run (30-day soaks) |
| DEFER | Not validated; additional engineering or hardware required |
| FUTURE_REQUIRED | Required for Public Production Claim; beyond current VAL scope |

3. Recommendation Definitions

| Recommendation | Meaning |
| --- | --- |
| OK_DESIGN_PARTNER | Safe to include in design partner ship as-is |
| BETA_ONLY | May ship to design partners with explicit written disclosure |
| NOT_STARTED | Cannot ship until Gate D passes (30-day soaks) |
| DEFER | Must not claim until gap closed; omit from marketing until then |
| FUTURE_REQUIRED | Must not claim until third-party evidence obtained |
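In most rows of the matrix, the recommendation follows directly from the evidence state. The sketch below captures that default pairing as a lookup. It is illustrative only: the function name `default_recommendation` is hypothetical, and individual rows may override the default (RL-CRAS-01, for example, pairs DEFER with FUTURE_REQUIRED).

```shell
# Hypothetical lookup: default recommendation for a given evidence state.
# This reflects the usual pairing in the matrix, not a rule VAL29 enforces.
default_recommendation() {
  case "$1" in
    VALIDATED)       echo OK_DESIGN_PARTNER ;;
    BETA)            echo BETA_ONLY ;;
    NOT_STARTED)     echo NOT_STARTED ;;
    DEFER)           echo DEFER ;;
    FUTURE_REQUIRED) echo FUTURE_REQUIRED ;;
    *)               echo "unknown state: $1" >&2; return 1 ;;
  esac
}

default_recommendation BETA
```

An unknown state is reported on stderr with a non-zero exit status, so a typo in a state name does not silently map to a shippable recommendation.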

4. Evidence Matrix

Fleet Rollouts

| ID | Claim | Evidence | State | Recommendation |
| --- | --- | --- | --- | --- |
| FR-PERF-01 | Plan creation latency p99 ≤ 500 ms | VAL07 | VALIDATED | OK_DESIGN_PARTNER |
| FR-THRU-01 | N=100 concurrent device rollouts; zero errors | VAL08 | VALIDATED | OK_DESIGN_PARTNER |
| FR-RECV-01 | Stuck rollout detection + recovery (retry/rollback) | VAL09 | VALIDATED | OK_DESIGN_PARTNER |
| FR-RECV-02 | Rollback success rate ≥ 99% (aggregate over 10 plans) | VAL10 | VALIDATED | OK_DESIGN_PARTNER |
| FR-CHOS-01 | Fleet chaos resilience: CP restart, kill cycles, corrupt artifacts | VAL11 | VALIDATED | OK_DESIGN_PARTNER |
| FR-SOAK-01 | 30-day fleet soak: ≥ 100 plans, rollback success rate ≥ 0.990 (Gate D) | VAL12 | NOT_STARTED | NOT_STARTED |
| FR-INFR-01 | PostgreSQL backend: CP runs on PG with full validation | None | DEFER | DEFER |
| FR-PERF-02 | Throughput recalibrated on production-representative hardware | None | DEFER | DEFER |

Key limitations:

  • FR-CHOS-01: SIGTERM only; SIGKILL and iptables chaos were not tested.

  • FR-SOAK-01: VAL12 framework is complete; Gate D requires a 30-day continuous run.

  • FR-INFR-01 / FR-PERF-02: All fleet VALs run against SQLite on a single host. PG backend and production hardware must be validated before a GA claim.

HA Control Plane

| ID | Claim | Evidence | State | Recommendation |
| --- | --- | --- | --- | --- |
| HA-FAIL-01 | Leader failover ≤ 5,000 ms: SIGTERM, SIGKILL, 3× rapid cycles | VAL13 | VALIDATED | OK_DESIGN_PARTNER |
| HA-FAIL-02 | Zero data loss across leader failover | VAL13 | VALIDATED | OK_DESIGN_PARTNER |
| HA-REPL-01 | Replication lag distribution; derived alerting thresholds | VAL14 | BETA | BETA_ONLY |
| HA-BKUP-01 | Backup/restore: timing ≤ 30 s / 60 s, SHA-256 integrity, correctness | VAL15 | VALIDATED | OK_DESIGN_PARTNER |
| HA-SBRC-01 | Split-brain detection (epoch divergence) + manual recovery | VAL16 | VALIDATED | OK_DESIGN_PARTNER |
| HA-QRMO-01 | Quorum loss: detected ≤ 30,000 ms, writes blocked, recovery | VAL17 | VALIDATED | OK_DESIGN_PARTNER |
| HA-FAIL-03 | Streaming-replication promotion (standby → primary, real PG HA) | None | DEFER | DEFER |
| HA-SOAK-01 | 30-day HA soak: ≥ 3 failovers, failover_ms ≤ 10,000, continuity = 1.0 | VAL18 | NOT_STARTED | NOT_STARTED |
| HA-CHOS-01 | Real network partition chaos (iptables / tc) | None | DEFER | DEFER |

Key limitations:

  • HA-REPL-01: Alerting thresholds (healthy/degraded/alert ms) are derived from Docker write_lag measurements. Docker disk I/O is materially faster than cloud VMs, so these thresholds must be recalibrated against production write_lag observations before any alerting deployment. Treat them as informational for design partners.

  • HA-SBRC-01: SQL metadata injection only — no real network partitions. Automatic split-brain recovery is out of scope (manual promote-leader only).

  • HA-QRMO-01: docker stop/start only — not iptables; write-gate verified via status fields (no /v1/rollouts endpoint on this binary).

  • HA-FAIL-03 / HA-CHOS-01: Required for GA claim.

Edge Relay

| ID | Claim | Evidence | State | Recommendation |
| --- | --- | --- | --- | --- |
| RL-IMPW-01 | Outage handling: all segments → DEADLETTER within max_retry exhaustion | VAL19 | VALIDATED | OK_DESIGN_PARTNER |
| RL-IMPW-02 | Bandwidth impairment (1/10 Mbps): delivery confirmed, throughput informational | VAL19 | BETA | BETA_ONLY |
| RL-IMPW-03 | Latency impairment (200/500 ms): delivery confirmed within dial/ack timeouts | VAL19 | VALIDATED | OK_DESIGN_PARTNER |
| RL-THRU-01 | Throughput: 5 tiers T1–T5, zero loss, queue monotonicity | VAL20 | BETA | BETA_ONLY |
| RL-QMGM-01 | Queue drain (200×64 B), LRU eviction accounting, relay-status accuracy | VAL21 | VALIDATED | OK_DESIGN_PARTNER |
| RL-DEAD-01 | Deadletter: retry → delivery (Group R = 1.000), BoltDB retention, purge | VAL22 | VALIDATED | OK_DESIGN_PARTNER |
| RL-BAND-01 | Bandwidth management: unlimited/rate-only/quota-only/hot-reload; S-E unit test | VAL23 | BETA | BETA_ONLY |
| RL-SOAK-01 | 30-day relay soak: rounds ≥ 1,440, clean_delivery_rate ≥ 0.990, loss = 0 | VAL24 | NOT_STARTED / VALIDATED | NOT_STARTED / OK_DESIGN_PARTNER |
| RL-MULT-01 | Multi-peer relay: delivery isolation between ≥ 2 concurrent peers | None | DEFER | DEFER |
| RL-CONN-01 | Contested-connectivity (satellite, cellular, WAN loss > 5%) | None | DEFER | DEFER |
| RL-CRAS-01 | BoltDB crash consistency on unclean power failure | None | DEFER | FUTURE_REQUIRED |

Key limitations:

  • RL-IMPW-02: No hard throughput SLA under bandwidth impairment. Figures are informational from proxy stats (last_conn_bytes / last_conn_ms). SLA-grade throughput targets under impairment have not been defined or measured.

  • RL-THRU-01: segs/sec and bytes/sec figures are from single-host Docker runs only. No production hardware baseline. No hard throughput floor is set — zero-loss correctness is the only validated property.

  • RL-BAND-01: S-E (daily quota reset) validated by injected-clock unit test only. A live 24-hour run has not been performed. This must be resolved (live run or formal exception approval) before the bandwidth management claim can leave beta.

  • RL-SOAK-01: State is driven by the direct soak_val24.gate_d_overall signal from VAL27. Until Gate D passes, the row remains NOT_STARTED.

  • RL-CONN-01: Must be explicitly disclosed as a gap if shipping to design partners who expect satellite/cellular support.

  • RL-CRAS-01: Power-failure crash consistency is required for Public Production Claim.

Cross-Cutting

| ID | Claim | Evidence | State | Recommendation |
| --- | --- | --- | --- | --- |
| XC-CERT-01 | Zero-downtime cert rotation: detection, mTLS continuity, timing ≤ 300 s, audit | VAL01 | VALIDATED | OK_DESIGN_PARTNER |
| XC-CERT-02 | Trust-chain rejection: missing, invalid chain, expired, revoked, wrong server | VAL02 | VALIDATED | OK_DESIGN_PARTNER |
| XC-RBAC-01 | RBAC enforcement: 5 DENY + 5 ALLOW + 3 NOT_GUARDED + 1 AUDIT check | VAL03 | VALIDATED¹ | OK_DESIGN_PARTNER |
| XC-AUDT-01 | Audit completeness: 25 event types, 6 categories, schema, latency ≤ 2,000 ms | VAL04 | VALIDATED | OK_DESIGN_PARTNER |
| XC-OTEL-01 | OTel: Prometheus /metrics, WAL pipeline, OTLP flush, trace ID propagation | VAL05 | VALIDATED | OK_DESIGN_PARTNER |
| XC-BNDL-01 | Support bundle: archive ≤ 30 s, 6 collectors, secrets redacted, degraded mode | VAL06 | VALIDATED | OK_DESIGN_PARTNER |
| XC-AUDT-02 | PG-backed audit store: query performance under load | None | DEFER | DEFER |
| XC-OTEL-02 | OTel pipeline validated against production-grade OTLP collector | None | DEFER | DEFER |
| XC-SCRT-01 | External security audit of cert management and RBAC surfaces | None | FUTURE_REQUIRED | FUTURE_REQUIRED |
| XC-COMP-01 | Compliance audit of audit completeness (SOC 2, etc.) | None | FUTURE_REQUIRED | FUTURE_REQUIRED |

¹ VAL03 checks can be SKIP (not FAIL) when the HA server is unavailable. If any checks were skipped, re-run with HA server available to confirm full 14-check coverage before a GA claim.

Key limitations:

  • XC-CERT-01: CRL is loaded at CP start; runtime cert revocation requires CP restart. Tested with SQLite-backed CP only.

  • XC-OTEL-01: OTLP sink is a local test server (127.0.0.1:14318), not a production collector. Metrics use prometheus/client_golang (not OTel SDK metrics).

  • XC-BNDL-01: Tested with synthetic secrets (known deadbeef salt and val06-secret-pass password). Production secret scanning completeness not independently audited.

  • XC-AUDT-02: The --pg-url audit path was not load-tested; SQLite is the only audited backend. Required before GA.

  • XC-SCRT-01 / XC-COMP-01: Third-party evidence required for Public Production Claim.

5. Readiness Levels

Design Partner Ready

All four proof reports (VAL25/VAL26/VAL27/VAL28) must confirm design_partner: true, the proof reports must fall within one 7-day evidence campaign window, and the following six disclosures (four BETA limitations plus two known gaps, RL-SOAK-01 and RL-CONN-01) must be recorded in val29/design-partner-disclosures.json and made in writing to each design partner:

  1. Replication lag alerting thresholds (HA-REPL-01): Derived from Docker measurements. Must be recalibrated against production write_lag before alerting deployment.

  2. Relay throughput figures (RL-THRU-01): Single-host Docker only. No production hardware calibration. No hard throughput SLA.

  3. Relay bandwidth impairment throughput (RL-IMPW-02): Informational proxy stats only. No SLA-grade target defined.

  4. Relay daily quota reset (RL-BAND-01): Validated by injected-clock unit test only. Live 24-hour run not performed.

  5. Relay 30-day soak (RL-SOAK-01): Not yet completed. Reliability claims for long-running deployments are provisional until Gate D passes.

  6. Contested-connectivity (RL-CONN-01): Not validated. Satellite/cellular connectivity is out of scope for all current relay VALs.

GA Ready

Design Partner Ready PLUS all of:

  1. VAL12 Gate D (fleet 30-day soak): rollback_success_rate ≥ 0.990

  2. VAL18 Gate D (HA 30-day soak): failovers ≥ 3, failover_ms ≤ 10,000, data_continuity_rate = 1.0, ha_uptime_pct ≥ 99.9

  3. VAL24 Gate D (relay 30-day soak): rounds ≥ 1,440, clean_delivery_rate ≥ 0.990, loss = 0

  4. Multi-peer relay validation (at least basic 2-peer delivery + isolation)

  5. VAL20 throughput recalibrated on production-representative hardware

  6. VAL14 alerting thresholds recalibrated against production write_lag

  7. VAL23 S-E daily reset: live 24-hour run OR formally approved exception

  8. VAL03 full 14-check coverage with HA server (no SKIPs)

  9. Streaming-replication promotion failover (HA-FAIL-03)

  10. PG-backed audit store query performance (XC-AUDT-02)
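The relay soak criteria in item 3 can be written as a small threshold check. Note that 1,440 rounds over 30 days works out to 48 rounds per day, or one round every 30 minutes on average. The function below is a minimal sketch under stated assumptions: the name `relay_gate_d_passes` and its positional arguments are hypothetical, and how VAL24 actually computes `gate_d_overall` is not described in this document.

```shell
# Hypothetical check of the VAL24 Gate D thresholds listed above.
# Arguments: $1 = rounds, $2 = clean_delivery_rate, $3 = loss
relay_gate_d_passes() {
  # awk handles the floating-point comparison that plain sh cannot.
  awk -v rounds="$1" -v rate="$2" -v loss="$3" 'BEGIN {
    ok = (rounds + 0 >= 1440) && (rate + 0 >= 0.990) && (loss + 0 == 0)
    exit (ok ? 0 : 1)
  }'
}

if relay_gate_d_passes 1440 0.995 0; then
  echo "relay Gate D thresholds met"
else
  echo "relay Gate D thresholds not met"
fi
```

All three conditions must hold together: a run with a high delivery rate but a single lost segment still fails, because loss = 0 is a hard requirement.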

Public Production Claim

GA Ready PLUS:

  1. BoltDB crash consistency on unclean power failure (RL-CRAS-01)

  2. Multi-peer relay soaks with message isolation proof

  3. Production-grade observability: queue depth alerting, deadletter paging, bandwidth quota depletion notification

  4. Real network partition chaos for HA and relay (HA-CHOS-01)

  5. External security audit (XC-SCRT-01)

  6. Penetration testing of mTLS trust-chain boundaries

  7. Compliance audit of audit event completeness (XC-COMP-01)

  8. Production-hardware throughput benchmarks for fleet rollouts and relay
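The three readiness levels are cumulative: each level requires everything in the level before it. A minimal sketch of that ladder follows. The function name `readiness_level`, the "true"/"false" arguments, and the output labels (including `NOT_READY`) are illustrative assumptions, not the strings VAL29 emits.

```shell
# Hypothetical ladder: each level is reachable only if the previous one holds.
# Arguments are "true"/"false" for whether each level's own criteria are met:
#   $1 = Design Partner Ready, $2 = GA Ready extras, $3 = Public Production extras
readiness_level() {
  if [ "$1" != "true" ]; then
    echo NOT_READY
  elif [ "$2" != "true" ]; then
    echo DESIGN_PARTNER_READY
  elif [ "$3" != "true" ]; then
    echo GA_READY
  else
    echo PUBLIC_PRODUCTION_CLAIM
  fi
}

readiness_level true false true
```

In the example call the third argument is ignored: satisfying the Public Production extras does not help while the GA Ready extras are still open.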

6. Run the Matrix

Prerequisites

Run all four proof report generators first:

# Fleet, HA, and cross-cutting (share a cli-audit-lab evidence dir)
bash scripts/labs/run_fleet_rollout_proof_report_val25.sh \
  evidence/cli-audit-lab-YYYY-MM-DD
bash scripts/labs/run_ha_proof_report_val26.sh \
  evidence/cli-audit-lab-YYYY-MM-DD
bash scripts/labs/run_crosscut_proof_report_val28.sh \
  evidence/cli-audit-lab-YYYY-MM-DD

# Relay (uses auto-discovered standalone evidence dirs)
bash scripts/labs/run_relay_proof_report_val27.sh evidence/

Generate the evidence matrix

Create the disclosure artifact first:

mkdir -p evidence/cli-audit-lab-YYYY-MM-DD/val29
cat > evidence/cli-audit-lab-YYYY-MM-DD/val29/design-partner-disclosures.json <<'EOF'
{
  "written_disclosures": [
    "ha_replication_lag_thresholds_docker_derived",
    "relay_throughput_single_host_only",
    "relay_impairment_throughput_informational_only",
    "relay_daily_quota_reset_unit_test_only",
    "relay_soak_reliability_provisional_until_gate_d",
    "relay_contested_connectivity_not_validated"
  ]
}
EOF
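Before running VAL29, it can help to confirm that the artifact lists all six disclosure identifiers shown above. The function below is a minimal sketch: `check_disclosures` is a hypothetical helper, and it uses a plain text match with grep rather than a real JSON parse, so it does not validate the file's structure.

```shell
# Hypothetical pre-flight check for design-partner-disclosures.json.
# Argument: $1 = path to the disclosure artifact.
check_disclosures() (
  missing=0
  for key in \
    ha_replication_lag_thresholds_docker_derived \
    relay_throughput_single_host_only \
    relay_impairment_throughput_informational_only \
    relay_daily_quota_reset_unit_test_only \
    relay_soak_reliability_provisional_until_gate_d \
    relay_contested_connectivity_not_validated
  do
    # Text match on the quoted identifier; reports every missing entry.
    if ! grep -q "\"$key\"" "$1"; then
      echo "missing disclosure: $key" >&2
      missing=1
    fi
  done
  [ "$missing" -eq 0 ]
)
```

It could be called as `check_disclosures evidence/cli-audit-lab-YYYY-MM-DD/val29/design-partner-disclosures.json`; a non-zero exit status means at least one identifier was not found.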

Then run VAL29:

bash scripts/labs/run_evidence_matrix_val29.sh \
  evidence/cli-audit-lab-YYYY-MM-DD \
  evidence/

Output files

| File | Contents |
| --- | --- |
| stdout | Evidence matrix report |
| val29/val29-evidence-matrix.txt | Same content as stdout |
| val29/val29-evidence-matrix.json | Machine-readable JSON with full matrix |

The JSON artifact contains the full matrix array plus:

  • readiness

  • evidence_campaign

  • design_partner_disclosures
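A quick way to sanity-check the JSON artifact after a run is to look for the three key names listed above. This is a rough sketch only: `has_expected_keys` is a hypothetical helper, and it matches quoted key names as text, so it cannot tell whether a key sits at the top level or is nested.

```shell
# Hypothetical smoke test for val29-evidence-matrix.json.
# Argument: $1 = path to the JSON artifact.
has_expected_keys() (
  for key in readiness evidence_campaign design_partner_disclosures; do
    # Stop at the first key name that does not appear in the file.
    grep -q "\"$key\"" "$1" || { echo "missing key: $key" >&2; exit 1; }
  done
)
```

It could be called as `has_expected_keys evidence/cli-audit-lab-YYYY-MM-DD/val29/val29-evidence-matrix.json`. For anything beyond a smoke test, a proper JSON parser is the safer choice.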

7. Tooling

| File | Role |
| --- | --- |
| scripts/labs/run_evidence_matrix_val29.sh | VAL29 evidence matrix generator |
| scripts/labs/run_fleet_rollout_proof_report_val25.sh | VAL25 fleet rollout proof (input) |
| scripts/labs/run_ha_proof_report_val26.sh | VAL26 HA proof (input) |
| scripts/labs/run_relay_proof_report_val27.sh | VAL27 relay proof (input) |
| scripts/labs/run_crosscut_proof_report_val28.sh | VAL28 cross-cutting proof (input) |
| docs/tutorials/fleet-rollout-proof-report-validation.md | VAL25 formal plan |
| docs/tutorials/ha-proof-report-validation.md | VAL26 formal plan |
| docs/tutorials/relay-proof-report-validation.md | VAL27 formal plan |
| docs/tutorials/crosscut-proof-report-validation.md | VAL28 formal plan |
| docs/tutorials/evidence-matrix-validation.md | This document |