# VAL29 — AutonomyOps v1 Public-Claim Evidence Matrix
Audience: founders, engineering leads, product managers, and external reviewers making ship/no-ship or claim-level decisions for AutonomyOps v1.
VAL29 is a meta-aggregator, not a test runner. It reads the four proof-report JSON artifacts produced by VAL25–VAL28 and produces a single capability-level evidence matrix with honest, per-claim readiness assessments. Nothing is soft-pedalled. BETA claims are labelled explicitly. Gaps are stated precisely.
## 1. Scope
VAL29 reads from:
| Source | Contents |
|---|---|
| VAL25 proof report | Fleet rollout proof (VAL07–VAL11) |
| VAL26 proof report | HA control-plane proof (VAL13–VAL17) |
| VAL27 proof report | Edge relay proof (VAL19–VAL23, optional VAL24) |
| VAL28 proof report | Cross-cutting proof (VAL01–VAL06) |
VAL29 does not re-run any tests. It is idempotent and read-only.
Before VAL29 may emit DESIGN PARTNER READY, the four proof reports must also fall within a single 7-day evidence campaign window, and a disclosure artifact must exist at `val29/design-partner-disclosures.json` inside the cli-audit-lab evidence directory.
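The two readiness preconditions above can be sketched as follows. This is an illustrative sketch, not the VAL29 implementation: the function names and the idea of comparing report timestamps are assumptions; only the 7-day window and the `val29/design-partner-disclosures.json` path come from this document.

```python
from datetime import datetime, timedelta
from pathlib import Path

CAMPAIGN_WINDOW = timedelta(days=7)  # single 7-day evidence campaign window

def within_campaign_window(timestamps):
    """True when all four proof-report timestamps span at most 7 days."""
    ts = sorted(timestamps)
    return ts[-1] - ts[0] <= CAMPAIGN_WINDOW

def disclosure_artifact_present(evidence_dir):
    """The disclosure artifact must exist before DESIGN PARTNER READY."""
    return (Path(evidence_dir) / "val29" / "design-partner-disclosures.json").is_file()

# Four reports spanning 5 days pass; a 9-day spread fails the window.
ok = within_campaign_window([
    datetime(2024, 6, 1), datetime(2024, 6, 2),
    datetime(2024, 6, 4), datetime(2024, 6, 6),
])
bad = within_campaign_window([datetime(2024, 6, 1), datetime(2024, 6, 10)])
print(ok, bad)  # True False
```

Both checks are read-only, matching VAL29's idempotent, no-test-rerun contract.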
### What VAL29 covers
The full v1 claim set across four capability groups:
| Group | Claims |
|---|---|
| Fleet Rollouts | Latency, throughput, stuck detection, rollback, chaos, soak, PG backend |
| HA Control Plane | Failover, zero data loss, replication lag, backup/restore, split-brain, quorum, soak |
| Edge Relay | Outage, impairment, throughput, queue/overflow, deadletter, bandwidth, soak, multi-peer |
| Cross-Cutting | Cert rotation, trust-chain, RBAC, audit, OTel, support bundle, PG audit, external audit |
### What VAL29 does NOT cover
- OS Reconstruction (hardware-gated, out of scope for the current suite)
- Multi-architecture container validation (Gate E, hardware-gated)
- Native riscv64 hardware CI (Gate E, hardware-gated)
- Any item requiring satellite/cellular connectivity hardware
## 2. Evidence State Definitions
| State | Meaning |
|---|---|
| VALIDATED | Claim fully supported by completed VAL runs with all checks passing |
| BETA | Claim supported but with documented limitations; must be disclosed to users |
| NOT_STARTED | Framework / tooling exists; Gate D not yet run (30-day soaks) |
| DEFER | Not validated; additional engineering or hardware required |
| FUTURE_REQUIRED | Required for Public Production Claim; beyond current VAL scope |
## 3. Recommendation Definitions
| Recommendation | Meaning |
|---|---|
| OK_DESIGN_PARTNER | Safe to include in a design partner ship as-is |
| BETA_ONLY | May ship to design partners with explicit written disclosure |
| NOT_STARTED | Cannot ship until Gate D passes (30-day soaks) |
| DEFER | Must not claim until the gap is closed; omit from marketing until then |
| FUTURE_REQUIRED | Must not claim until third-party evidence is obtained |
## 4. Evidence Matrix
### Fleet Rollouts
| ID | Claim | Evidence | State | Recommendation |
|---|---|---|---|---|
| FR-PERF-01 | Plan creation latency p99 ≤ 500 ms | VAL07 | VALIDATED | OK_DESIGN_PARTNER |
| FR-THRU-01 | N=100 concurrent device rollouts; zero errors | VAL08 | VALIDATED | OK_DESIGN_PARTNER |
| FR-RECV-01 | Stuck rollout detection + recovery (retry/rollback) | VAL09 | VALIDATED | OK_DESIGN_PARTNER |
| FR-RECV-02 | Rollback success rate ≥ 99% (aggregate over 10 plans) | VAL10 | VALIDATED | OK_DESIGN_PARTNER |
| FR-CHOS-01 | Fleet chaos resilience: CP restart, kill cycles, corrupt artifacts | VAL11 | VALIDATED | OK_DESIGN_PARTNER |
| FR-SOAK-01 | 30-day fleet soak: ≥ 100 plans, rollback rate ≥ 0.990 (Gate D) | VAL12 | NOT_STARTED | NOT_STARTED |
| FR-INFR-01 | PostgreSQL backend: CP runs on PG with full validation | None | DEFER | DEFER |
| FR-PERF-02 | Throughput recalibrated on production-representative hardware | None | DEFER | DEFER |
Key limitations:

- FR-CHOS-01: SIGTERM only; SIGKILL-based kills and `iptables` chaos were not tested.
- FR-SOAK-01: The VAL12 framework is complete; Gate D requires a 30-day continuous run.
- FR-INFR-01 / FR-PERF-02: All fleet VALs run against SQLite on a single host. The PG backend and production hardware must be validated before a GA claim.
### HA Control Plane
| ID | Claim | Evidence | State | Recommendation |
|---|---|---|---|---|
| HA-FAIL-01 | Leader failover ≤ 5,000 ms: SIGTERM, SIGKILL, 3× rapid cycles | VAL13 | VALIDATED | OK_DESIGN_PARTNER |
| HA-FAIL-02 | Zero data loss across leader failover | VAL13 | VALIDATED | OK_DESIGN_PARTNER |
| HA-REPL-01 | Replication lag distribution; derived alerting thresholds | VAL14 | BETA | BETA_ONLY |
| HA-BKUP-01 | Backup/restore: timing ≤ 30 s / 60 s, SHA-256 integrity, correctness | VAL15 | VALIDATED | OK_DESIGN_PARTNER |
| HA-SBRC-01 | Split-brain detection (epoch divergence) + manual recovery | VAL16 | VALIDATED | OK_DESIGN_PARTNER |
| HA-QRMO-01 | Quorum loss: detected ≤ 30,000 ms, writes blocked, recovery | VAL17 | VALIDATED | OK_DESIGN_PARTNER |
| HA-FAIL-03 | Streaming-replication promotion (standby → primary, real PG HA) | None | DEFER | DEFER |
| HA-SOAK-01 | 30-day HA soak: ≥ 3 failovers, failover_ms ≤ 10,000, continuity = 1.0 | VAL18 | NOT_STARTED | NOT_STARTED |
| HA-CHOS-01 | Real network partition chaos (iptables / tc) | None | DEFER | DEFER |
Key limitations:

- HA-REPL-01: Alerting thresholds (healthy/degraded/alert ms) are derived from Docker `write_lag` measurements. Docker disk I/O is materially faster than cloud VMs, so these thresholds must be recalibrated against production `write_lag` observations before any alerting deployment. Treat them as informational for design partners.
- HA-SBRC-01: SQL metadata injection only; no real network partitions. Automatic split-brain recovery is out of scope (manual `promote-leader` only).
- HA-QRMO-01: `docker stop/start` only, not `iptables`; the write gate was verified via status fields (no `/v1/rollouts` endpoint on this binary).
- HA-FAIL-03 / HA-CHOS-01: Required for the GA claim.
### Edge Relay
| ID | Claim | Evidence | State | Recommendation |
|---|---|---|---|---|
| RL-IMPW-01 | Outage handling: all segments → DEADLETTER within max_retry exhaustion | VAL19 | VALIDATED | OK_DESIGN_PARTNER |
| RL-IMPW-02 | Bandwidth impairment (1/10 Mbps): delivery confirmed, throughput informational | VAL19 | BETA | BETA_ONLY |
| RL-IMPW-03 | Latency impairment (200/500 ms): delivery confirmed within dial/ack timeouts | VAL19 | VALIDATED | OK_DESIGN_PARTNER |
| RL-THRU-01 | Throughput: 5 tiers T1–T5, zero loss, queue monotonicity | VAL20 | BETA | BETA_ONLY |
| RL-QMGM-01 | Queue drain (200×64 B), LRU eviction accounting, relay-status accuracy | VAL21 | VALIDATED | OK_DESIGN_PARTNER |
| RL-DEAD-01 | Deadletter: retry → delivery (Group R = 1.000), BoltDB retention, purge | VAL22 | VALIDATED | OK_DESIGN_PARTNER |
| RL-BAND-01 | Bandwidth management: unlimited/rate-only/quota-only/hot-reload; S-E unit test | VAL23 | BETA | BETA_ONLY |
| RL-SOAK-01 | 30-day relay soak: rounds ≥ 1,440, clean_delivery_rate ≥ 0.990, loss = 0 | VAL24 | NOT_STARTED / VALIDATED | NOT_STARTED / OK_DESIGN_PARTNER |
| RL-MULT-01 | Multi-peer relay: delivery isolation between ≥ 2 concurrent peers | None | DEFER | DEFER |
| RL-CONN-01 | Contested-connectivity (satellite, cellular, WAN loss > 5%) | None | DEFER | DEFER |
| RL-CRAS-01 | BoltDB crash consistency on unclean power failure | None | DEFER | FUTURE_REQUIRED |
Key limitations:

- RL-IMPW-02: No hard throughput SLA under bandwidth impairment. Figures are informational, taken from proxy stats (`last_conn_bytes / last_conn_ms`). SLA-grade throughput targets under impairment have not been defined or measured.
- RL-THRU-01: segs/sec and bytes/sec figures come from single-host Docker runs only; there is no production hardware baseline. No hard throughput floor is set; zero-loss correctness is the only validated property.
- RL-BAND-01: S-E (daily quota reset) is validated by an injected-clock unit test only; a live 24-hour run has not been performed. This must be resolved (live run or formal exception approval) before the bandwidth-management claim can leave beta.
- RL-SOAK-01: State is driven by the direct `soak_val24.gate_d_overall` signal from VAL27. Until Gate D passes, the row remains NOT_STARTED.
- RL-CONN-01: Must be explicitly disclosed as a gap when shipping to design partners who expect satellite/cellular support.
- RL-CRAS-01: Power-failure crash consistency is required for the Public Production Claim.
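The dual-valued RL-SOAK-01 row resolves to one state per run. A minimal sketch of that derivation, assuming only that VAL27's report exposes the `soak_val24.gate_d_overall` boolean named above (the surrounding report structure is illustrative, not the real VAL27 schema):

```python
def rl_soak_01_row(val27_report: dict) -> tuple:
    """Derive the RL-SOAK-01 (state, recommendation) pair from VAL27.

    gate_d_overall=True means the 30-day relay soak passed Gate D;
    anything else leaves the row NOT_STARTED.
    """
    gate_d = val27_report.get("soak_val24", {}).get("gate_d_overall", False)
    if gate_d:
        return ("VALIDATED", "OK_DESIGN_PARTNER")
    return ("NOT_STARTED", "NOT_STARTED")

print(rl_soak_01_row({"soak_val24": {"gate_d_overall": True}}))
print(rl_soak_01_row({}))  # missing signal defaults to NOT_STARTED
```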
### Cross-Cutting
| ID | Claim | Evidence | State | Recommendation |
|---|---|---|---|---|
| XC-CERT-01 | Zero-downtime cert rotation: detection, mTLS continuity, timing ≤ 300 s, audit | VAL01 | VALIDATED | OK_DESIGN_PARTNER |
| XC-CERT-02 | Trust-chain rejection: missing, invalid chain, expired, revoked, wrong server | VAL02 | VALIDATED | OK_DESIGN_PARTNER |
| XC-RBAC-01 | RBAC enforcement: 5 DENY + 5 ALLOW + 3 NOT_GUARDED + 1 AUDIT check | VAL03 | VALIDATED¹ | OK_DESIGN_PARTNER |
| XC-AUDT-01 | Audit completeness: 25 event types, 6 categories, schema, latency ≤ 2,000 ms | VAL04 | VALIDATED | OK_DESIGN_PARTNER |
| XC-OTEL-01 | OTel: Prometheus /metrics, WAL pipeline, OTLP flush, trace ID propagation | VAL05 | VALIDATED | OK_DESIGN_PARTNER |
| XC-BNDL-01 | Support bundle: archive ≤ 30 s, 6 collectors, secrets redacted, degraded mode | VAL06 | VALIDATED | OK_DESIGN_PARTNER |
| XC-AUDT-02 | PG-backed audit store: query performance under load | None | DEFER | DEFER |
| XC-OTEL-02 | OTel pipeline validated against production-grade OTLP collector | None | DEFER | DEFER |
| XC-SCRT-01 | External security audit of cert management and RBAC surfaces | None | FUTURE_REQUIRED | FUTURE_REQUIRED |
| XC-COMP-01 | Compliance audit of audit completeness (SOC 2, etc.) | None | FUTURE_REQUIRED | FUTURE_REQUIRED |
¹ VAL03 checks can be SKIP (not FAIL) when the HA server is unavailable. If any checks were skipped, re-run with HA server available to confirm full 14-check coverage before a GA claim.
Key limitations:

- XC-CERT-01: The CRL is loaded at CP start; runtime cert revocation requires a CP restart. Tested with a SQLite-backed CP only.
- XC-OTEL-01: The OTLP sink is a local test server (127.0.0.1:14318), not a production collector. Metrics use `prometheus/client_golang` (not OTel SDK metrics).
- XC-BNDL-01: Tested with synthetic secrets (known `deadbeef` salt and `val06-secret-pass` password). Production secret-scanning completeness has not been independently audited.
- XC-AUDT-02: The `--pg-url` audit path was not load-tested; SQLite is the only audited backend. Required before GA.
- XC-SCRT-01 / XC-COMP-01: Third-party evidence is required for the Public Production Claim.
## 5. Readiness Levels

### Design Partner Ready
All four proof reports (VAL25/VAL26/VAL27/VAL28) must confirm `design_partner: true`, the reports must fall within one 7-day evidence campaign window, and the following BETA disclosures must be recorded in `val29/design-partner-disclosures.json` and made in writing to each design partner:
- Replication lag alerting thresholds (HA-REPL-01): derived from Docker measurements; must be recalibrated against production `write_lag` before alerting deployment.
- Relay throughput figures (RL-THRU-01): single-host Docker only; no production hardware calibration; no hard throughput SLA.
- Relay bandwidth impairment throughput (RL-IMPW-02): informational proxy stats only; no SLA-grade target defined.
- Relay daily quota reset (RL-BAND-01): validated by an injected-clock unit test only; a live 24-hour run has not been performed.
- Relay 30-day soak (RL-SOAK-01): not yet completed; reliability claims for long-running deployments are provisional until Gate D passes.
- Contested connectivity (RL-CONN-01): not validated; satellite/cellular connectivity is out of scope for all current relay VALs.
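The Design Partner Ready gate can be sketched as a pair of checks: every proof report asserts `design_partner: true`, and every required disclosure string appears in the artifact. The disclosure strings and the `design_partner` / `written_disclosures` keys come from this document; the rest of the report shape is an assumption.

```python
# The six disclosure identifiers required by the design-partner gate,
# exactly as written to design-partner-disclosures.json.
REQUIRED_DISCLOSURES = {
    "ha_replication_lag_thresholds_docker_derived",
    "relay_throughput_single_host_only",
    "relay_impairment_throughput_informational_only",
    "relay_daily_quota_reset_unit_test_only",
    "relay_soak_reliability_provisional_until_gate_d",
    "relay_contested_connectivity_not_validated",
}

def design_partner_ready(proof_reports, disclosure_doc):
    """All four reports must set design_partner: true, and every
    required BETA disclosure must be recorded in the artifact."""
    flags_ok = all(r.get("design_partner") is True for r in proof_reports)
    recorded = set(disclosure_doc.get("written_disclosures", []))
    return flags_ok and REQUIRED_DISCLOSURES <= recorded

reports = [{"design_partner": True}] * 4   # VAL25..VAL28 stand-ins
disclosures = {"written_disclosures": sorted(REQUIRED_DISCLOSURES)}
print(design_partner_ready(reports, disclosures))  # True
```

(The 7-day campaign-window check described in section 1 would sit alongside these two conditions.)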
### GA Ready
Design Partner PLUS all of:

- VAL12 Gate D (fleet 30-day soak): `rollback_success_rate` ≥ 0.990
- VAL18 Gate D (HA 30-day soak): failovers ≥ 3, `failover_ms` ≤ 10,000, `data_continuity_rate` = 1.0, `ha_uptime_pct` ≥ 99.9
- VAL24 Gate D (relay 30-day soak): rounds ≥ 1,440, `clean_delivery_rate` ≥ 0.990, loss = 0
- Multi-peer relay validation (at least basic 2-peer delivery + isolation)
- VAL20 throughput recalibrated on production-representative hardware
- VAL14 alerting thresholds recalibrated against production `write_lag`
- VAL23 S-E daily reset: live 24-hour run OR formally approved exception
- VAL03 full 14-check coverage with the HA server (no SKIPs)
- Streaming-replication promotion failover (HA-FAIL-03)
- PG-backed audit store query performance (XC-AUDT-02)
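The three Gate D soak criteria above are simple threshold predicates. A minimal sketch, assuming each soak summary is a flat dict keyed by the metric names used in this document (the dict shape itself is an assumption):

```python
def fleet_gate_d(s):
    """VAL12 fleet 30-day soak gate."""
    return s["rollback_success_rate"] >= 0.990

def ha_gate_d(s):
    """VAL18 HA 30-day soak gate."""
    return (s["failovers"] >= 3
            and s["failover_ms"] <= 10_000
            and s["data_continuity_rate"] == 1.0
            and s["ha_uptime_pct"] >= 99.9)

def relay_gate_d(s):
    """VAL24 relay 30-day soak gate."""
    return (s["rounds"] >= 1_440
            and s["clean_delivery_rate"] >= 0.990
            and s["loss"] == 0)

print(ha_gate_d({"failovers": 4, "failover_ms": 8_200,
                 "data_continuity_rate": 1.0, "ha_uptime_pct": 99.95}))  # True
```

Note the asymmetry: `data_continuity_rate` and `loss` are exact equalities (any loss fails the gate), while the other criteria are thresholds.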
### Public Production Claim
GA Ready PLUS:

- BoltDB crash consistency on unclean power failure (RL-CRAS-01)
- Multi-peer relay soaks with message isolation proof
- Production-grade observability: queue depth alerting, deadletter paging, bandwidth quota depletion notification
- Real network partition chaos for HA and relay (HA-CHOS-01)
- External security audit (XC-SCRT-01)
- Penetration testing of mTLS trust-chain boundaries
- Compliance audit of audit event completeness (XC-COMP-01)
- Production-hardware throughput benchmarks for fleet rollouts and relay
## 6. Run the Matrix

### Prerequisites
Run all four proof report generators first:
```bash
# Fleet, HA, and cross-cutting (share a cli-audit-lab evidence dir)
bash scripts/labs/run_fleet_rollout_proof_report_val25.sh \
  evidence/cli-audit-lab-YYYY-MM-DD
bash scripts/labs/run_ha_proof_report_val26.sh \
  evidence/cli-audit-lab-YYYY-MM-DD
bash scripts/labs/run_crosscut_proof_report_val28.sh \
  evidence/cli-audit-lab-YYYY-MM-DD

# Relay (uses auto-discovered standalone evidence dirs)
bash scripts/labs/run_relay_proof_report_val27.sh evidence/
```
### Generate the evidence matrix

Create the disclosure artifact first:
```bash
mkdir -p evidence/cli-audit-lab-YYYY-MM-DD/val29
cat > evidence/cli-audit-lab-YYYY-MM-DD/val29/design-partner-disclosures.json <<'EOF'
{
  "written_disclosures": [
    "ha_replication_lag_thresholds_docker_derived",
    "relay_throughput_single_host_only",
    "relay_impairment_throughput_informational_only",
    "relay_daily_quota_reset_unit_test_only",
    "relay_soak_reliability_provisional_until_gate_d",
    "relay_contested_connectivity_not_validated"
  ]
}
EOF
```
Then run VAL29:

```bash
bash scripts/labs/run_evidence_matrix_val29.sh \
  evidence/cli-audit-lab-YYYY-MM-DD \
  evidence/
```
### Output files
| File | Contents |
|---|---|
| stdout | Evidence matrix report |
| | Same content as stdout |
| | Machine-readable JSON with full matrix |
The JSON artifact contains the full matrix array plus:

- `readiness`
- `evidence_campaign`
- `design_partner_disclosures`
## 7. Tooling
| File | Role |
|---|---|
| `scripts/labs/run_evidence_matrix_val29.sh` | VAL29 evidence matrix generator |
| `scripts/labs/run_fleet_rollout_proof_report_val25.sh` | VAL25 fleet rollout proof (input) |
| `scripts/labs/run_ha_proof_report_val26.sh` | VAL26 HA proof (input) |
| `scripts/labs/run_relay_proof_report_val27.sh` | VAL27 relay proof (input) |
| `scripts/labs/run_crosscut_proof_report_val28.sh` | VAL28 cross-cutting proof (input) |
| | VAL25 formal plan |
| | VAL26 formal plan |
| | VAL27 formal plan |
| | VAL28 formal plan |
| | This document |