Tutorial 05 - Portability: Run Everywhere (amd64, arm64, riscv64)
Objective: Demonstrate that the edge runtime runs deterministically across three CPU architectures (amd64, arm64, riscv64) and two filesystem types (ext4, xfs) — with a single binary, no CGO, and no platform-specific configuration. The portability matrix is the formal evidence artefact for release certification.
What you will demonstrate:
Understand the portability claim and its limits (what is implemented vs. roadmap)
Run the native-arch portability matrix (up to 2 cells: amd64/ext4, plus amd64/xfs if available; no QEMU required)
Understand the full 6-cell matrix (Docker + QEMU for cross-arch)
Inspect the evidence artefacts produced by each cell
Read the CGO-free dependency analysis
Run the portability CI gate (strict exit-1 mode)
Time: ~15 minutes (native-only, no QEMU)
Portability Matrix
The matrix has 6 cells: 3 architectures × 2 filesystems.
          ext4       xfs
amd64     [P] PASS   [P] PASS
arm64     [P] PASS   [P] PASS
riscv64   [P] PASS   [P] PASS
Status key:
[P] PASS — all 4 steps passed
[F] FAIL — at least one step failed
[S] SKIP — arch/FS unavailable (non-native without QEMU, or xfs requires root)
Native (amd64): All cells run directly — no Docker, no QEMU, no root required for ext4. xfs requires a loop device (root) or a host xfs mount.
Cross-arch (arm64, riscv64): Cells run inside Docker containers with QEMU via
binfmt_misc. If Docker or QEMU registration is absent the cell is marked SKIP,
not FAIL. The CI gate does not require SKIP cells to pass.
Evidence:
scripts/portability/core_matrix.sh, Makefile:portability-matrix, Makefile:portability-matrix-full
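The RUN/SKIP decision described above can be sketched as a small planner. This is only an illustration: `plan_cells` and its parameters are hypothetical names, and the assumption that a cross-arch xfs cell also needs an xfs mount is mine, not something the matrix driver is confirmed to do.

```python
from itertools import product

ARCHES = ["amd64", "arm64", "riscv64"]
FILESYSTEMS = ["ext4", "xfs"]


def plan_cells(host_arch, qemu_available, xfs_available):
    """Return {"arch/fs": "RUN" | "SKIP"} for all 6 cells (hypothetical helper)."""
    plan = {}
    for arch, fs in product(ARCHES, FILESYSTEMS):
        native = arch == host_arch
        if not native and not qemu_available:
            plan[f"{arch}/{fs}"] = "SKIP"  # cross-arch needs Docker + QEMU
        elif fs == "xfs" and not xfs_available:
            plan[f"{arch}/{fs}"] = "SKIP"  # xfs needs a loop device (root) or a host mount
        else:
            plan[f"{arch}/{fs}"] = "RUN"
    return plan


if __name__ == "__main__":
    plan = plan_cells("amd64", qemu_available=False, xfs_available=False)
    for cell, action in plan.items():
        print(f"{cell:14s} {action}")
```

On an amd64 host with neither QEMU nor xfs, this plan runs one cell and skips five, which matches the native-only summary shape shown later in this tutorial.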
The CGO-Free Contract
All edge packages compile with CGO_ENABLED=0. The dependency tree contains no
C library calls:
| Package | CGO needed? | Notes |
|---|---|---|
| | No | Pure Go; uses mmap + |
| | No | Pure Go (assembly acceleration optional, not required) |
| | No | Pure Go syscall wrappers; |
| | No | Pure Go HTTP + metric registry |
| | No | Pure Go CLI |
| | No | Pure Go YAML parser |
No #cgo directives appear in any imported package. Cross-compilation is a
GOARCH=<target> go build ./... command — no cross-toolchain, no sysroot.
Evidence:
edge/go.mod, edge/ci/scan_prohibited/main.go (INV-10 import ban), edge/ci/scan_dependencies/main.go
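As a rough illustration of what a CGO-free check looks for, the sketch below scans Go source files for an import of the pseudo-package "C", which is how a Go file opts into cgo. It is not the project's scanner (that lives in edge/ci/scan_dependencies/main.go); `uses_cgo` and `find_cgo_files` are hypothetical names, and the regular expression is a heuristic rather than a Go parser.

```python
import re
from pathlib import Path

# A Go file enables cgo by importing the pseudo-package "C", either as
# `import "C"` or as a bare `"C"` line inside an `import ( ... )` block.
CGO_IMPORT = re.compile(r'^\s*(?:import\s+)?"C"\s*(?://.*)?$', re.MULTILINE)


def uses_cgo(go_source: str) -> bool:
    """Heuristic: True if the source text appears to import "C"."""
    return CGO_IMPORT.search(go_source) is not None


def find_cgo_files(root: str) -> list[str]:
    """Return every *.go file under root that appears to import "C"."""
    hits = []
    for path in sorted(Path(root).rglob("*.go")):
        if uses_cgo(path.read_text(encoding="utf-8", errors="replace")):
            hits.append(str(path))
    return hits


if __name__ == "__main__":
    for hit in find_cgo_files("."):
        print("cgo import found:", hit)
```

A scan of the module's own tree only covers first-party code; checking imported dependencies would require walking the module cache or vendor directory as well.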
What Each Cell Tests
Each matrix cell runs four steps in sequence. A single step failure fails the cell.
Step 1 - Go Unit Tests
On the native arch, runs:
telemetry/ module: WAL durability, safe-point recovery, OTLP drain
lock/ module: BLAKE3 fingerprint stability, canonical serialisation
On cross-arch cells (inside QEMU Docker image), unit tests are run if Docker/QEMU is available; otherwise this step logs a note and the other 3 steps proceed.
Step 2 - Randomised Crash Harness
TestCrashHarness_Randomized (telemetry/crash_harness_test.go) runs ITERATIONS (default: 20)
rounds of:
Append N entries to a WAL (N chosen by seeded PRNG)
SIGKILL the writer goroutine at a random offset
Open the WAL with OpenWAL (recovery path)
Assert recovered entries ≤ written entries (no phantom data)
Assert no sequence gaps in recovered entries
All iterations use the same seed for reproducibility. The seed is captured in the evidence bundle, so a failing run can be reproduced exactly:
make portability-crash-harness SEED=12345 ITERATIONS=100
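The round structure above can be modelled in a few lines of Python. This is a simulation under stated assumptions, not the Go harness: truncating an in-memory byte string stands in for killing the writer, `encode_frame`, `recover`, and `run_harness` are hypothetical names, and the frame layout (4-byte big-endian length prefix plus JSON payload) is borrowed from the WAL verify step.

```python
import json
import random
import struct


def encode_frame(seq: int) -> bytes:
    payload = json.dumps({"seq": seq, "event": "tick"}).encode()
    return struct.pack(">I", len(payload)) + payload  # 4-byte big-endian length prefix


def recover(data: bytes) -> list[int]:
    """Return the seq of every complete frame; stop at the first torn frame."""
    seqs, pos = [], 0
    while pos + 4 <= len(data):
        (length,) = struct.unpack_from(">I", data, pos)
        end = pos + 4 + length
        if end > len(data):
            break  # torn tail: this frame was cut off by the simulated crash
        seqs.append(json.loads(data[pos + 4:end])["seq"])
        pos = end
    return seqs


def run_harness(seed: int, iterations: int = 20) -> None:
    rng = random.Random(seed)  # one seeded PRNG drives every round
    for _ in range(iterations):
        written = rng.randint(1, 50)
        wal = b"".join(encode_frame(s) for s in range(1, written + 1))
        crash_at = rng.randint(0, len(wal))  # simulated kill offset
        recovered = recover(wal[:crash_at])
        assert len(recovered) <= written  # no phantom data
        assert recovered == list(range(1, len(recovered) + 1))  # no sequence gaps


if __name__ == "__main__":
    run_harness(seed=12345, iterations=100)
    print("crash harness sketch: PASS")
```

The two assertions in `run_harness` correspond to the two invariants listed above; a real harness would additionally exercise fsync ordering, which an in-memory model cannot capture.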
Step 3 - Known-Good WAL Verify
TestKnownGoodWALFixture writes exactly 5 entries to a temp WAL.
scripts/portability/wal_verify.py then independently parses the WAL file using
Python (no Go) and asserts:
Each frame has the correct 4-byte big-endian length prefix
Each payload is valid JSON with seq, written_at, event fields
Sequence numbers are 1–5 (no gaps, no duplicates)
This verifies the WAL frame format is stable across architectures — if the frame layout drifts (endianness, alignment), the Python verifier catches it.
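A minimal sketch of such an independent verifier is shown here. It follows the three checks listed above, but `verify_wal` is a hypothetical name and the real scripts/portability/wal_verify.py may be structured differently (for example, it may handle a file header or checksums that this sketch does not know about).

```python
import json
import struct

REQUIRED_FIELDS = ("seq", "written_at", "event")


def verify_wal(data: bytes, expected_entries: int = 5) -> list[dict]:
    """Parse length-prefixed JSON frames and check the known-good invariants."""
    entries, pos = [], 0
    while pos < len(data):
        if pos + 4 > len(data):
            raise ValueError(f"truncated length prefix at offset {pos}")
        (length,) = struct.unpack_from(">I", data, pos)  # big-endian uint32
        payload = data[pos + 4:pos + 4 + length]
        if len(payload) != length:
            raise ValueError(f"truncated payload at offset {pos}")
        entry = json.loads(payload)  # raises ValueError on invalid JSON
        missing = [f for f in REQUIRED_FIELDS if f not in entry]
        if missing:
            raise ValueError(f"frame at offset {pos} is missing {missing}")
        entries.append(entry)
        pos += 4 + length
    seqs = [e["seq"] for e in entries]
    if seqs != list(range(1, expected_entries + 1)):
        raise ValueError(f"unexpected sequence numbers: {seqs}")
    return entries
```

If a writer ever emitted the length prefix in little-endian order, the big-endian decode would yield a very large length and the verifier would fail with a truncated-payload error, which is how endianness drift would surface.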
Step 4 - Atomic Rename Check
Writes a known value to target.tmp, renames atomically to target, reads back.
Produces RENAME_ATOMIC: PASS or RENAME_ATOMIC: FAIL in the evidence JSON.
This is a filesystem capability gate: if atomic rename is broken on the target FS+arch combination (e.g. some network filesystems), the cell fails fast rather than producing a subtly incorrect WAL.
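The write, rename, read-back sequence can be sketched as follows. The file names target.tmp and target come from the description above; the `rename_check` function, the test value, and the exact JSON shape are assumptions made for illustration.

```python
import json
import os
import tempfile


def rename_check(directory: str) -> dict:
    """Write target.tmp, rename it over target, read back, and report the result."""
    tmp = os.path.join(directory, "target.tmp")
    target = os.path.join(directory, "target")
    expected = b"known-value\n"
    try:
        with open(tmp, "wb") as f:
            f.write(expected)
            f.flush()
            os.fsync(f.fileno())  # make the data durable before the rename
        os.replace(tmp, target)  # rename(2); atomic on POSIX filesystems
        with open(target, "rb") as f:
            ok = f.read() == expected and not os.path.exists(tmp)
    except OSError:
        ok = False
    return {"RENAME_ATOMIC": "PASS" if ok else "FAIL"}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(json.dumps(rename_check(d)))
```

Note that a single-process check like this confirms the rename succeeds and the content arrives intact; it does not by itself prove atomicity under concurrent readers or power loss.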
Evidence:
scripts/portability/core_matrix.sh:run_cell() (steps 1–4), scripts/portability/crash_harness.sh
Step 0: Prerequisites
# Go
export PATH=/home/ubuntu/go/bin:$PATH
go version # must be >= 1.23
# Python 3 (for WAL verifier)
python3 --version
# Docker (optional; only needed for arm64/riscv64 cells)
docker info 2>/dev/null | grep "Server Version" || echo "(Docker not available — non-native cells will SKIP)"
Step 1: Run the Native-Arch Matrix (No QEMU Required)
cd /home/ubuntu/vsc_workstation/autonomyops
make portability-matrix
This runs 1 or 2 cells (amd64/ext4, and amd64/xfs if available) with 20 crash harness iterations each.
Expected output (amd64 host, ext4 only):
==> portability matrix: native arch + ext4
Core Matrix seed: 4831762194857362481
Host arch: amd64 Host FS: ext4
Matrix: arches=[amd64 arm64 riscv64] fs=[ext4 xfs] iterations=20
─── cell [amd64/ext4] run-1 ─────────────────────
[1/4] Go unit tests (telemetry + lock)
[1/4] unit tests PASS
[2/4] crash harness (seed=4831762194857362481, iter=20)
[2/4] crash harness PASS
[3/4] WAL verify (known-good WAL)
[3/4] WAL verify PASS
[4/4] atomic rename check
[4/4] atomic rename check PASS
╔═══════════════════════════════════════════════════════════════╗
║ Core Invariant Matrix — Summary ║
╠═══════════════════════════════════════════════════════════════╣
║ amd64/ext4 PASS all 4 steps passed ║
║ amd64/xfs SKIP xfs not native ║
║ arm64/ext4 SKIP non-native arch (--native-only set) ║
║ arm64/xfs SKIP non-native arch (--native-only set) ║
║ riscv64/ext4 SKIP non-native arch (--native-only set) ║
║ riscv64/xfs SKIP non-native arch (--native-only set) ║
╚═══════════════════════════════════════════════════════════════╝
==> portability matrix: native arch + ext4
PASS (1 passed, 5 skipped, 0 failed)
Step 2: Inspect Evidence Artefacts
Each cell produces an evidence directory:
ls evidence/amd64/ext4/run-1/
Expected files:
unit_test.log — go test output (telemetry + lock)
crash_harness.log — crash harness output
crash_harness.json — structured result {pass, seed, iterations, cells}
wal_verify.json — WAL frame verification result
wal_verify_known_good.log — TestKnownGoodWALFixture output
rename_check.json — atomic rename result
Read the structured results:
cat evidence/amd64/ext4/run-1/crash_harness.json | python3 -m json.tool
cat evidence/amd64/ext4/run-1/wal_verify.json | python3 -m json.tool
cat evidence/amd64/ext4/run-1/rename_check.json | python3 -m json.tool
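For a quick overview of one run, the three JSON files can also be loaded together. The sketch below assumes only what is listed above (the file names, and the pass and seed fields of crash_harness.json); `load_evidence` is a hypothetical helper, and the schemas of the other two files are not assumed.

```python
import json
from pathlib import Path


def load_evidence(run_dir: str) -> dict:
    """Map each *.json file name in an evidence directory to its parsed content."""
    return {
        path.name: json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(Path(run_dir).glob("*.json"))
    }


if __name__ == "__main__":
    run_dir = Path("evidence/amd64/ext4/run-1")
    if run_dir.is_dir():
        evidence = load_evidence(str(run_dir))
        for name, doc in evidence.items():
            print(f"{name}: {json.dumps(doc, sort_keys=True)}")
        crash = evidence.get("crash_harness.json", {})
        # "pass" and "seed" are field names listed for crash_harness.json
        print("crash harness pass:", crash.get("pass"), "seed:", crash.get("seed"))
    else:
        print(f"{run_dir} not found; run the matrix first")
```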
Step 3: Reproduce a Specific Run (Seeded)
The seed is stable — any run can be reproduced exactly:
# Reproduce a previous run (replace SEED with the value from evidence/amd64/ext4/run-1/crash_harness.json)
make portability-crash-harness SEED=4831762194857362481 ITERATIONS=20
This produces deterministic output. The same seed + iterations always produces the same sequence of crash points, the same number of surviving entries, and the same assertions.
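The reason a seed is enough to reproduce a run is that every random choice is drawn from one PRNG initialised with it. The sketch below illustrates the principle with Python's `random.Random`; the Go harness uses its own generator, so the concrete values here will not match a real run, and `crash_schedule` is a hypothetical name.

```python
import random


def crash_schedule(seed: int, iterations: int, max_entries: int = 50) -> list[tuple[int, int]]:
    """Return (entries_to_write, crash_after_entry) pairs derived only from the seed."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(iterations):
        n = rng.randint(1, max_entries)
        schedule.append((n, rng.randint(0, n)))
    return schedule


if __name__ == "__main__":
    a = crash_schedule(4831762194857362481, 20)
    b = crash_schedule(4831762194857362481, 20)
    print("identical schedules:", a == b)
```

Determinism holds only if nothing else feeds the schedule: wall-clock time, map iteration order, or a second unseeded generator would each break exact reproduction.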
Step 4: Run the Full 6-Cell Matrix (Requires Docker + QEMU)
# Check QEMU binfmt_misc registration
ls /proc/sys/fs/binfmt_misc/ | grep -E "qemu|arm|riscv" || \
echo "QEMU not registered — install qemu-user-static and enable binfmt_misc"
# Register QEMU (one-time, requires root):
# docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
# Run full matrix
make portability-matrix-full
Expected summary (all 6 cells, full QEMU environment):
╔═══════════════════════════════════════════════════════════════╗
║ Core Invariant Matrix — Summary ║
╠═══════════════════════════════════════════════════════════════╣
║ amd64/ext4 PASS all 4 steps passed ║
║ amd64/xfs PASS all 4 steps passed ║
║ arm64/ext4 PASS all 4 steps passed ║
║ arm64/xfs PASS all 4 steps passed ║
║ riscv64/ext4 PASS all 4 steps passed ║
║ riscv64/xfs PASS all 4 steps passed ║
╚═══════════════════════════════════════════════════════════════╝
PASS (6 passed, 0 skipped, 0 failed)
Note on riscv64: The crash harness and WAL verifier run inside a Docker image using the linux/riscv64 platform via QEMU. Performance is slower (QEMU emulation), but the 4-step verification is identical to native. riscv64 is not yet a tier-1 release target; it is covered by the matrix to prevent regressions as the architecture matures.
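The binfmt_misc check at the start of this step can be made more specific by looking for one handler per cross-arch target. The handler names below (qemu-aarch64, qemu-riscv64) are an assumption based on the usual qemu-user-static naming and may differ on a given host; `missing_handlers` is a hypothetical helper.

```python
import os

# Assumed handler names: qemu-user-static conventionally registers entries
# named qemu-<qemu-arch> under /proc/sys/fs/binfmt_misc.
REQUIRED_HANDLERS = {"arm64": "qemu-aarch64", "riscv64": "qemu-riscv64"}


def missing_handlers(entries: list[str]) -> list[str]:
    """Return the Go arch names whose QEMU binfmt_misc handler is not registered."""
    present = set(entries)
    return [arch for arch, handler in REQUIRED_HANDLERS.items() if handler not in present]


if __name__ == "__main__":
    path = "/proc/sys/fs/binfmt_misc"
    entries = os.listdir(path) if os.path.isdir(path) else []
    missing = missing_handlers(entries)
    print("cross-arch cells would SKIP for:", ", ".join(missing) if missing else "none")
```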
Step 5: Portability CI Gate (Strict Mode)
The CI gate exits 1 if any non-SKIP cell fails. This is the release certification gate.
make portability-ci-gate
echo "Exit: $?"
Expected: Exit: 0
If a cell fails, the gate prints a summary of failing cells and exits 1. The evidence directory contains logs for post-mortem analysis.
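The gate rule stated above (exit 1 if any non-SKIP cell fails) reduces to a few lines. This is a sketch of the rule only; `gate_exit_code` and the shape of the results mapping are hypothetical, and the real gate is whatever the portability-ci-gate Makefile target invokes.

```python
import sys


def gate_exit_code(results: dict) -> int:
    """Return 1 if any cell is FAIL, else 0. SKIP cells never fail the gate."""
    failing = sorted(cell for cell, status in results.items() if status == "FAIL")
    for cell in failing:
        print(f"FAIL {cell}", file=sys.stderr)  # summary of failing cells
    return 1 if failing else 0


if __name__ == "__main__":
    example = {"amd64/ext4": "PASS", "amd64/xfs": "SKIP", "arm64/ext4": "SKIP"}
    print("exit code:", gate_exit_code(example))
```

In a real script the return value would be passed to `sys.exit`. One consequence of this rule worth keeping in mind: a run where every cell is SKIP also exits 0, so the pass/skip counts in the summary should be read alongside the exit code.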
Step 6: Cross-Compile for All Three Architectures
The edge binary (edged) can be cross-compiled from any host without a cross-toolchain:
cd /home/ubuntu/vsc_workstation/autonomyops/edge
for arch in amd64 arm64 riscv64; do
out="/tmp/edged_linux_${arch}"
echo -n "Building ${arch}... "
GOWORK=off GOOS=linux GOARCH=${arch} CGO_ENABLED=0 \
/home/ubuntu/go/bin/go build -o "$out" ./cmd/edged
echo "OK ($(du -sh "$out" | cut -f1))"
file "$out"
done
Expected:
Building amd64... OK (9.4M)
/tmp/edged_linux_amd64: ELF 64-bit LSB executable, x86-64, statically linked
Building arm64... OK (8.9M)
/tmp/edged_linux_arm64: ELF 64-bit LSB executable, ARM aarch64, statically linked
Building riscv64... OK (9.1M)
/tmp/edged_linux_riscv64: ELF 64-bit LSB executable, UCB RISC-V, statically linked
All three binaries are statically linked (no libc dependency). This is the
“run everywhere” property: copy the binary to the target system, configure
edge.toml, and start.
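Where the `file` utility is not installed, the target architecture of each binary can be read directly from the ELF header. The sketch below uses the e_machine values defined by the ELF specification (62 for x86-64, 183 for AArch64, 243 for RISC-V); `elf_arch` is a hypothetical helper, and it does not attempt to confirm static linking.

```python
import os
import struct

# e_machine values from the ELF specification, mapped to Go arch names
MACHINES = {62: "amd64", 183: "arm64", 243: "riscv64"}


def elf_arch(header: bytes) -> str:
    """Map the first 20 bytes of a 64-bit ELF file to a Go arch name."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if header[4] != 2:  # EI_CLASS: 2 = 64-bit
        raise ValueError("not a 64-bit ELF")
    byte_order = "<" if header[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)  # e_machine
    return MACHINES.get(machine, f"unknown({machine})")


if __name__ == "__main__":
    for arch in ("amd64", "arm64", "riscv64"):
        path = f"/tmp/edged_linux_{arch}"
        if os.path.exists(path):
            with open(path, "rb") as f:
                print(path, "->", elf_arch(f.read(20)))
        else:
            print(path, "-> not built yet")
```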
Implementation Status
| Feature | Status | Notes |
|---|---|---|
| amd64 production support | ✅ Implemented | Tier-1; full test coverage |
| arm64 support | ✅ Implemented | Tested via QEMU matrix; production-ready |
| riscv64 support | ✅ In matrix | QEMU-tested; not yet tier-1 release target |
| ext4 support | ✅ Implemented | Native; atomic rename + mmap verified |
| xfs support | ✅ In matrix | Requires loop device or xfs host for full cell |
| Cross-compilation | ✅ Implemented | Single |
| Zero CGO | ✅ Implemented | Verified by |
| Randomised crash harness | ✅ Implemented | |
| Container images (multi-arch) | Roadmap | Docker manifests for amd64+arm64+riscv64 |
| Native riscv64 hardware testing | Roadmap | No hardware-in-the-loop CI yet |
| Windows / macOS edge support | Not planned | Linux-only; |
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| | binfmt_misc not registered | Run |
| | No xfs mount and not root | Use |
| Crash harness fails intermittently | Race in fsync/fdatasync on tmpfs | Run on ext4; tmpfs doesn’t guarantee ordering |
| | Host CGO toolchain missing | Set |
| riscv64 cell very slow | QEMU emulation overhead | Expected; QEMU is ≈30–50× slower than native |
What Just Happened
Established the portability claim: 3 arches × 2 filesystems = 6 cells, all testable
Ran the native-arch matrix (no QEMU, no root) and read the evidence artefacts
Understood each cell’s 4 steps (unit tests, crash harness, WAL verify, rename check)
Verified that the WAL frame format is stable across architectures (Python verifier)
Demonstrated cross-compilation for all three targets from a single host
Established which features are implemented, in matrix, and on the roadmap
Evidence Links
| Claim | File | Symbol |
|---|---|---|
| Matrix driver (6 cells) | | |
| Randomised crash harness | | |
| WAL frame format (Python verifier) | | Frame parser |
| CGO-free scan | | Dependency scanner |
| Mission-layer import ban | | INV-10 rule |
| Cross-arch targets | | |
| Portability CI gate | | |