Tutorial 05 — Portability: Run Everywhere (amd64, arm64, riscv64)¶

Objective: Demonstrate that the edge runtime behaves deterministically across three CPU architectures (amd64, arm64, riscv64) and two filesystem types (ext4, xfs) from a single source tree: one statically linked binary per architecture, no CGO, and no platform-specific configuration. The portability matrix is the formal evidence artefact for release certification.

What you will do:

Understand the portability claim and its limits (what is implemented vs. roadmap)
Run the native-arch portability matrix (1 or 2 cells; no QEMU required)
Understand the full 6-cell matrix (Docker + QEMU for cross-arch)
Inspect the evidence artefacts produced by each cell
Read the CGO-free dependency analysis
Run the portability CI gate (strict exit-1 mode)

Time: ~15 minutes (native-only, no QEMU)

Portability Matrix¶

The matrix has 6 cells: 3 architectures × 2 filesystems.

              ext4          xfs
amd64        [P] PASS      [P] PASS
arm64        [P] PASS      [P] PASS
riscv64      [P] PASS      [P] PASS

Status key:
  [P] PASS — all 4 steps passed
  [F] FAIL — at least one step failed
  [S] SKIP — arch/FS unavailable (non-native without QEMU, or xfs requires root)

Native (amd64): Both native cells run directly on the host, with no Docker and no QEMU. ext4 needs no root; xfs requires a loop device (root) or a host xfs mount.

Cross-arch (arm64, riscv64): Cells run inside Docker containers with QEMU via binfmt_misc. If Docker or QEMU registration is absent the cell is marked SKIP, not FAIL. The CI gate does not require SKIP cells to pass.
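The SKIP rules above can be sketched as a small planning function. This is an illustration only, not the logic of scripts/portability/core_matrix.sh: the name `plan_cells` and the flags `qemu_ok` and `xfs_ok` are hypothetical, and it assumes that xfs availability is decided the same way for every architecture.

```python
from itertools import product

ARCHES = ["amd64", "arm64", "riscv64"]
FILESYSTEMS = ["ext4", "xfs"]


def plan_cells(host_arch, qemu_ok, xfs_ok):
    """Return {(arch, fs): "RUN" | "SKIP"} for the 3 x 2 matrix.

    qemu_ok: Docker and QEMU binfmt_misc registration are both present.
    xfs_ok:  a loop device (root) or a host xfs mount is available.
    """
    plan = {}
    for arch, fs in product(ARCHES, FILESYSTEMS):
        if arch != host_arch and not qemu_ok:
            plan[(arch, fs)] = "SKIP"  # cross-arch cell without emulation
        elif fs == "xfs" and not xfs_ok:
            plan[(arch, fs)] = "SKIP"  # xfs unavailable on this host
        else:
            plan[(arch, fs)] = "RUN"
    return plan
```

On an amd64 host with neither QEMU nor xfs, `plan_cells("amd64", False, False)` marks only amd64/ext4 as RUN and the other five cells as SKIP.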

Evidence: scripts/portability/core_matrix.sh, Makefile:portability-matrix, Makefile:portability-matrix-full

The CGO-Free Contract¶

All edge packages compile with CGO_ENABLED=0. The dependency tree contains no C library calls:

Package	CGO needed?	Notes
`go.etcd.io/bbolt` (BoltDB)	No	Pure Go; uses mmap + `fdatasync(2)` on Linux (`os.File.Sync()` on most other platforms)
`github.com/zeebo/blake3`	No	Pure Go (assembly acceleration optional, not required)
`golang.org/x/sys/unix`	No	Pure Go syscall wrappers; `uname(2)` uses `syscall.RawSyscall`
`github.com/prometheus/client_golang`	No	Pure Go HTTP + metric registry
`github.com/spf13/cobra`	No	Pure Go CLI
`gopkg.in/yaml.v3`	No	Pure Go YAML parser

No #cgo directives appear in any imported package. Cross-compilation is a GOARCH=<target> go build ./... command — no cross-toolchain, no sysroot.
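As a rough illustration of what a CGO scan looks for: a Go file that uses cgo must import the pseudo-package `"C"`. The sketch below is a text-level heuristic under that assumption. It is not the scanner in edge/ci/scan_dependencies/main.go, the name `uses_cgo` is hypothetical, and a regular expression can be fooled by comments or raw string literals.

```python
import re

# Matches `import "C"` on its own line, or a bare "C" entry inside an
# import ( ... ) block, optionally followed by a line comment.
_CGO_IMPORT = re.compile(r'^\s*(?:import\s+)?"C"\s*(?://.*)?$', re.MULTILINE)


def uses_cgo(go_source: str) -> bool:
    """Heuristically report whether a Go source file imports the cgo package."""
    return bool(_CGO_IMPORT.search(go_source))
```

A string literal such as `var s = "C"` does not match, because the pattern requires the quoted `C` to be the first token on its line (after an optional `import`).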

Evidence: edge/go.mod, edge/ci/scan_prohibited/main.go (INV-10 import ban), edge/ci/scan_dependencies/main.go

What Each Cell Tests¶

Each matrix cell runs four steps in sequence. A single step failure fails the cell.

Step 1 — Go Unit Tests¶

On the native arch, runs:

telemetry/ module: WAL durability, safe-point recovery, OTLP drain
lock/ module: BLAKE3 fingerprint stability, canonical serialisation

On cross-arch cells (inside QEMU Docker image), unit tests are run if Docker/QEMU is available; otherwise this step logs a note and the other 3 steps proceed.

Step 2 — Randomised Crash Harness¶

TestCrashHarness_Randomized (telemetry/crash_harness_test.go) runs ITERATIONS (default: 20) rounds of:

Append N entries to a WAL (N chosen by seeded PRNG)
SIGKILL the writer process at a random offset
Open the WAL with OpenWAL (recovery path)
Assert recovered entries ≤ written entries (no phantom data)
Assert no sequence gaps in recovered entries

A single seed drives the PRNG for every iteration, and that seed is captured in the evidence bundle, so a failing run can be reproduced exactly:

make portability-crash-harness SEED=12345 ITERATIONS=100
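The loop above can be simulated in a few lines of Python. This is a sketch under stated assumptions, not the Go harness: it models the crash as truncating the WAL bytes at a random offset instead of killing a writer, it borrows the length-prefixed JSON framing described in Step 3, and the names `encode_frame`, `recover`, and `run_harness` are hypothetical.

```python
import json
import random
import struct


def encode_frame(seq):
    """One WAL frame: 4-byte big-endian length prefix followed by a JSON payload."""
    payload = json.dumps({"seq": seq, "event": "tick"}).encode()
    return struct.pack(">I", len(payload)) + payload


def recover(buf):
    """Return the seq of every complete frame, stopping at the first torn one."""
    seqs, pos = [], 0
    while pos + 4 <= len(buf):
        (length,) = struct.unpack_from(">I", buf, pos)
        end = pos + 4 + length
        if end > len(buf):
            break  # torn tail: the frame was cut off by the crash
        seqs.append(json.loads(buf[pos + 4:end])["seq"])
        pos = end
    return seqs


def run_harness(seed, iterations=20):
    """Run the simulated crash loop; return (written, crash_at, recovered) per round."""
    rng = random.Random(seed)  # one PRNG, seeded once, drives every iteration
    results = []
    for _ in range(iterations):
        written = rng.randint(1, 50)
        wal = b"".join(encode_frame(s) for s in range(1, written + 1))
        crash_at = rng.randint(0, len(wal))  # simulated kill offset in bytes
        recovered = recover(wal[:crash_at])
        assert len(recovered) <= written  # no phantom data
        assert recovered == list(range(1, len(recovered) + 1))  # no sequence gaps
        results.append((written, crash_at, len(recovered)))
    return results
```

Calling `run_harness(12345)` twice returns identical results, which is the property that makes a recorded seed sufficient to replay a failure.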

Step 3 — Known-Good WAL Verify¶

TestKnownGoodWALFixture writes exactly 5 entries to a temp WAL. scripts/portability/wal_verify.py then independently parses the WAL file using Python (no Go) and asserts:

Each frame has the correct 4-byte big-endian length prefix
Each payload is valid JSON with seq, written_at, event fields
Sequence numbers are 1–5 (no gaps, no duplicates)

This verifies the WAL frame format is stable across architectures — if the frame layout drifts (endianness, alignment), the Python verifier catches it.
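A minimal sketch of the three assertions, assuming the frame layout is exactly as listed (a 4-byte big-endian length followed by a JSON object) with nothing else in the file. It is not the contents of scripts/portability/wal_verify.py; the function name `verify_wal` and its error messages are hypothetical.

```python
import json
import struct

REQUIRED_FIELDS = ("seq", "written_at", "event")


def verify_wal(data, expected_count=5):
    """Parse length-prefixed JSON frames and return their sequence numbers.

    Raises ValueError on any violation of the frame format.
    """
    seqs, pos = [], 0
    while pos < len(data):
        if pos + 4 > len(data):
            raise ValueError(f"truncated length prefix at offset {pos}")
        (length,) = struct.unpack_from(">I", data, pos)  # 4-byte big-endian
        payload = data[pos + 4:pos + 4 + length]
        if len(payload) != length:
            raise ValueError(f"truncated payload at offset {pos}")
        try:
            entry = json.loads(payload)
        except ValueError as exc:  # covers invalid JSON and invalid UTF-8
            raise ValueError(f"invalid JSON at offset {pos}") from exc
        if not isinstance(entry, dict):
            raise ValueError(f"payload at offset {pos} is not a JSON object")
        missing = [name for name in REQUIRED_FIELDS if name not in entry]
        if missing:
            raise ValueError(f"frame at offset {pos} is missing {missing}")
        seqs.append(entry["seq"])
        pos += 4 + length
    if seqs != list(range(1, expected_count + 1)):
        raise ValueError(f"unexpected sequence numbers: {seqs}")
    return seqs
```

A length prefix written little-endian by mistake decodes to a very large number, so the parser reports a truncated payload on the first frame. This is how an endianness drift would surface.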

Step 4 — Atomic Rename Check¶

Writes a known value to target.tmp, renames atomically to target, reads back. Produces RENAME_ATOMIC: PASS or RENAME_ATOMIC: FAIL in the evidence JSON.

This is a filesystem capability gate: if atomic rename is broken on the target FS+arch combination (e.g. some network filesystems), the cell fails fast rather than producing a subtly incorrect WAL.
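The check can be sketched as follows. This is an assumption-laden illustration, not the code behind the matrix step: the function name `rename_check`, the test payload, and the exact shape of the returned dictionary are hypothetical. It confirms that a rename over the target succeeds and leaves the expected contents; it does not simulate a crash in the middle of the rename.

```python
import json
import os
import tempfile


def rename_check(directory):
    """Write target.tmp, rename it to target, read it back; return an evidence dict."""
    tmp = os.path.join(directory, "target.tmp")
    target = os.path.join(directory, "target")
    expected = b"portability-check\n"
    try:
        with open(tmp, "wb") as f:
            f.write(expected)
            f.flush()
            os.fsync(f.fileno())  # make the data durable before renaming
        os.replace(tmp, target)  # rename(2) on POSIX; atomic within one filesystem
        with open(target, "rb") as f:
            ok = f.read() == expected and not os.path.exists(tmp)
    except OSError:
        ok = False
    return {"RENAME_ATOMIC": "PASS" if ok else "FAIL"}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(json.dumps(rename_check(d)))
```

To exercise a specific filesystem, pass a directory that lives on it (for example a directory on an xfs mount) instead of the default temporary directory.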

Evidence: scripts/portability/core_matrix.sh:run_cell() (steps 1–4), scripts/portability/crash_harness.sh

Step 0: Prerequisites¶

# Go
export PATH=/home/ubuntu/go/bin:$PATH
go version  # must be >= 1.23

# Python 3 (for WAL verifier)
python3 --version

# Docker (optional; only needed for arm64/riscv64 cells)
docker info 2>/dev/null | grep "Server Version" || echo "(Docker not available — non-native cells will SKIP)"

Step 1: Run the Native-Arch Matrix (No QEMU Required)¶

cd /home/ubuntu/vsc_workstation/autonomyops
make portability-matrix

This runs 1 or 2 cells (amd64/ext4, and amd64/xfs if available) with 20 crash harness iterations each.

Expected output (amd64 host, ext4 only):

==> portability matrix: native arch + ext4
Core Matrix seed: 4831762194857362481
Host arch: amd64  Host FS: ext4
Matrix: arches=[amd64 arm64 riscv64]  fs=[ext4 xfs]  iterations=20

─── cell [amd64/ext4] run-1 ─────────────────────
  [1/4] Go unit tests (telemetry + lock)
  [1/4] unit tests PASS
  [2/4] crash harness (seed=4831762194857362481, iter=20)
  [2/4] crash harness PASS
  [3/4] WAL verify (known-good WAL)
  [3/4] WAL verify PASS
  [4/4] atomic rename check
  [4/4] atomic rename check PASS

╔═══════════════════════════════════════════════════════════════╗
║          Core Invariant Matrix — Summary                     ║
╠═══════════════════════════════════════════════════════════════╣
║  amd64/ext4   PASS  all 4 steps passed                       ║
║  amd64/xfs    SKIP  xfs not native                           ║
║  arm64/ext4   SKIP  non-native arch (--native-only set)      ║
║  arm64/xfs    SKIP  non-native arch (--native-only set)      ║
║  riscv64/ext4 SKIP  non-native arch (--native-only set)      ║
║  riscv64/xfs  SKIP  non-native arch (--native-only set)      ║
╚═══════════════════════════════════════════════════════════════╝
==> portability matrix: native arch + ext4
PASS  (1 passed, 5 skipped, 0 failed)

Step 2: Inspect Evidence Artefacts¶

Each cell produces an evidence directory:

ls evidence/amd64/ext4/run-1/

Expected files:

unit_test.log            — go test output (telemetry + lock)
crash_harness.log        — crash harness output
crash_harness.json       — structured result {pass, seed, iterations, cells}
wal_verify.json          — WAL frame verification result
wal_verify_known_good.log — TestKnownGoodWALFixture output
rename_check.json        — atomic rename result

Read the structured results:

python3 -m json.tool evidence/amd64/ext4/run-1/crash_harness.json
python3 -m json.tool evidence/amd64/ext4/run-1/wal_verify.json
python3 -m json.tool evidence/amd64/ext4/run-1/rename_check.json

Step 3: Reproduce a Specific Run (Seeded)¶

The seed is recorded with every run, so any run can be reproduced exactly:

# Reproduce a previous run (replace SEED with the value from evidence/amd64/ext4/run-1/crash_harness.json)
make portability-crash-harness SEED=4831762194857362481 ITERATIONS=20

The output is deterministic: the same seed and iteration count always produce the same sequence of crash points, the same number of surviving entries, and the same assertions.
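Copying the seed by hand is error-prone for a 19-digit number. A small helper can build the replay command from the evidence file instead. This is a sketch: the function name `reproduce_command` is hypothetical, and it assumes `seed` and `iterations` are top-level keys of crash_harness.json, as the field list in Step 2 suggests.

```python
import json


def reproduce_command(crash_harness_json_path):
    """Build the make invocation that replays a recorded crash-harness run."""
    with open(crash_harness_json_path) as f:
        result = json.load(f)
    # Assumption: "seed" and "iterations" are top-level keys in the evidence file.
    return (
        "make portability-crash-harness "
        f"SEED={result['seed']} ITERATIONS={result['iterations']}"
    )


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 2:
        print(reproduce_command(sys.argv[1]))
```

Saved under a name of your choice, it would be run with evidence/amd64/ext4/run-1/crash_harness.json as its only argument.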

Step 4: Run the Full 6-Cell Matrix (Requires Docker + QEMU)¶

# Check QEMU binfmt_misc registration
ls /proc/sys/fs/binfmt_misc/ | grep -E "qemu|arm|riscv" || \
  echo "QEMU not registered — install qemu-user-static and enable binfmt_misc"

# Register QEMU (requires root; registrations made this way are lost on reboot):
# docker run --rm --privileged multiarch/qemu-user-static --reset -p yes

# Run full matrix
make portability-matrix-full

Expected summary (all 6 cells, full QEMU environment):

╔═══════════════════════════════════════════════════════════════╗
║          Core Invariant Matrix — Summary                     ║
╠═══════════════════════════════════════════════════════════════╣
║  amd64/ext4   PASS  all 4 steps passed                       ║
║  amd64/xfs    PASS  all 4 steps passed                       ║
║  arm64/ext4   PASS  all 4 steps passed                       ║
║  arm64/xfs    PASS  all 4 steps passed                       ║
║  riscv64/ext4 PASS  all 4 steps passed                       ║
║  riscv64/xfs  PASS  all 4 steps passed                       ║
╚═══════════════════════════════════════════════════════════════╝
PASS  (6 passed, 0 skipped, 0 failed)

Note on riscv64: The crash harness and WAL verifier run inside a Docker image using linux/riscv64 platform via QEMU. Performance is slower (QEMU emulation), but the 4-step verification is identical to native. riscv64 is not yet a tier-1 release target — it is covered by the matrix to prevent regressions as the architecture matures.

Step 5: Portability CI Gate (Strict Mode)¶

The CI gate exits 1 if any non-SKIP cell fails. This is the release certification gate.

make portability-ci-gate
echo "Exit: $?"

Expected: Exit: 0

If a cell fails, the gate prints a summary of failing cells and exits 1. The evidence directory contains logs for post-mortem analysis.
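The gate's decision rule is simple enough to state as code. The sketch below is illustrative and is not the implementation behind `make portability-ci-gate`; the function name `gate_exit_code`, the input shape, and the "failing cell" line are hypothetical, and the summary line only mirrors the format shown in Step 1.

```python
def gate_exit_code(cells):
    """Return 1 if any cell FAILED, else 0. SKIP cells never fail the gate.

    cells maps "arch/fs" to one of "PASS", "FAIL", "SKIP".
    """
    failed = sorted(name for name, status in cells.items() if status == "FAIL")
    passed = sum(1 for status in cells.values() if status == "PASS")
    skipped = sum(1 for status in cells.values() if status == "SKIP")
    verdict = "FAIL" if failed else "PASS"
    print(f"{verdict}  ({passed} passed, {skipped} skipped, {len(failed)} failed)")
    for name in failed:
        print(f"  failing cell: {name}")
    return 1 if failed else 0
```

Under this rule a run in which every cell is SKIP would also exit 0. Whether the real gate treats that case as a pass is not stated here and is worth checking before relying on it for certification.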

Step 6: Cross-Compile for All Three Architectures¶

The edge binary (edged) can be cross-compiled from any host without a cross-toolchain:

cd /home/ubuntu/vsc_workstation/autonomyops/edge

for arch in amd64 arm64 riscv64; do
  out="/tmp/edged_linux_${arch}"
  echo -n "Building ${arch}... "
  GOWORK=off GOOS=linux GOARCH=${arch} CGO_ENABLED=0 \
    /home/ubuntu/go/bin/go build -o "$out" ./cmd/edged
  echo "OK ($(du -sh "$out" | cut -f1))"
  file "$out"
done

Expected:

Building amd64... OK (9.4M)
/tmp/edged_linux_amd64: ELF 64-bit LSB executable, x86-64, statically linked
Building arm64... OK (8.9M)
/tmp/edged_linux_arm64: ELF 64-bit LSB executable, ARM aarch64, statically linked
Building riscv64... OK (9.1M)
/tmp/edged_linux_riscv64: ELF 64-bit LSB executable, UCB RISC-V, statically linked

All three binaries are statically linked (no libc dependency). This is the “run everywhere” property: copy the binary to the target system, configure edge.toml, and start.
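If `file` is not installed on the build host, the target architecture can be read straight from the ELF header. The sketch below assumes 64-bit little-endian binaries, which matches the three targets above; the function name `elf_arch` is hypothetical, and the `e_machine` constants come from the ELF specification. It does not check static linking.

```python
import struct

# e_machine values: EM_X86_64, EM_AARCH64, EM_RISCV
ELF_MACHINES = {62: "amd64", 183: "arm64", 243: "riscv64"}


def elf_arch(header):
    """Return the Go-style arch name for a 64-bit little-endian ELF header."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if header[4] != 2 or header[5] != 1:  # EI_CLASS = ELFCLASS64, EI_DATA = LSB
        raise ValueError("expected a 64-bit little-endian ELF file")
    (machine,) = struct.unpack_from("<H", header, 18)  # e_machine at offset 18
    return ELF_MACHINES.get(machine, f"unknown({machine})")
```

Usage: open /tmp/edged_linux_arm64 in binary mode, read the first 20 bytes, and pass them to `elf_arch`.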

Implementation Status¶

Feature	Status	Notes
amd64 production support	✅ Implemented	Tier-1; full test coverage
arm64 support	✅ Implemented	Tested via QEMU matrix; production-ready
riscv64 support	✅ In matrix	QEMU-tested; not yet tier-1 release target
ext4 support	✅ Implemented	Native; atomic rename + mmap verified
xfs support	✅ In matrix	Requires loop device or xfs host for full cell
Cross-compilation	✅ Implemented	Single `go build` command; no cross-toolchain
Zero CGO	✅ Implemented	Verified by `edge/ci/scan_dependencies/main.go`
Randomised crash harness	✅ Implemented	`TestCrashHarness_Randomized`; seeded, reproducible
Container images (multi-arch)	Roadmap	Docker manifests for amd64+arm64+riscv64
Native riscv64 hardware testing	Roadmap	No hardware-in-the-loop CI yet
Windows / macOS edge support	Not planned	Linux-only; `unix.Uname` + mmap

Troubleshooting¶

Symptom	Cause	Fix
`QEMU not available for arm64`	binfmt_misc not registered	Run `docker run --rm --privileged multiarch/qemu-user-static --reset -p yes`
`xfs not available`	No xfs mount and not root	Use `--native-fs-only` flag, or run as root with `mkfs.xfs`
Crash harness fails intermittently	Race in fsync/fdatasync on tmpfs	Run on ext4; tmpfs doesn’t guarantee ordering
`CGO_ENABLED=1` cross-compile fails	Host CGO toolchain missing	Set `CGO_ENABLED=0` (all edge packages support it)
riscv64 cell very slow	QEMU emulation overhead	Expected; QEMU is ≈30–50× slower than native

What Just Happened¶

Established the portability claim: 3 arches × 2 filesystems = 6 cells, all testable
Ran the native-arch matrix (no QEMU, no root) and read the evidence artefacts
Understood each cell’s 4 steps (unit tests, crash harness, WAL verify, rename check)
Verified that the WAL frame format is stable across architectures (Python verifier)
Demonstrated cross-compilation for all three targets from a single host
Established which features are implemented, in matrix, and on the roadmap

Evidence Links¶

Claim	File	Symbol
Matrix driver (6 cells)	`scripts/portability/core_matrix.sh`	`run_cell()`
Randomised crash harness	`telemetry/crash_harness_test.go`	`TestCrashHarness_Randomized`
WAL frame format (Python verifier)	`scripts/portability/wal_verify.py`	Frame parser
CGO-free scan	`edge/ci/scan_dependencies/main.go`	Dependency scanner
Mission-layer import ban	`edge/ci/scan_prohibited/main.go`	INV-10 rule
Cross-arch targets	`Makefile`	`portability-matrix-full`, `PORTABILITY_ARCHES`
Portability CI gate	`Makefile`	`portability-ci-gate`
