Container Hardening + Syscall Mediation¶

Audience: operators turning on, observing, or recovering the per-workload defense-in-depth layers that sit below /v1/tool — seccomp profile, LD_PRELOAD libc-wrapper shim, Linux-capability drops, read-only rootfs, and a hermetic bypass-resistance test that proves the layers compose. This is the runtime-side companion to launch-level policy mediation: the ROS 2 Governed Bridge runbook covers per-message DDS governance; this runbook covers the syscall boundary between the workload process and the kernel.

The hardening is opt-in per layer with one exception: --seccomp-profile is on by default (the binary embeds a shipped starter profile). Every other layer is operator-set per invocation. The decision to apply each layer should be made together with the threat model in Phases and threat model below.

Walking through it first? Start at Tutorial — Container hardening hands-on; it turns each layer on against a simple workload + runs the bypass-resistance test. The runbook below assumes you already know what each layer does and need to ship it.

Prerequisites¶

docker on PATH. Every layer in this runbook is applied via docker flags or container env injection and has no native-subprocess analog: the runtime fails loudly with ErrHardeningRequiresContainer rather than silently degrading (see Failure modes below).
ghcr.io/autonomyops/adk-ros2-runtime:<version> present locally. Pull with docker pull ghcr.io/autonomyops/adk-ros2-runtime:latest, or build from source with docker build -t ghcr.io/autonomyops/adk-ros2-runtime:local -f demo/ros2-runtime/Dockerfile . (the trailing . is the build context). The shipped image bakes the LD_PRELOAD shim at /usr/local/lib/libautonomy_preload.so (#960 Phase 4b-2). Custom images need to bake it themselves or download the per-arch release asset (see Path B in the shim README).
A policy bundle that declares a dlopen_allowlist block (manifest schema v1.5) if you plan to enable Phase 2 — see bundle/manifest.go for the schema. Phase 1 / 3 / 4 / 5 do not require a manifest update.

Phases and threat model¶

The hardening is layered defense-in-depth: each phase closes a specific class of bypass that the others cannot statically discriminate. Operators choose how many layers to apply based on the workload’s trust level.

Phase	Layer	Closes	Default
1	`--seccomp-profile <path>`	Container-escape primitives the kernel sees regardless of path (`mount`, `setns`, `unshare`, `ptrace`, `kexec_*`, `bpf`, `perf_event_open`, …)	on — embedded starter profile applied unless `--no-seccomp` is set
2	`--dlopen-allowlist-from-bundle <path>` + LD_PRELOAD shim	Untrusted C++ / ROS 2 plugin loading via `dlopen()` outside an allowlist	off
3	`--preload-connect-allowlist <ip>:<port>,…` + LD_PRELOAD shim	Egress destinations at the user-space layer (complements egress DLP scanner from #710 — that gates payload content; this gates the destination set)	off
4	LD_PRELOAD shim (`--ld-preload`) + `--preload-exec-allowlist`	Subprocess spawn outside an allowlist (`execve`, `execveat`) — path-name discrimination seccomp can’t do statically	off
5a	`--cap-drop <CAP>`	Linux capabilities the workload doesn’t need (`NET_RAW`, `SYS_PTRACE`, `SYS_ADMIN`, …)	off
5b	`--read-only-rootfs` + `--tmpfs <path>`	Workload’s writes to the container rootfs; only the explicit `--tmpfs` mount is writable	off
6	`runtime/exec/composition_integration_test.go`	Proof that all layers compose without conflict against a deliberately-malicious workload	hermetic test

What this does NOT do¶

Does not replace /v1/tool. Application-layer policy mediation remains the primary trust boundary. The syscall layer is defense-in-depth so a compromised process can’t escape /v1/tool by calling libc directly.
Does not gate trusted runtime processes (the Go runtime itself, the orchestrator). Only the workload subprocess that autonomy run spawns is gated.
Does not require an LSM (AppArmor, SELinux). The baseline is portable to any docker/podman host. AppArmor profiles are a follow-up if specific deployments need them.
Does not provide language-level interception for Python (sys.meta_path, monkey-patched subprocess, etc.). That’s the companion epic #961 — depends on this one for defense-in-depth.

Phase 1 — Seccomp profile (default-on)¶

The CLI applies the embedded starter profile to the workload container unless --no-seccomp is set. The starter denies the 21 syscalls that fall outside any normal application’s needs: mount, umount, umount2, pivot_root, setns, unshare, mknod, mknodat, kexec_load, kexec_file_load, init_module, finit_module, delete_module, iopl, ioperm, swapon, swapoff, reboot, bpf, perf_event_open, ptrace.

Enable (default; nothing to do)¶

autonomy ros2 run --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    launch demo_robot arm_demo.launch.py
# → docker --security-opt seccomp=<embedded-starter-path>

Customize (operator-supplied profile)¶

autonomy ros2 run --seccomp-profile /etc/autonomy/custom.seccomp.json \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    launch demo_robot arm_demo.launch.py

The path must be absolute. The shipped starter is at runtime/seccomp/runtime-starter.seccomp.json — copy it as a starting point.
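A custom profile can also be generated rather than hand-edited. The snippet below is a minimal sketch, assuming the profile follows Docker's seccomp JSON layout (a defaultAction plus a syscalls list) with a default-allow action. The shipped runtime-starter.seccomp.json may be structured differently, and build_profile with its extra_denied parameter is a hypothetical helper, not part of the project's tooling.

```python
import json

# Syscalls this runbook lists as denied by the starter profile.
DENIED_SYSCALLS = [
    "mount", "umount", "umount2", "pivot_root", "setns", "unshare",
    "mknod", "mknodat", "kexec_load", "kexec_file_load", "init_module",
    "finit_module", "delete_module", "iopl", "ioperm", "swapon", "swapoff",
    "reboot", "bpf", "perf_event_open", "ptrace",
]


def build_profile(extra_denied=()):
    """Return a Docker-style seccomp profile dict (assumed layout)."""
    names = sorted(set(DENIED_SYSCALLS) | set(extra_denied))
    return {
        "defaultAction": "SCMP_ACT_ALLOW",
        "syscalls": [
            # errnoRet 1 is EPERM, matching the Verify section's expectation.
            {"names": names, "action": "SCMP_ACT_ERRNO", "errnoRet": 1},
        ],
    }


# Example: the starter list plus one workload-specific denial.
profile = build_profile(extra_denied=["chmod"])
print(json.dumps(profile, indent=2))
```

Write the JSON to an absolute path and pass that path to --seccomp-profile.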

Opt out (audited)¶

autonomy ros2 run --no-seccomp \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    launch demo_robot arm_demo.launch.py

--no-seccomp emits a loud stderr warning containing the literal Audit event: seccomp-opt-out plus a WAL marker (markers.WALPass("seccomp-opt-out", ...)). Operator log scrapers grep for that literal suffix; the audit pipeline records the WAL frame. Both are part of the flag's contract: a future “softening” edit fails CI via the TestResolveSeccompProfile_OptOut drift guard.

Verify¶

The workload exits with Operation not permitted (EPERM, errno 1) when it attempts a denied syscall. Check stderr for the kernel-side denial signal. To prove the profile is actually applied, the hermetic test runtime/exec/seccomp_integration_test.go runs a workload that calls chmod under a profile that denies exactly that syscall, then asserts the exit code.

Mutual-exclusion guard¶

--no-seccomp and --seccomp-profile are mutually exclusive — the runtime errors before any side effect rather than silently preferring one. --no-seccomp is also refused on the plain-subprocess path and on ros2.launch --force-native (no container to apply the profile to, no audit frame to record the opt-out — see Failure modes).
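The guard's decision order can be sketched as follows. This is an illustrative Python model of the rules stated above, not the runtime's Go implementation. The names resolve_seccomp, SeccompFlagError, and the EMBEDDED_STARTER placeholder are assumptions made for the example.

```python
import posixpath

# Placeholder standing in for wherever the embedded starter is materialised.
EMBEDDED_STARTER = "<embedded-starter-path>"


class SeccompFlagError(ValueError):
    """Raised before any side effect when the flag combination is invalid."""


def resolve_seccomp(profile=None, no_seccomp=False, container=True):
    """Return the profile path to apply, or None when no profile is applied."""
    if profile and no_seccomp:
        raise SeccompFlagError("--no-seccomp and --seccomp-profile are mutually exclusive")
    if not container:
        if no_seccomp:
            raise SeccompFlagError("--no-seccomp cannot be used with native dispatch")
        if profile:
            raise SeccompFlagError("--seccomp-profile requires the container execution path")
        return None  # native path: nothing to apply
    if no_seccomp:
        return None  # audited opt-out; the caller would emit the warning + WAL marker
    if profile is None:
        return EMBEDDED_STARTER  # default-on
    if not posixpath.isabs(profile):
        raise SeccompFlagError("--seccomp-profile path must be absolute")
    return profile
```

Every invalid combination raises before a profile path is returned, which mirrors the "errors before any side effect" behaviour described above.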

Phase 2 — `dlopen` allowlist¶

The LD_PRELOAD shim’s dlopen() wrapper denies any .so whose absolute path is not in AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST. This is the untrusted C++ plugin mitigation: ROS 2 component loading, any workload that hosts third-party .so plugins, etc.

The allowlist is sourced from the workload bundle’s v1.5 manifest dlopen_allowlist block:

{
  "schema_version": "1.5",
  "dlopen_allowlist": {
    "paths": [
      "/usr/lib/x86_64-linux-gnu/libstdc++.so.6",
      "/opt/ros/humble/lib/librclcpp.so",
      "..."
    ]
  }
}

Enable¶

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --dlopen-allowlist-from-bundle /var/lib/autonomy/bundles/my-workload.tar \
    launch demo_robot arm_demo.launch.py

The CLI loads the manifest, joins the paths with : (PATH-style), and sets AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST on the workload container env. The shim wrapper enforces.
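To preview the value the container will receive, the join can be reproduced offline. The sketch below assumes only the manifest shape shown above; dlopen_env_from_manifest is a hypothetical helper name, and the real loader (which reads the manifest out of the bundle tar) may apply extra validation.

```python
import json


def dlopen_env_from_manifest(manifest_text):
    """Join dlopen_allowlist.paths with ':' (PATH-style), as the CLI is described to do."""
    manifest = json.loads(manifest_text)
    paths = manifest.get("dlopen_allowlist", {}).get("paths", [])
    return ":".join(paths)


example = """
{
  "schema_version": "1.5",
  "dlopen_allowlist": {
    "paths": [
      "/usr/lib/x86_64-linux-gnu/libstdc++.so.6",
      "/opt/ros/humble/lib/librclcpp.so"
    ]
  }
}
"""
print("AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST=" + dlopen_env_from_manifest(example))
```

A manifest without a dlopen_allowlist block yields an empty string here, which under the default-allow contract means the shim passes every load through.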

Verify¶

Stderr lines from the workload when a non-allowlisted .so is attempted:

autonomy-preload: dlopen(/usr/lib/x86_64-linux-gnu/libnsl.so.1) denied by AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST

dlopen() returns NULL (libc convention) + errno=EPERM. The workload’s existing “library not available” error path sees it as a normal load failure; the stderr log distinguishes “denied by policy” from “library not present”.

Customize directly (without a bundle)¶

autonomy ros2 run does not expose a generic env-passthrough flag, so the AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST env var can’t be set directly through it. The supported public surface for dlopen-allowlist injection is --dlopen-allowlist-from-bundle <bundle> (production path) or docker run directly (dev / ad-hoc path).

For dev iteration without packaging a bundle, run the canonical adk-ros2-runtime image with docker run:

docker run --rm -it \
    -e LD_PRELOAD=/usr/local/lib/libautonomy_preload.so \
    -e AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST=/lib/x86_64-linux-gnu/libc.so.6:/lib/aarch64-linux-gnu/libc.so.6 \
    ghcr.io/autonomyops/adk-ros2-runtime:latest \
    ros2 launch demo_robot arm_demo.launch.py

Once the allowlist is stable, promote it into the bundle’s manifest v1.5 dlopen_allowlist.paths block and run through autonomy ros2 run --dlopen-allowlist-from-bundle (see Enable above) for the audited, policy-tied production path.

Default-allow contract¶

The shim is opt-in defense-in-depth: empty or unset env = pass-through. A workload running with the shim baked in but without AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST set behaves as if the wrapper weren’t there. This is intentional — default-deny on unset would brick every workload that loaded the shim before declaring its dlopen surface.

Phase 3 — Egress destination gating¶

The shim’s connect() wrapper denies any IPv4 or IPv6 destination not in AUTONOMY_PRELOAD_CONNECT_ALLOWLIST. Loopback is always allowed regardless (/v1/tool lives on loopback and would be bricked otherwise): IPv4 127.0.0.0/8, IPv6 ::1, and IPv4-mapped IPv6 ::ffff:127.0.0.0/104 (for dual-stack workloads that connect to 127.0.0.1 through an IPv6 socket). AF_UNIX local IPC sockets continue to pass through unconditionally.

This is the user-space (libc-layer) complement to the application-layer egress DLP scanner from #710: DLP scans payload content; this gates the destination set.

Allowlist entry forms (five accepted shapes, comma-separated, NO_PROXY-style):

form	example	meaning
`<ipv4>:<port>`	`10.0.0.5:443`	exact IPv4 + port
`<ipv4>/<prefix>:<port>`	`10.42.0.0/16:8080`	IPv4 CIDR + port
`[<ipv6>]:<port>`	`[fe80::1]:443`	exact IPv6 + port (RFC 3986 bracket form)
`[<ipv6>/<prefix>]:<port>`	`[2001:db8::/32]:443`	IPv6 CIDR + port — prefix INSIDE brackets
`<any-of-above>:*`	`10.0.0.5:*`	any port for that address / CIDR

Enable¶

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --ld-preload /usr/local/lib/libautonomy_preload.so \
    --preload-connect-allowlist '10.42.0.5:443,10.42.0.0/16:8080,[fe80::1]:443,[2001:db8::/32]:*' \
    launch demo_robot arm_demo.launch.py

The CLI validates each entry before any side effect (rejects empty, whitespace, malformed prefix, IPv6-missing-brackets, unparseable address, out-of-range port).
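The entry grammar and the always-allow loopback rule can be modelled in a few lines. This is a sketch under stated assumptions, not the shim's C code or the CLI's validator: parse_entry and is_allowed are hypothetical names, the accepted port range is assumed to be 1-65535, and non-loopback IPv4-mapped addresses are matched as plain IPv6.

```python
import ipaddress


def parse_entry(entry):
    """Parse one allowlist entry into (network, port); port None means ':*'."""
    if not entry or any(ch.isspace() for ch in entry):
        raise ValueError(f"empty or whitespace in entry: {entry!r}")
    host, sep, port_text = entry.rpartition(":")
    if not sep or not host:
        raise ValueError(f"missing port: {entry!r}")
    if host.startswith("["):
        if not host.endswith("]"):
            raise ValueError(f"unterminated bracket: {entry!r}")
        net = ipaddress.IPv6Network(host[1:-1], strict=False)
    else:
        # Unbracketed IPv6 lands here and is rejected as an invalid IPv4 address.
        net = ipaddress.IPv4Network(host, strict=False)
    if port_text == "*":
        return net, None
    if not (port_text.isascii() and port_text.isdigit()):
        raise ValueError(f"bad port: {entry!r}")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {entry!r}")
    return net, port


def is_loopback(addr):
    """True for 127.0.0.0/8, ::1, and IPv4-mapped loopback."""
    if addr.version == 6 and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped
    return addr.is_loopback


def is_allowed(dest_ip, dest_port, rules):
    """Loopback always passes; otherwise some rule must match address and port."""
    addr = ipaddress.ip_address(dest_ip)
    if is_loopback(addr):
        return True
    return any(
        net.version == addr.version and addr in net and port in (None, dest_port)
        for net, port in rules
    )


flag = "10.42.0.5:443,10.42.0.0/16:8080,[fe80::1]:443,[2001:db8::/32]:*"
rules = [parse_entry(e) for e in flag.split(",")]
print(is_allowed("8.8.8.8", 53, rules))
```

Running a candidate flag value through a model like this before deployment is a cheap way to catch a missing bracket or an over-narrow port.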

Verify¶

Stderr lines from the workload when a non-allowlisted destination is attempted:

autonomy-preload: connect(8.8.8.8:53) denied by AUTONOMY_PRELOAD_CONNECT_ALLOWLIST
autonomy-preload: connect([2606:4700:4700::1111]:443) denied by AUTONOMY_PRELOAD_CONNECT_ALLOWLIST

connect() returns -1 + errno=EPERM. The workload sees this as the standard connect failure; the stderr log distinguishes “denied by policy” from “network unreachable / connection refused”.

Scope notes¶

IPv6 is gated post-#960 (#983). The Phase 3 v1 contract had AF_INET6 pass through unconditionally; operators upgrading existing IPv4-only configs are unaffected but workloads using IPv6 (DDS over IPv6, MAVLink-UDP6, REST against IPv6 hosts) must add explicit entries.
CIDR + :* wildcards ship with the IPv6 extension (#983). Use CIDR for whole-subnet allowances; use :* when a destination’s port is ephemeral (DDS dynamic ports, etc.).
Hostnames not supported. DNS resolution at every connect() would be slow, introduce a DNS-poisoning bypass, and add in-process attack surface. Operators pre-resolve at config time or use CIDR for whole-subnet allowances.
Loopback always allowed. Locking down loopback is not currently supported; it would require a future, stricter flag.

Phase 4 — `execve` / `execveat` allowlist¶

The shim’s execve() + execveat() wrappers deny any binary path not in AUTONOMY_PRELOAD_EXEC_ALLOWLIST. This is subprocess-spawn discrimination by path: seccomp can deny execve as a syscall but cannot statically check the path argument.

execveat() closes the sibling-syscall bypass: a workload that called open(path) to obtain an fd and then execveat(fd, "", AT_EMPTY_PATH) would escape an execve-only wrapper. The execveat wrapper resolves the fd via /proc/self/fd/<dirfd> (kernel-canonical, symlinks already resolved) before allowlist matching.

Enable¶

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --ld-preload /usr/local/lib/libautonomy_preload.so \
    --preload-exec-allowlist /usr/bin/ros2,/usr/bin/python3,/bin/echo \
    launch demo_robot arm_demo.launch.py

Entries are absolute paths, comma-separated. Joined into colon-separated wire format on the env (PATH-style) before injection.
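The flag-to-env conversion described above is small enough to sketch. exec_allowlist_env is a hypothetical helper, and rejecting entries that contain a colon is an assumption made here because such a path would be ambiguous in the colon-separated wire format; the CLI's actual checks may differ.

```python
import posixpath


def exec_allowlist_env(flag_value):
    """Convert the comma-separated flag value into the colon-separated env value."""
    entries = flag_value.split(",")
    for entry in entries:
        if not posixpath.isabs(entry):
            raise ValueError(f"entry is not an absolute path: {entry!r}")
        if ":" in entry:
            # Assumption: a ':' inside a path cannot survive the PATH-style join.
            raise ValueError(f"entry contains ':': {entry!r}")
    return ":".join(entries)


print(exec_allowlist_env("/usr/bin/ros2,/usr/bin/python3,/bin/echo"))
```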

Verify¶

Stderr lines from the workload when a non-allowlisted binary is attempted:

autonomy-preload: execve(/bin/sh) denied by AUTONOMY_PRELOAD_EXEC_ALLOWLIST
autonomy-preload: execveat(/bin/sh) denied by AUTONOMY_PRELOAD_EXEC_ALLOWLIST

execve / execveat returns -1 + errno=EPERM. The workload sees this as a standard exec failure; the stderr log names which sibling variant was attempted.

Resolution failures (execveat)¶

If execveat() is called with a malformed (dirfd, pathname, flags) tuple that can’t be resolved to an absolute path (empty pathname without AT_EMPTY_PATH, unresolvable fd, etc.), the wrapper denies the call BEFORE reaching the allowlist check + emits a separate log line so the operator can distinguish “denied by allowlist” from “denied because we couldn’t resolve to check”:

autonomy-preload: execveat(dirfd=-100, pathname="", flags=0x0) denied — could not resolve to absolute path for allowlist check

Phase 5a — Linux capability drop¶

--cap-drop <NAME> drops a capability from the workload container’s effective set. Repeat or comma-separate; common entries:

NET_RAW — packet sockets (AF_PACKET, SOCK_RAW). Drop unless the workload legitimately sniffs raw frames.
SYS_PTRACE — ptrace + process_vm_readv. Drop unless the workload is a debugger.
SYS_ADMIN — wide grab bag (mount, set hostname, bpf_load_program, …). Drop unless the workload genuinely needs filesystem mounts (rare).
ALL — drop every capability docker normally grants. The workload then inherits exactly what its uid + the kernel defaults allow.

Enable¶

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --cap-drop NET_RAW,SYS_PTRACE \
    launch demo_robot arm_demo.launch.py

The CLI validates each capability name against a known-good list before rendering the docker spawn — docker itself accepts unknown names silently, leaving the cap set wider than intended. A typo fails loudly at the autonomy layer.
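A pre-flight check of the same kind can be run in CI before a config change ships. The sketch below is an assumption-laden model: the capability names come from capabilities(7), ALL is accepted as described above, and the strict upper-case, no-CAP_-prefix spelling is a choice made for this example rather than the CLI's documented behaviour.

```python
# Capability names from capabilities(7), without the CAP_ prefix, plus ALL.
KNOWN_CAPS = {
    "ALL", "AUDIT_CONTROL", "AUDIT_READ", "AUDIT_WRITE", "BLOCK_SUSPEND", "BPF",
    "CHECKPOINT_RESTORE", "CHOWN", "DAC_OVERRIDE", "DAC_READ_SEARCH", "FOWNER",
    "FSETID", "IPC_LOCK", "IPC_OWNER", "KILL", "LEASE", "LINUX_IMMUTABLE",
    "MAC_ADMIN", "MAC_OVERRIDE", "MKNOD", "NET_ADMIN", "NET_BIND_SERVICE",
    "NET_BROADCAST", "NET_RAW", "PERFMON", "SETFCAP", "SETGID", "SETPCAP",
    "SETUID", "SYSLOG", "SYS_ADMIN", "SYS_BOOT", "SYS_CHROOT", "SYS_MODULE",
    "SYS_NICE", "SYS_PACCT", "SYS_PTRACE", "SYS_RAWIO", "SYS_RESOURCE",
    "SYS_TIME", "SYS_TTY_CONFIG", "WAKE_ALARM",
}


def validate_cap_drop(flag_values):
    """Flatten repeated / comma-separated --cap-drop values and reject unknown names."""
    caps = []
    for value in flag_values:
        for name in value.split(","):
            if name not in KNOWN_CAPS:
                raise ValueError(f"unknown capability: {name!r}")
            if name not in caps:
                caps.append(name)
    return caps


print(validate_cap_drop(["NET_RAW,SYS_PTRACE", "SYS_ADMIN"]))
```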

Verify¶

The workload’s attempt at the dropped capability returns EPERM. For NET_RAW:

int sock = socket(AF_PACKET, SOCK_RAW, htons(0x0003));
// sock < 0; errno = EPERM

Phase 5b — Read-only rootfs + tmpfs¶

--read-only-rootfs mounts the workload container’s rootfs read-only (docker --read-only). The workload’s writes to any non-mounted path return EROFS. The only writable path is what the operator declares via --tmpfs <path>:

--tmpfs /tmp — common: any workload that writes scratch to /tmp.
--tmpfs /var/log/myapp — workload-specific log dir.
Repeat or comma-separate for multiple mounts.

Enable¶

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --read-only-rootfs --tmpfs /tmp \
    launch demo_robot arm_demo.launch.py

Verify¶

Write outside /tmp:

docker exec -it <workload> sh -c 'echo x > /etc/foo'
# sh: 1: cannot create /etc/foo: Read-only file system

Write inside /tmp:

docker exec -it <workload> sh -c 'echo x > /tmp/foo && cat /tmp/foo'
# x

Why this matters with the shim¶

--read-only-rootfs defeats the symlink-redirection bypass for the execve/execveat allowlist: an attacker who could plant a symlink at an allowlisted absolute path to a denied binary would escape the wrapper (the wrapper trusts absolute paths verbatim — execveat’s /proc/self/fd branch canonicalizes, but the absolute-path branch does not). With --read-only-rootfs, the attacker can’t plant the symlink in the first place.
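The bypass is easy to reproduce in a scratch directory. The following Python sketch models a verbatim path match (what the absolute-path branch is described as doing) next to a canonicalizing match. Both functions are illustrative stand-ins, not the shim's code, and the file names are arbitrary.

```python
import os
import tempfile


def verbatim_allowed(path, allowlist):
    """Model of a verbatim check: the path string itself must be allowlisted."""
    return path in allowlist


def canonical_allowed(path, allowlist):
    """Model of a canonicalizing check: symlinks are resolved before matching."""
    return os.path.realpath(path) in allowlist


with tempfile.TemporaryDirectory() as root:
    root = os.path.realpath(root)
    denied_binary = os.path.join(root, "sh")
    allowlisted_path = os.path.join(root, "echo")
    open(denied_binary, "w").close()
    # The attacker plants a symlink at the allowlisted path, pointing at the denied binary.
    os.symlink(denied_binary, allowlisted_path)

    allowlist = {allowlisted_path}
    verbatim_result = verbatim_allowed(allowlisted_path, allowlist)
    canonical_result = canonical_allowed(allowlisted_path, allowlist)

print("verbatim check passes:", verbatim_result)
print("canonical check passes:", canonical_result)
```

The verbatim check accepts the planted symlink while the canonical check rejects it. With a read-only rootfs the os.symlink step itself would fail, which is the mitigation this section relies on.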

Phase 6 — Bypass-resistance composition test¶

Hermetic integration test that proves all 5 layers compose without conflict + collectively resist a workload actively trying to escape. Mirrors the bypass-resistance test design from #938 3-D.2 / #953 (positive control + denial + liveness gate).

Source: runtime/exec/composition_integration_test.go.

What the test does¶

Builds a hermetic test image at test-time (debian:bookworm-slim plus the shim plus a deliberately-malicious workload binary). Spawns the container with all 5 layers active simultaneously. Asserts:

6 attack vectors all DENIED with the exact expected reason (per-vector EPERM from shim/seccomp/cap-drop, or EROFS from rootfs-RO).
2 positive controls (/tmp write, loopback connect) both OK — proves the workload isn’t bricked by the hardening.
Both liveness markers (WORKLOAD_START, WORKLOAD_END) present — catches the false-pass on “container never ran”.
All 3 shim canonical log lines (autonomy-preload: execve(...), autonomy-preload: dlopen(...), autonomy-preload: connect(...)) present in stderr — proves the shim was the agent of denial, not a kernel/libc error of similar shape.
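The liveness and shim-attribution assertions can be sketched outside the Go test. The Python below is a model of those two checks only, assuming the markers and shim lines arrive in one combined output stream; check_run is a hypothetical name and the real test in composition_integration_test.go asserts more than this.

```python
LIVENESS_MARKERS = ("WORKLOAD_START", "WORKLOAD_END")
SHIM_PREFIXES = (
    "autonomy-preload: execve(",
    "autonomy-preload: dlopen(",
    "autonomy-preload: connect(",
)


def check_run(output):
    """Return a list of problems; an empty list means both gates passed."""
    lines = output.splitlines()
    problems = []
    for marker in LIVENESS_MARKERS:
        # Guards against the false pass where the container never ran.
        if not any(marker in line for line in lines):
            problems.append(f"missing liveness marker {marker}")
    for prefix in SHIM_PREFIXES:
        # Proves the shim, not a look-alike kernel or libc error, denied the call.
        if not any(line.startswith(prefix) for line in lines):
            problems.append(f"missing shim log line {prefix}...)")
    return problems


sample = "\n".join([
    "WORKLOAD_START",
    "autonomy-preload: execve(/bin/sh) denied by AUTONOMY_PRELOAD_EXEC_ALLOWLIST",
    "autonomy-preload: dlopen(/usr/lib/x86_64-linux-gnu/libnsl.so.1) denied by AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST",
    "autonomy-preload: connect(8.8.8.8:53) denied by AUTONOMY_PRELOAD_CONNECT_ALLOWLIST",
    "WORKLOAD_END",
])
print(check_run(sample))
```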

Running it¶

The test is gated on linux + docker daemon + !testing.Short():

go test ./runtime/exec/ -run TestComposition_BypassResistance -v -timeout 300s

~20 seconds end-to-end on a warm-cache host. Skips automatically on non-Linux or when docker is unreachable.

C++ workloads — deriving the `dlopen` allowlist from audit traffic¶

C++ workloads — and ROS 2 components / pluginlib consumers in particular — share a workflow problem the runbook’s Phase 2 section can’t solve in the abstract: an rclcpp_components container or a pluginlib-backed node can pull in dozens of .so files during composition, and the operator usually doesn’t know the full set ahead of time. Writing a manifest dlopen_allowlist by hand against a workload like that is a guess-then-debug loop unless you build the list from observed loads.

The post-#960 5/5 hardening slate added the two pieces the C++ workflow needs:

AUTONOMY_PRELOAD_DLOPEN_AUDIT (#985) — opt-in forensics trail that emits one line per dlopen() / dlmopen() call to stderr with the actual libc outcome (loaded, load_failed, denied). This is the input to the allowlist-derivation workflow below.
dlmopen() sibling-bypass closure (#982) — the same AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST env gates both dlopen() and dlmopen(). C++ ROS 2 plugin loaders that use dlmopen() for per-component link-map namespace isolation (some pluginlib configurations, custom component containers) are covered without a separate config surface.

Workflow: baseline → allowlist → enforce¶

The derivation is a three-step loop (Steps 1-3), followed by a one-time promotion into the bundle manifest (Step 4). Run it once per workload; re-run after any plugin-set change.

Step 1 — Capture the load baseline¶

Run the workload without an allowlist set + with audit on. The shim passes every dlopen() / dlmopen() through (default-allow on unset allowlist; see Phase 2) but logs every call with the libc outcome.

autonomy ros2 run does not expose a generic env-passthrough flag, so the AUTONOMY_PRELOAD_DLOPEN_AUDIT env var has no canonical CLI flag — use docker run directly against the canonical adk-ros2-runtime image for the derivation loop:

docker run --rm \
    -e LD_PRELOAD=/usr/local/lib/libautonomy_preload.so \
    -e AUTONOMY_PRELOAD_DLOPEN_AUDIT=1 \
    ghcr.io/autonomyops/adk-ros2-runtime:latest \
    ros2 launch your-package your-launch.py \
    2>&1 | tee /tmp/dlopen-audit.log

Let the workload run through the operational phases you want to sandbox (component composition, plugin registration, the actual work, shutdown). Stop it cleanly so late-bound dlopen()s in shutdown handlers are captured too.

Why docker run and not autonomy ros2 run? The audit env var is opt-in observability without a dedicated CLI flag; the production hardening flags (--ld-preload, --dlopen-allowlist-from-bundle, --preload-exec-allowlist, --preload-connect-allowlist) cover the supported policy surface. The audit-derivation loop is a dev-time workflow; once the allowlist is stable, Step 4 promotes the result back to the canonical autonomy ros2 run surface.

Step 2 — Extract `result=loaded` paths¶

The audit lines tell you what actually came into memory. The allowlist derivation uses only result=loaded entries and never load_failed ones: those libraries never ended up in memory, so including them would cargo-cult paths the workload doesn’t actually need:

grep '^autonomy-preload-audit: dlopen' /tmp/dlopen-audit.log \
  | grep 'result=loaded' \
  | sed -E 's/.*filename="([^"]+)".*/\1/' \
  | sort -u > /tmp/dlopen-allowlist.txt

# Also include dlmopen loads:
grep '^autonomy-preload-audit: dlmopen' /tmp/dlopen-audit.log \
  | grep 'result=loaded' \
  | sed -E 's/.*filename="([^"]+)".*/\1/' \
  | sort -u >> /tmp/dlopen-allowlist.txt

sort -u -o /tmp/dlopen-allowlist.txt /tmp/dlopen-allowlist.txt

Inspect the result — it should look like the operator’s mental model of the workload’s plugin surface. ROS 2 component container baselines typically run 30-80 entries; pure pluginlib nodes 10-30; plain C++ binaries with vendor SDKs 5-15.
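The same extraction can be done in one pass with a script, which is easier to extend than the grep/sed pipeline. This is a sketch, assuming each audit line carries filename="..." followed by result=... inside the parentheses. The dlmopen shape is the one documented in this runbook, while the dlopen shape in the sample is assumed to mirror it without the lmid field.

```python
import re

# Matches dlopen and dlmopen audit lines; captures the filename and the result.
AUDIT_RE = re.compile(
    r'^autonomy-preload-audit: (?:dlopen|dlmopen)\(.*filename="([^"]+)".*result=(\w+)\)'
)


def loaded_paths(log_text):
    """Return the sorted, de-duplicated filenames whose audit result is 'loaded'."""
    paths = set()
    for line in log_text.splitlines():
        match = AUDIT_RE.match(line)
        if match and match.group(2) == "loaded":
            paths.add(match.group(1))
    return sorted(paths)


sample_log = "\n".join([
    'autonomy-preload-audit: dlopen(filename="/opt/ros/jazzy/lib/librcl.so", flags=0x1, result=loaded)',
    'autonomy-preload-audit: dlopen(filename="/opt/vendor/libmissing.so", flags=0x1, result=load_failed)',
    'autonomy-preload-audit: dlmopen(lmid=-1, filename="/opt/plugins/libarm.so", flags=0x1, result=loaded)',
    'autonomy-preload-audit: dlopen(filename="/opt/ros/jazzy/lib/librcl.so", flags=0x1, result=loaded)',
    "unrelated workload output",
])
print("\n".join(loaded_paths(sample_log)))
```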

Step 3 — Enforce + verify (dev iteration)¶

For the dev derivation loop, inject the allowlist via docker run with the audit trail still on so you can confirm every previously-loaded lib is still result=loaded:

docker run --rm \
    -e LD_PRELOAD=/usr/local/lib/libautonomy_preload.so \
    -e AUTONOMY_PRELOAD_DLOPEN_AUDIT=1 \
    -e AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST="$(paste -sd: /tmp/dlopen-allowlist.txt)" \
    ghcr.io/autonomyops/adk-ros2-runtime:latest \
    ros2 launch your-package your-launch.py \
    2>&1 | tee /tmp/dlopen-enforce.log

Verify:

# Every previously-loaded lib is still result=loaded:
grep 'result=loaded' /tmp/dlopen-enforce.log | wc -l

# Zero shim denials — if the workload's plugin set is stable,
# the enforcement run should not surface any:
grep 'denied by AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST' /tmp/dlopen-enforce.log

If denied by lines appear, the workload added a load between baseline + enforcement runs — either re-run the baseline capture or add the missing path. The autonomy-preload-audit: line for each denial says result=denied so you have the path verbatim.
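Counting result=loaded lines shows volume but not which library regressed. A set comparison between the two logs answers that directly. The sketch below assumes the audit line shape used elsewhere in this runbook (filename="..." then result=... inside the parentheses); by_result and compare_runs are hypothetical helper names.

```python
import re

AUDIT_RE = re.compile(
    r'^autonomy-preload-audit: dlm?open\(.*filename="([^"]+)".*result=(\w+)\)'
)


def by_result(log_text):
    """Group audited filenames by their result value."""
    groups = {}
    for line in log_text.splitlines():
        match = AUDIT_RE.match(line)
        if match:
            groups.setdefault(match.group(2), set()).add(match.group(1))
    return groups


def compare_runs(baseline_log, enforce_log):
    """Report libs loaded in the baseline but not under enforcement, plus denials."""
    baseline_loaded = by_result(baseline_log).get("loaded", set())
    enforce = by_result(enforce_log)
    return {
        "missing": sorted(baseline_loaded - enforce.get("loaded", set())),
        "denied": sorted(enforce.get("denied", set())),
    }


baseline = "\n".join([
    'autonomy-preload-audit: dlopen(filename="/a.so", flags=0x1, result=loaded)',
    'autonomy-preload-audit: dlopen(filename="/b.so", flags=0x1, result=loaded)',
])
enforce = "\n".join([
    'autonomy-preload-audit: dlopen(filename="/a.so", flags=0x1, result=loaded)',
    'autonomy-preload-audit: dlopen(filename="/c.so", flags=0x1, result=denied)',
])
print(compare_runs(baseline, enforce))
```

An empty missing list together with an empty denied list is the signal that the allowlist is ready for promotion.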

Step 4 — Promote to bundle manifest (production)¶

Once the derived list is stable, fold it into the workload bundle’s manifest.json — schema v1.5 introduces the dlopen_allowlist block. The bundle directory layout the loader expects is <bundle-dir>/manifest.json + <bundle-dir>/policies/…; autonomy bundle pack then tars the directory into a single .tar for --dlopen-allowlist-from-bundle / --policy consumption (see bundle/manifest.go for the schema struct and cmd/autonomy/commands/dlopen_allowlist.go for the loader):

{
  "kind": "bundle",
  "schema_version": "1.5",
  "name": "your-workload",
  "version": "0.1.0",
  "channel": "dev",
  "min_adk_version": "1.0",
  "dlopen_allowlist": {
    "paths": [
      "/lib/x86_64-linux-gnu/libc.so.6",
      "/lib/x86_64-linux-gnu/libdl.so.2",
      "/opt/ros/jazzy/lib/librcl.so"
    ]
  }
}

Each path must be absolute and end in .so or .so followed by a dotted numeric version (.so.1, .so.42.3); globs are not supported in v1.5 (every entry is exact-match).
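A derived list can be linted against these rules before it goes into the manifest. This is a sketch of the stated constraints only; lint_dlopen_paths is a hypothetical helper, the glob-character check is an assumption about what "globs" covers, and the loader in cmd/autonomy/commands/dlopen_allowlist.go remains the authority.

```python
import re

# ".so" optionally followed by dotted numeric version components.
SO_SUFFIX_RE = re.compile(r"\.so(\.\d+)*$")
GLOB_CHARS = set("*?[")  # assumption: these mark an entry as a glob


def lint_dlopen_paths(paths):
    """Return (path, reason) pairs for entries that break the v1.5 rules."""
    problems = []
    for path in paths:
        if not path.startswith("/"):
            problems.append((path, "not absolute"))
        elif GLOB_CHARS & set(path):
            problems.append((path, "globs are not supported"))
        elif not SO_SUFFIX_RE.search(path):
            problems.append((path, "must end in .so or .so.<version>"))
    return problems


candidates = [
    "/lib/x86_64-linux-gnu/libc.so.6",
    "/opt/ros/jazzy/lib/librcl.so",
    "/opt/vendor/libcodec.so.42.3",
    "lib/relative.so",
    "/opt/ros/jazzy/lib/*.so",
    "/usr/bin/python3",
]
for path, reason in lint_dlopen_paths(candidates):
    print(f"{path}: {reason}")
```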

Production runs then use the canonical autonomy ros2 run surface — --dlopen-allowlist-from-bundle (explicit) or --policy (auto-source from the same v1.5 manifest, Phase 2b-3 #980):

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --ld-preload /usr/local/lib/libautonomy_preload.so \
    --policy ./your-bundle.tar \
    launch your-package your-launch.py

Reading the audit trail¶

`result=`	meaning	when investigating
`loaded`	allowlist passed AND libc returned a non-NULL handle — the `.so` is in memory	Expected; include in derived allowlist
`load_failed`	allowlist passed BUT libc returned NULL (bad ELF, missing transitive dep, namespace conflict, ABI drift)	Investigate — the workload tried to load but the file couldn’t be brought in; correlate by `filename` + timestamp with the workload’s own `dlerror()` stderr
`denied`	allowlist rejected; libc never called	Expected during enforcement runs only — appears alongside the canonical `autonomy-preload: dlopen(...) denied by ...` line

load_failed is the most operator-useful signal. The shim does not consume dlerror() (that would clear the error buffer and break the caller’s own diagnostic), so the workload’s own log line continues to carry the human-readable reason — correlate the audit line and the workload line by timestamp + filename to attribute the failure.

`dlmopen` and link-map namespaces¶

If your workload uses dlmopen() for per-component isolation, the audit line shape includes the link-map namespace argument:

autonomy-preload-audit: dlmopen(lmid=-1, filename="...", flags=0x1, result=loaded)

lmid=-1 is LM_ID_NEWLM (the new-namespace marker — most pluginlib and ROS 2 component-container configurations use this). lmid=0 is LM_ID_BASE (the main program’s namespace). Operator-supplied Lmid_t values render as their numeric namespace ID — useful when investigating namespace-leakage / cross-namespace ABI drift.
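When triaging namespace issues it helps to turn the dlmopen audit lines into structured records. The sketch below parses the line shape shown above and labels the two well-known glibc namespace IDs; parse_dlmopen is a hypothetical helper, and any other lmid is reported by its numeric value.

```python
import re

DLMOPEN_RE = re.compile(
    r'^autonomy-preload-audit: dlmopen\(lmid=(-?\d+), filename="([^"]*)", '
    r"flags=(0x[0-9a-fA-F]+), result=(\w+)\)$"
)
# glibc constants: LM_ID_BASE is 0, LM_ID_NEWLM is -1.
LMID_NAMES = {0: "LM_ID_BASE", -1: "LM_ID_NEWLM"}


def parse_dlmopen(line):
    """Return a dict for a dlmopen audit line, or None if the line does not match."""
    match = DLMOPEN_RE.match(line)
    if match is None:
        return None
    lmid = int(match.group(1))
    return {
        "lmid": lmid,
        "namespace": LMID_NAMES.get(lmid, f"namespace {lmid}"),
        "filename": match.group(2),
        "flags": int(match.group(3), 16),
        "result": match.group(4),
    }


record = parse_dlmopen(
    'autonomy-preload-audit: dlmopen(lmid=-1, filename="/opt/plugins/libarm.so", flags=0x1, result=loaded)'
)
print(record)
```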

Common patterns to watch for¶

Workload shape	Typical audit entries	Notes
`rclcpp_components` container	30-80 `.so` from `/opt/ros/<distro>/lib/`, `/usr/lib/<arch>/`	Vendor packages (DDS implementations, message types) dominate
`pluginlib`-backed node	10-30 `.so`; mix of `lmid=0` and `lmid=-1`	Custom pluginlib loaders may use `dlmopen` for ABI isolation
Plain C++ binary with vendor SDK	5-15 `.so`; vendor lib + its transitive deps	Vendor SDKs often `dlopen` their own optional modules at runtime
MAVLink / GStreamer / OpenCV	10-40 `.so`; many late-bound	Codec backends + protocol modules selected at runtime — capture across multiple operational phases

Failure modes and recovery¶

`ErrHardeningRequiresContainer`¶

ros2: --seccomp-profile / --cap-drop / --read-only-rootfs / --tmpfs /
--ld-preload / --preload-exec-allowlist / --preload-connect-allowlist /
bundle-sourced dlopen_allowlist require the container execution path
(they apply via docker flags / env injection and have no analog on the
native subprocess path). The resolved mode is native, either because
--force-native was set or because Docker is unavailable.

Cause: A hardening flag was set but the dispatch resolved to native (no docker available + no --image, or --force-native). The flag has no effect on native — silent acceptance would be a “looks hardened, isn’t” trap.

Recovery: drop the flag for native dispatch, OR install docker AND pass --image <tag> for container dispatch.

`--no-seccomp` refused on native¶

--no-seccomp cannot be used with native dispatch — the native
subprocess path has no seccomp profile to opt out of, so no
warning fires and no WAL audit marker would be emitted to
record the opt-out (the flag's documented contract). Silently
accepting --no-seccomp on native would create a "looks audited,
isn't" trap.

Cause: Same shape as the previous, but for the --no-seccomp opt-out. The flag’s documented contract is “warn + WAL audit” — neither can fire on native.

Recovery: drop --no-seccomp (the native path is already running without seccomp), OR ensure docker + --image so the container path can apply the default-on profile AND honor the opt-out with the audit frame.

Workload exits with EPERM but you don’t know why¶

Capture the workload’s stderr and grep for the shim’s canonical prefix:

autonomy ros2 run ... 2>&1 | grep '^autonomy-preload:'

Every shim-mediated denial emits exactly one canonical line per denied call:

autonomy-preload: <op>(<arg>) denied by <env-var>

If no autonomy-preload: lines appear but the workload still EPERMs, the denial came from seccomp or cap-drop (kernel-layer) — check the kernel audit log:

sudo dmesg | tail -50
# or
sudo journalctl --since "5 minutes ago" | grep -i audit
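The triage rule above (shim line present means a shim denial, otherwise look at the kernel layer) can be scripted against captured stderr. This is a minimal sketch built on the canonical line shape documented in this section; summarize_denials and likely_layer are hypothetical names, and the separate execveat resolution-failure line is deliberately not matched.

```python
import re
from collections import Counter

# autonomy-preload: <op>(<arg>) denied by <env-var>
DENIAL_RE = re.compile(r"^autonomy-preload: (\w+)\((.*)\) denied by (AUTONOMY_PRELOAD_\w+)$")


def summarize_denials(stderr_text):
    """Count shim denials per (operation, env var)."""
    counts = Counter()
    for line in stderr_text.splitlines():
        match = DENIAL_RE.match(line)
        if match:
            counts[(match.group(1), match.group(3))] += 1
    return counts


def likely_layer(stderr_text):
    """Point the operator at the layer to investigate first."""
    if summarize_denials(stderr_text):
        return "shim allowlist"
    return "kernel layer (seccomp or cap-drop): check the kernel audit log"


captured = "\n".join([
    "autonomy-preload: connect(8.8.8.8:53) denied by AUTONOMY_PRELOAD_CONNECT_ALLOWLIST",
    "autonomy-preload: connect([2606:4700:4700::1111]:443) denied by AUTONOMY_PRELOAD_CONNECT_ALLOWLIST",
    "autonomy-preload: execve(/bin/sh) denied by AUTONOMY_PRELOAD_EXEC_ALLOWLIST",
])
print(summarize_denials(captured))
print(likely_layer(captured))
```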

Library not loading + `dlopen_allowlist` is set¶

If the workload tries to load a .so that’s not in the bundle’s dlopen_allowlist.paths, the shim denies the call and the workload sees dlopen() return NULL. Common symptom in ROS 2: a launched node fails with Failed to import: <plugin>.

Recovery: Add the missing path to the bundle’s dlopen_allowlist block, repackage, push the new bundle. For dev-loop iteration without re-packaging, bypass autonomy ros2 run and inject directly via docker run:

docker run --rm \
    -e LD_PRELOAD=/usr/local/lib/libautonomy_preload.so \
    -e AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST="$(paste -sd: bundle-paths.txt)" \
    ghcr.io/autonomyops/adk-ros2-runtime:latest \
    ros2 launch demo_robot arm_demo.launch.py

To audit what dlopens the workload performs on a known-good run, enable the audit trail (AUTONOMY_PRELOAD_DLOPEN_AUDIT; post-#960 #985) and parse result=loaded lines from stderr. See the C++ workloads section above for the full workflow.

Cross-references¶

Tutorial — Container hardening hands-on — walks through enabling each layer against a sample workload.
runtime/preload/README.md — shim wrapper contracts + the 3 paths to get the shim into your image.
runtime/seccomp/README.md — starter profile rationale + how to derive a workload-specific custom profile.
ROS 2 Governed Bridge runbook — the application-layer governance loop this runbook complements.
Issue #960 — the epic that filed this layer’s requirements + acceptance criteria.
Issue #961 — the follow-up epic for Python-specific runtime mediation (depends on this one for defense-in-depth).
