Container Hardening + Syscall Mediation

Audience: operators turning on, observing, or recovering the per-workload defense-in-depth layers that sit below /v1/tool — seccomp profile, LD_PRELOAD libc-wrapper shim, Linux-capability drops, read-only rootfs, and a hermetic bypass-resistance test that proves the layers compose. This is the runtime-side companion to launch-level policy mediation: the ROS 2 Governed Bridge runbook covers per-message DDS governance; this runbook covers the syscall boundary between the workload process and the kernel.

The hardening is opt-in per layer with one exception: --seccomp-profile is on by default (the binary embeds a shipped starter profile). Every other layer is operator-set per invocation. The decision to apply each layer should be made together with the threat model in Phases and threat model below.

Walking through it first? Start at Tutorial — Container hardening hands-on; it turns each layer on against a simple workload + runs the bypass-resistance test. The runbook below assumes you already know what each layer does and need to ship it.

Prerequisites

  • docker on PATH. Every layer in this runbook is applied via docker flags or container env injection and has no native-subprocess analog: the runtime fails loudly with ErrHardeningRequiresContainer rather than silently degrading (see Failure modes below).

  • ghcr.io/autonomyops/adk-ros2-runtime:<version> present locally. Pull with docker pull ghcr.io/autonomyops/adk-ros2-runtime:latest, or build from source with docker build -t ghcr.io/autonomyops/adk-ros2-runtime:local -f demo/ros2-runtime/Dockerfile . (the trailing . is the build context). The shipped image bakes the LD_PRELOAD shim at /usr/local/lib/libautonomy_preload.so (#960 Phase 4b-2). Custom images need to bake it themselves or download the per-arch release asset (see Path B in the shim README).

  • A policy bundle that declares a dlopen_allowlist block (manifest schema v1.5) if you plan to enable Phase 2; see bundle/manifest.go for the schema. Phases 1, 3, 4, and 5 do not require a manifest update.

Phases and threat model

The hardening is layered defense-in-depth: each phase closes a specific class of bypass that the others cannot statically discriminate. Operators choose how many layers to apply based on the workload’s trust level.

| Phase | Layer | Closes | Default |
| --- | --- | --- | --- |
| 1 | --seccomp-profile <path> | Container-escape primitives the kernel sees regardless of path (mount, setns, unshare, ptrace, kexec_*, bpf, perf_event_open, …) | on (embedded starter profile applied unless --no-seccomp is set) |
| 2 | --dlopen-allowlist-from-bundle <path> + LD_PRELOAD shim | Untrusted C++ / ROS 2 plugin loading via dlopen() outside an allowlist | off |
| 3 | --preload-connect-allowlist <ip>:<port>,… + LD_PRELOAD shim | Egress destinations at the user-space layer (complements the egress DLP scanner from #710: that gates payload content; this gates the destination set) | off |
| 4 | LD_PRELOAD shim (--ld-preload) + --preload-exec-allowlist | Subprocess spawn outside an allowlist (execve, execveat); path-name discrimination that seccomp can’t do statically | off |
| 5a | --cap-drop <CAP> | Linux capabilities the workload doesn’t need (NET_RAW, SYS_PTRACE, SYS_ADMIN, …) | off |
| 5b | --read-only-rootfs + --tmpfs <path> | Workload’s writes to the container rootfs; only the explicit --tmpfs mount is writable | off |
| 6 | runtime/exec/composition_integration_test.go | Proof that all layers compose without conflict against a deliberately-malicious workload | hermetic test |

What this does NOT do

  • Does not replace /v1/tool. Application-layer policy mediation remains the primary trust boundary. The syscall layer is defense-in-depth so a compromised process can’t escape /v1/tool by calling libc directly.

  • Does not gate trusted runtime processes (the Go runtime itself, the orchestrator). Only the workload subprocess autonomy run spawns.

  • Does not require an LSM (AppArmor, SELinux). The baseline is portable to any docker/podman host. AppArmor profiles are a follow-up if specific deployments need them.

  • Does not provide language-level interception for Python (sys.meta_path, monkey-patched subprocess, etc.). That’s the companion epic #961 — depends on this one for defense-in-depth.

Phase 1 — Seccomp profile (default-on)

The CLI applies the embedded starter profile to the workload container unless --no-seccomp is set. The starter denies the 21 syscalls that fall outside any normal application’s needs: mount, umount, umount2, pivot_root, setns, unshare, mknod, mknodat, kexec_load, kexec_file_load, init_module, finit_module, delete_module, iopl, ioperm, swapon, swapoff, reboot, bpf, perf_event_open, ptrace.

Enable (default; nothing to do)

autonomy ros2 run --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    launch demo_robot arm_demo.launch.py
# → docker --security-opt seccomp=<embedded-starter-path>

Customize (operator-supplied profile)

autonomy ros2 run --seccomp-profile /etc/autonomy/custom.seccomp.json \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    launch demo_robot arm_demo.launch.py

The path must be absolute. The shipped starter is at runtime/seccomp/runtime-starter.seccomp.json — copy it as a starting point.
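A custom profile can also be generated rather than hand-edited. The Python sketch below builds a Docker-format seccomp profile that returns EPERM for the 21 syscalls listed above and allows everything else. The allow-by-default layout, the function names, and the output path are assumptions for illustration; the shipped starter's actual structure is not shown in this runbook, so diff the result against runtime/seccomp/runtime-starter.seccomp.json before relying on it.

```python
import json

# The 21 syscalls the starter profile is documented to deny.
DENIED_SYSCALLS = [
    "mount", "umount", "umount2", "pivot_root", "setns", "unshare",
    "mknod", "mknodat", "kexec_load", "kexec_file_load", "init_module",
    "finit_module", "delete_module", "iopl", "ioperm", "swapon",
    "swapoff", "reboot", "bpf", "perf_event_open", "ptrace",
]


def build_profile(denied=DENIED_SYSCALLS):
    """Return a Docker seccomp profile dict: allow by default, EPERM for `denied`.

    The overall layout is an assumption, not a copy of the shipped starter.
    """
    return {
        "defaultAction": "SCMP_ACT_ALLOW",
        "syscalls": [
            {
                "names": sorted(set(denied)),
                "action": "SCMP_ACT_ERRNO",
                "errnoRet": 1,  # EPERM
            }
        ],
    }


def write_profile(path, profile):
    """Serialize the profile as JSON so it can be passed to --seccomp-profile."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(profile, fh, indent=2)
        fh.write("\n")
```

Write the file to an absolute path (for example the /etc/autonomy/custom.seccomp.json used above) and pass that path to --seccomp-profile.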

Opt out (audited)

autonomy ros2 run --no-seccomp \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    launch demo_robot arm_demo.launch.py

--no-seccomp emits a loud stderr warning containing the literal Audit event: seccomp-opt-out, plus a WAL marker (markers.WALPass("seccomp-opt-out", ...)). Operator log scrapers grep for that suffix; the audit pipeline records the WAL frame. Both are part of the flag’s contract: a future “softening” edit fails CI via the TestResolveSeccompProfile_OptOut drift guard.
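A log scraper for the opt-out can be very small. The sketch below is one possible shape in Python; only the literal string comes from this runbook, while the function name and the surrounding warning text used in the usage note are hypothetical.

```python
# Literal that the --no-seccomp warning is documented to contain.
OPT_OUT_LITERAL = "Audit event: seccomp-opt-out"


def find_opt_outs(stderr_text):
    """Return every captured stderr line that carries the opt-out literal."""
    return [
        line for line in stderr_text.splitlines()
        if OPT_OUT_LITERAL in line
    ]
```

Feeding it the captured stderr of a run returns an empty list when seccomp was left on, and the matching warning line(s) when --no-seccomp was used.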

Verify

The workload exits with Operation not permitted (EPERM, errno 1) when it attempts a denied syscall. Check stderr for the kernel-side denial signal. To prove the profile is actually applied, the hermetic test runtime/exec/seccomp_integration_test.go runs a workload that calls chmod under a profile that denies exactly that syscall + asserts the exit code.

Mutual-exclusion guard

--no-seccomp and --seccomp-profile are mutually exclusive — the runtime errors before any side effect rather than silently preferring one. --no-seccomp is also refused on the plain-subprocess path and on ros2.launch --force-native (no container to apply the profile to, no audit frame to record the opt-out — see Failure modes).

Phase 2 — dlopen allowlist

The LD_PRELOAD shim’s dlopen() wrapper denies any .so whose absolute path is not in AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST. This is the untrusted C++ plugin mitigation: ROS 2 component loading, any workload that hosts third-party .so plugins, etc.

The allowlist is sourced from the workload bundle’s v1.5 manifest dlopen_allowlist block:

{
  "schema_version": "1.5",
  "dlopen_allowlist": {
    "paths": [
      "/usr/lib/x86_64-linux-gnu/libstdc++.so.6",
      "/opt/ros/humble/lib/librclcpp.so",
      "..."
    ]
  }
}

Enable

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --dlopen-allowlist-from-bundle /var/lib/autonomy/bundles/my-workload.tar \
    launch demo_robot arm_demo.launch.py

The CLI loads the manifest, joins the paths with : (PATH-style), and sets AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST on the workload container env. The shim wrapper enforces.
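The manifest-to-env step can be sketched in a few lines of Python. This is an illustration of the described behavior, not the CLI's code: the function name is hypothetical, the manifest is passed in as text (extracting manifest.json from the bundle .tar is omitted), and rejecting entries that contain a colon is an assumption that follows from the colon being the separator.

```python
import json
import posixpath

ENV_NAME = "AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST"


def allowlist_env_from_manifest(manifest_text):
    """Return the colon-joined value for AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST."""
    manifest = json.loads(manifest_text)
    paths = manifest.get("dlopen_allowlist", {}).get("paths", [])
    for path in paths:
        if not posixpath.isabs(path):
            raise ValueError(f"dlopen_allowlist path is not absolute: {path!r}")
        if ":" in path:
            # Assumption: ':' is the wire separator, so it cannot appear in an entry.
            raise ValueError(f"dlopen_allowlist path contains ':': {path!r}")
    return ":".join(paths)
```

A manifest without a dlopen_allowlist block yields an empty string here, which lines up with the default-allow contract described below.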

Verify

Stderr lines from the workload when a non-allowlisted .so is attempted:

autonomy-preload: dlopen(/usr/lib/x86_64-linux-gnu/libnsl.so.1) denied by AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST

dlopen() returns NULL (libc convention) + errno=EPERM. The workload’s existing “library not available” error path sees it as a normal load failure; the stderr log distinguishes “denied by policy” from “library not present”.

Customize directly (without a bundle)

autonomy ros2 run does not expose a generic env-passthrough flag, so AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST can’t be set directly through it. The supported public surface for dlopen-allowlist injection is --dlopen-allowlist-from-bundle <bundle> (production path) or docker run directly (dev / ad-hoc path).

For dev iteration without packaging a bundle, run the canonical adk-ros2-runtime image with docker run:

docker run --rm -it \
    -e LD_PRELOAD=/usr/local/lib/libautonomy_preload.so \
    -e AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST=/lib/x86_64-linux-gnu/libc.so.6:/lib/aarch64-linux-gnu/libc.so.6 \
    ghcr.io/autonomyops/adk-ros2-runtime:latest \
    ros2 launch demo_robot arm_demo.launch.py

Once the allowlist is stable, promote it into the bundle’s manifest v1.5 dlopen_allowlist.paths block and run through autonomy ros2 run --dlopen-allowlist-from-bundle (Phase 2 above) for the audited, policy-tied production path.

Default-allow contract

The shim is opt-in defense-in-depth: empty or unset env = pass-through. A workload running with the shim baked in but without AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST set behaves as if the wrapper weren’t there. This is intentional — default-deny on unset would brick every workload that loaded the shim before declaring its dlopen surface.
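The contract reduces to a small decision function. The Python model below is a sketch of the documented behavior, not the shim's C implementation; it assumes the caller already has the absolute path of the library, since this runbook does not describe how bare names such as libfoo.so are resolved.

```python
def dlopen_allowed(filename, allowlist_env):
    """Model of the shim's dlopen decision.

    Unset (None) or empty env means pass-through; otherwise the filename
    must exactly match one colon-separated entry.
    """
    if not allowlist_env:
        return True  # default-allow: no allowlist declared
    entries = [entry for entry in allowlist_env.split(":") if entry]
    return filename in entries
```

With no env set every load passes; once a single path is declared, every other path is denied.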

Phase 3 — Egress destination gating

The shim’s connect() wrapper denies any IPv4 or IPv6 destination not in AUTONOMY_PRELOAD_CONNECT_ALLOWLIST. Loopback is always allowed regardless (/v1/tool lives on loopback and would be bricked otherwise): IPv4 127.0.0.0/8, IPv6 ::1, and IPv4-mapped IPv6 ::ffff:127.0.0.0/104 (for dual-stack workloads that connect to 127.0.0.1 through an IPv6 socket). AF_UNIX local IPC sockets continue to pass through unconditionally.

This is the user-space (libc-layer) complement to the application-layer egress DLP scanner from #710: DLP scans payload content; this gates the destination set.

Allowlist entry forms (five accepted shapes, joined with , NO_PROXY-style):

| form | example | meaning |
| --- | --- | --- |
| <ipv4>:<port> | 10.0.0.5:443 | exact IPv4 + port |
| <ipv4>/<prefix>:<port> | 10.42.0.0/16:8080 | IPv4 CIDR + port |
| [<ipv6>]:<port> | [fe80::1]:443 | exact IPv6 + port (RFC 3986 bracket form) |
| [<ipv6>/<prefix>]:<port> | [2001:db8::/32]:443 | IPv6 CIDR + port (prefix INSIDE brackets) |
| <any-of-above>:* | 10.0.0.5:* | any port for that address / CIDR |

Enable

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --ld-preload /usr/local/lib/libautonomy_preload.so \
    --preload-connect-allowlist '10.42.0.5:443,10.42.0.0/16:8080,[fe80::1]:443,[2001:db8::/32]:*' \
    launch demo_robot arm_demo.launch.py

The CLI validates each entry before any side effect (rejects empty, whitespace, malformed prefix, IPv6-missing-brackets, unparseable address, out-of-range port).
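To make the entry grammar and the loopback rule concrete, here is a Python sketch of a parser and matcher for the five forms. It is an illustration built on the standard ipaddress module, not the CLI's or the shim's code. The function names are hypothetical, and two choices are assumptions: ports are limited to 1-65535, and a CIDR with host bits set is rejected.

```python
import ipaddress


def parse_entry(entry):
    """Parse one allowlist entry into (network, port); port None means ':*'."""
    if not entry or any(ch.isspace() for ch in entry):
        raise ValueError(f"empty entry or whitespace in entry: {entry!r}")
    host, sep, port_text = entry.rpartition(":")
    if not sep or not host:
        raise ValueError(f"missing address or port: {entry!r}")
    if host.startswith("["):
        if not host.endswith("]"):
            raise ValueError(f"unterminated bracket: {entry!r}")
        network = ipaddress.IPv6Network(host[1:-1])
    else:
        # No brackets means IPv4; a bare IPv6 address fails to parse here.
        network = ipaddress.IPv4Network(host)
    if port_text == "*":
        return network, None
    if not (port_text.isascii() and port_text.isdigit()):
        raise ValueError(f"bad port: {entry!r}")
    port = int(port_text)
    if not 1 <= port <= 65535:  # assumption: port 0 is out of range
        raise ValueError(f"port out of range: {entry!r}")
    return network, port


def is_loopback(addr):
    """True for 127.0.0.0/8, ::1, and IPv4-mapped loopback (::ffff:127.x.x.x)."""
    mapped = getattr(addr, "ipv4_mapped", None)
    return addr.is_loopback or (mapped is not None and mapped.is_loopback)


def connect_allowed(dest_ip, dest_port, entries):
    """Decide a connect() against parsed entries; loopback always passes."""
    addr = ipaddress.ip_address(dest_ip)
    if is_loopback(addr):
        return True
    for network, port in entries:
        same_family = addr.version == network.version
        if same_family and addr in network and port in (None, dest_port):
            return True
    return False
```

Parsing the allowlist from the Enable example and calling connect_allowed("8.8.8.8", 53, entries) returns False, matching the denial shown under Verify below. The sketch does not model the unset-allowlist case.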

Verify

Stderr lines from the workload when a non-allowlisted destination is attempted:

autonomy-preload: connect(8.8.8.8:53) denied by AUTONOMY_PRELOAD_CONNECT_ALLOWLIST
autonomy-preload: connect([2606:4700:4700::1111]:443) denied by AUTONOMY_PRELOAD_CONNECT_ALLOWLIST

connect() returns -1 + errno=EPERM. The workload sees this as the standard connect failure; the stderr log distinguishes “denied by policy” from “network unreachable / connection refused”.

Scope notes

  • IPv6 is gated post-#960 (#983). The Phase 3 v1 contract had AF_INET6 pass through unconditionally; operators upgrading existing IPv4-only configs are unaffected but workloads using IPv6 (DDS over IPv6, MAVLink-UDP6, REST against IPv6 hosts) must add explicit entries.

  • CIDR + :* wildcards ship with the IPv6 extension (#983). Use CIDR for whole-subnet allowances; use :* when a destination’s port is ephemeral (DDS dynamic ports, etc.).

  • Hostnames not supported. DNS resolution at every connect() would be slow, introduce a DNS-poisoning bypass, and add in-process attack surface. Operators pre-resolve at config time or use CIDR for whole-subnet allowances.

  • Loopback always allowed. There is currently no flag to lock down loopback; that would need a future, stricter flag.

Phase 4 — execve / execveat allowlist

The shim’s execve() + execveat() wrappers deny any binary path not in AUTONOMY_PRELOAD_EXEC_ALLOWLIST. This is subprocess-spawn discrimination by path: seccomp can deny execve as a syscall but cannot statically check the path argument.

execveat() closes the sibling-syscall bypass: a workload that called fd = open(path) + execveat(fd, "", AT_EMPTY_PATH) would escape an execve-only wrapper. The execveat wrapper resolves the fd via /proc/self/fd/<dirfd> (kernel-canonical, symlinks already resolved) before allowlist matching.
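The /proc/self/fd lookup is easy to try from any Linux process. The Python sketch below shows the same mechanism the paragraph above describes; it is not the shim's code, the function name is hypothetical, and it only works where /proc is mounted.

```python
import os


def resolve_fd_path(fd):
    """Return the kernel-canonical path behind an open fd (Linux, /proc mounted).

    The kernel reports the file that was actually opened, so a symlink used
    at open() time is already resolved in the returned path.
    """
    return os.readlink(f"/proc/self/fd/{fd}")
```

Opening a symlink and passing the resulting fd returns the symlink's target, not the symlink path. That is why the AT_EMPTY_PATH branch can be matched against the allowlist without a separate canonicalization step.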

Enable

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --ld-preload /usr/local/lib/libautonomy_preload.so \
    --preload-exec-allowlist /usr/bin/ros2,/usr/bin/python3,/bin/echo \
    launch demo_robot arm_demo.launch.py

Entries are absolute paths, comma-separated. The CLI joins them into the colon-separated wire format (PATH-style) before injecting them into the container env.

Verify

Stderr lines from the workload when a non-allowlisted binary is attempted:

autonomy-preload: execve(/bin/sh) denied by AUTONOMY_PRELOAD_EXEC_ALLOWLIST
autonomy-preload: execveat(/bin/sh) denied by AUTONOMY_PRELOAD_EXEC_ALLOWLIST

execve / execveat returns -1 + errno=EPERM. The workload sees this as a standard exec failure; the stderr log names which sibling variant was attempted.

Resolution failures (execveat)

If execveat() is called with a malformed (dirfd, pathname, flags) tuple that can’t be resolved to an absolute path (empty pathname without AT_EMPTY_PATH, unresolvable fd, etc.), the wrapper denies the call BEFORE reaching the allowlist check + emits a separate log line so the operator can distinguish “denied by allowlist” from “denied because we couldn’t resolve to check”:

autonomy-preload: execveat(dirfd=-100, pathname="", flags=0x0) denied — could not resolve to absolute path for allowlist check

Phase 5a — Linux capability drop

--cap-drop <NAME> drops a capability from the workload container’s effective set. Repeat or comma-separate; common entries:

  • NET_RAW — packet sockets (AF_PACKET, SOCK_RAW). Drop unless the workload legitimately sniffs raw frames.

  • SYS_PTRACE — ptrace + process_vm_readv. Drop unless the workload is a debugger.

  • SYS_ADMIN — wide grab bag (mount, set hostname, bpf_load_program, …). Drop unless the workload genuinely needs filesystem mounts (rare).

  • ALL — drop every capability docker normally grants. The workload then inherits exactly what its uid + the kernel defaults allow.

Enable

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --cap-drop NET_RAW,SYS_PTRACE \
    launch demo_robot arm_demo.launch.py

The CLI validates each capability name against a known-good list before rendering the docker spawn — docker itself accepts unknown names silently, leaving the cap set wider than intended. A typo fails loudly at the autonomy layer.
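The validation step can be pictured as follows. This Python sketch is illustrative only: the CLI's actual known-good list is not reproduced in this runbook, so the set below is the capability list from the Linux capabilities(7) man page, and accepting lowercase names or a CAP_ prefix is an assumption.

```python
# Capability names from capabilities(7), without the CAP_ prefix.
KNOWN_CAPS = frozenset("""
    CHOWN DAC_OVERRIDE DAC_READ_SEARCH FOWNER FSETID KILL SETGID SETUID
    SETPCAP LINUX_IMMUTABLE NET_BIND_SERVICE NET_BROADCAST NET_ADMIN NET_RAW
    IPC_LOCK IPC_OWNER SYS_MODULE SYS_RAWIO SYS_CHROOT SYS_PTRACE SYS_PACCT
    SYS_ADMIN SYS_BOOT SYS_NICE SYS_RESOURCE SYS_TIME SYS_TTY_CONFIG MKNOD
    LEASE AUDIT_WRITE AUDIT_CONTROL SETFCAP MAC_OVERRIDE MAC_ADMIN SYSLOG
    WAKE_ALARM BLOCK_SUSPEND AUDIT_READ PERFMON BPF CHECKPOINT_RESTORE
""".split())


def parse_cap_drop(values):
    """Validate repeated and/or comma-separated --cap-drop values.

    Returns the de-duplicated capability names in first-seen order and
    raises ValueError on an unknown name.
    """
    caps = []
    for value in values:
        for raw in value.split(","):
            name = raw.strip().upper()
            if name.startswith("CAP_"):  # assumption: prefix is tolerated
                name = name[4:]
            if name != "ALL" and name not in KNOWN_CAPS:
                raise ValueError(f"unknown capability: {raw!r}")
            if name not in caps:
                caps.append(name)
    return caps
```

A typo such as NET_RAWW raises before any container is spawned, which is the fail-loud behavior described above.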

Verify

The workload’s attempt at the dropped capability returns EPERM. For NET_RAW:

int sock = socket(AF_PACKET, SOCK_RAW, htons(0x0003));
// sock < 0; errno = EPERM

Phase 5b — Read-only rootfs + tmpfs

--read-only-rootfs mounts the workload container’s rootfs read-only (docker --read-only). The workload’s writes to any non-mounted path return EROFS. The only writable path is what the operator declares via --tmpfs <path>:

  • --tmpfs /tmp — common: any workload that writes scratch to /tmp.

  • --tmpfs /var/log/myapp — workload-specific log dir.

  • Repeat or comma-separate for multiple mounts.

Enable

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --read-only-rootfs --tmpfs /tmp \
    launch demo_robot arm_demo.launch.py

Verify

Write outside /tmp:

docker exec -it <workload> sh -c 'echo x > /etc/foo'
# sh: 1: cannot create /etc/foo: Read-only file system

Write inside /tmp:

docker exec -it <workload> sh -c 'echo x > /tmp/foo && cat /tmp/foo'
# x

Why this matters with the shim

--read-only-rootfs defeats the symlink-redirection bypass for the execve/execveat allowlist: an attacker who could plant a symlink at an allowlisted absolute path to a denied binary would escape the wrapper (the wrapper trusts absolute paths verbatim — execveat’s /proc/self/fd branch canonicalizes, but the absolute-path branch does not). With --read-only-rootfs, the attacker can’t plant the symlink in the first place.
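The difference between a verbatim check and a canonicalizing check can be shown with two one-line predicates. The Python sketch below is a model of the two matching strategies, not the shim's implementation; both function names are hypothetical.

```python
import os


def verbatim_allowed(path, allowlist):
    """Match the path exactly as given, without resolving symlinks."""
    return path in allowlist


def canonical_allowed(path, allowlist):
    """Resolve symlinks first, then match the canonical target."""
    return os.path.realpath(path) in allowlist
```

If an attacker replaces an allowlisted path with a symlink to a denied binary, verbatim_allowed still returns True for that path while canonical_allowed returns False. A read-only rootfs removes the precondition, because the symlink cannot be planted at all.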

Phase 6 — Bypass-resistance composition test

Hermetic integration test that proves all 5 layers compose without conflict + collectively resist a workload actively trying to escape. Mirrors the bypass-resistance test design from #938 3-D.2 / #953 (positive control + denial + liveness gate).

Source: runtime/exec/composition_integration_test.go.

What the test does

Builds a hermetic test image at test-time (debian:bookworm-slim plus the shim plus a deliberately-malicious workload binary). Spawns the container with all 5 layers active simultaneously. Asserts:

  • 6 attack vectors all DENIED with the exact expected reason (per-vector EPERM from shim/seccomp/cap-drop, or EROFS from rootfs-RO).

  • 2 positive controls (/tmp write, loopback connect) both OK — proves the workload isn’t bricked by the hardening.

  • Both liveness markers (WORKLOAD_START, WORKLOAD_END) present — catches the false-pass on “container never ran”.

  • All 3 shim canonical log lines (autonomy-preload: execve(...), autonomy-preload: dlopen(...), autonomy-preload: connect(...)) present in stderr — proves the shim was the agent of denial, not a kernel/libc error of similar shape.
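The liveness and shim-evidence assertions are worth reusing when reviewing captured output by hand. The actual test is Go; the Python sketch below only mirrors the last two checks in the list above, using the marker names and log prefixes quoted there. The function name is hypothetical.

```python
REQUIRED_MARKERS = ("WORKLOAD_START", "WORKLOAD_END")
REQUIRED_SHIM_PREFIXES = (
    "autonomy-preload: execve(",
    "autonomy-preload: dlopen(",
    "autonomy-preload: connect(",
)


def missing_evidence(output):
    """Return the markers and shim log prefixes that are absent from `output`.

    An empty list means the container ran to completion and the shim, not a
    similarly-shaped kernel/libc error, produced the denials.
    """
    lines = output.splitlines()
    missing = [
        marker for marker in REQUIRED_MARKERS
        if not any(marker in line for line in lines)
    ]
    missing += [
        prefix for prefix in REQUIRED_SHIM_PREFIXES
        if not any(line.startswith(prefix) for line in lines)
    ]
    return missing
```

A run where the container never started returns both liveness markers as missing, so it cannot be mistaken for a pass.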

Running it

The test is gated on linux + docker daemon + !testing.Short():

go test ./runtime/exec/ -run TestComposition_BypassResistance -v -timeout 300s

~20 seconds end-to-end on a warm-cache host. Skips automatically on non-Linux or when docker is unreachable.

C++ workloads — deriving the dlopen allowlist from audit traffic

C++ workloads — and ROS 2 components / pluginlib consumers in particular — share a workflow problem the runbook’s Phase 2 section can’t solve in the abstract: an rclcpp_components container or a pluginlib-backed node can pull in dozens of .so files during composition, and the operator usually doesn’t know the full set ahead of time. Writing a manifest dlopen_allowlist by hand against a workload like that is a guess-then-debug loop unless you build the list from observed loads.

The post-#960 5/5 hardening slate added the two pieces the C++ workflow needs:

  • AUTONOMY_PRELOAD_DLOPEN_AUDIT (#985) — opt-in forensics trail that emits one line per dlopen() / dlmopen() call to stderr with the actual libc outcome (loaded, load_failed, denied). This is the input to the allowlist-derivation workflow below.

  • dlmopen() sibling-bypass closure (#982) — the same AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST env gates both dlopen() and dlmopen(). C++ ROS 2 plugin loaders that use dlmopen() for per-component link-map namespace isolation (some pluginlib configurations, custom component containers) are covered without a separate config surface.

Workflow: baseline → allowlist → enforce

The derivation is a three-step loop (Steps 1-3) followed by a one-time promotion into the bundle manifest (Step 4). Run it once per workload; re-run after any plugin-set change.

Step 1 — Capture the load baseline

Run the workload without an allowlist set + with audit on. The shim passes every dlopen() / dlmopen() through (default-allow on unset allowlist; see Phase 2) but logs every call with the libc outcome.

autonomy ros2 run does not expose a generic env-passthrough flag, so the AUTONOMY_PRELOAD_DLOPEN_AUDIT env var has no canonical CLI flag — use docker run directly against the canonical adk-ros2-runtime image for the derivation loop:

docker run --rm \
    -e LD_PRELOAD=/usr/local/lib/libautonomy_preload.so \
    -e AUTONOMY_PRELOAD_DLOPEN_AUDIT=1 \
    ghcr.io/autonomyops/adk-ros2-runtime:latest \
    ros2 launch your-package your-launch.py \
    2>&1 | tee /tmp/dlopen-audit.log

Let the workload run through the operational phases you want to sandbox (component composition, plugin registration, the actual work, shutdown). Stop it cleanly so late-bound dlopen()s in shutdown handlers are captured too.

Why docker run and not autonomy ros2 run? The audit env var is opt-in observability without a dedicated CLI flag; the production hardening flags (--ld-preload, --dlopen-allowlist-from-bundle, --preload-exec-allowlist, --preload-connect-allowlist) cover the supported policy surface. The audit-derivation loop is a dev-time workflow; once the allowlist is stable, Step 4 promotes the result back to the canonical autonomy ros2 run surface.

Step 2 — Extract result=loaded paths

The audit lines tell you what actually came into memory. The allowlist derivation uses only result=loaded entries, never load_failed: those didn’t end up in memory, so including them would be cargo-culting paths the workload doesn’t actually need:

grep '^autonomy-preload-audit: dlopen' /tmp/dlopen-audit.log \
  | grep 'result=loaded' \
  | sed -E 's/.*filename="([^"]+)".*/\1/' \
  | sort -u > /tmp/dlopen-allowlist.txt

# Also include dlmopen loads:
grep '^autonomy-preload-audit: dlmopen' /tmp/dlopen-audit.log \
  | grep 'result=loaded' \
  | sed -E 's/.*filename="([^"]+)".*/\1/' \
  | sort -u >> /tmp/dlopen-allowlist.txt

sort -u -o /tmp/dlopen-allowlist.txt /tmp/dlopen-allowlist.txt

Inspect the result — it should look like the operator’s mental model of the workload’s plugin surface. ROS 2 component container baselines typically run 30-80 entries; pure pluginlib nodes 10-30; plain C++ binaries with vendor SDKs 5-15.
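The same extraction can be done in Python when the derivation runs inside a larger script. The sketch below mirrors the grep/sed/sort pipeline above. The only format details it relies on are the ones that pipeline already assumes (the autonomy-preload-audit: dlopen / dlmopen line prefix, a filename="..." field, and the result=loaded token); the function name is hypothetical.

```python
import re

AUDIT_PREFIXES = (
    "autonomy-preload-audit: dlopen",
    "autonomy-preload-audit: dlmopen",
)
FILENAME_RE = re.compile(r'filename="([^"]+)"')


def loaded_paths(log_text):
    """Return the sorted, de-duplicated paths of every result=loaded audit line."""
    paths = set()
    for line in log_text.splitlines():
        if not line.startswith(AUDIT_PREFIXES):
            continue
        if "result=loaded" not in line:
            continue  # skip load_failed and denied entries
        match = FILENAME_RE.search(line)
        if match:
            paths.add(match.group(1))
    return sorted(paths)
```

Joining the returned list with ":" gives the value for AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST used in the next step.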

Step 3 — Enforce + verify (dev iteration)

For the dev derivation loop, inject the allowlist via docker run with the audit trail still on so you can confirm every previously-loaded lib is still result=loaded:

docker run --rm \
    -e LD_PRELOAD=/usr/local/lib/libautonomy_preload.so \
    -e AUTONOMY_PRELOAD_DLOPEN_AUDIT=1 \
    -e AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST="$(paste -sd: /tmp/dlopen-allowlist.txt)" \
    ghcr.io/autonomyops/adk-ros2-runtime:latest \
    ros2 launch your-package your-launch.py \
    2>&1 | tee /tmp/dlopen-enforce.log

Verify:

# Every previously-loaded lib is still result=loaded:
grep 'result=loaded' /tmp/dlopen-enforce.log | wc -l

# Zero shim denials — if the workload's plugin set is stable,
# the enforcement run should not surface any:
grep 'denied by AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST' /tmp/dlopen-enforce.log

If denied by lines appear, the workload added a load between baseline + enforcement runs — either re-run the baseline capture or add the missing path. The autonomy-preload-audit: line for each denial says result=denied so you have the path verbatim.
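Counting result=loaded lines does not show which library regressed. A per-path comparison of the two logs does. The Python sketch below is one way to do that comparison. It assumes each audit line carries a filename="..." field and a result=<outcome> token, as the extraction pipeline in Step 2 implies; the function names are hypothetical.

```python
import re

AUDIT_PREFIXES = (
    "autonomy-preload-audit: dlopen",
    "autonomy-preload-audit: dlmopen",
)
FILENAME_RE = re.compile(r'filename="([^"]+)"')
RESULT_RE = re.compile(r"\bresult=(\w+)")


def results_by_path(log_text):
    """Map each audited path to the set of outcomes seen for it."""
    outcomes = {}
    for line in log_text.splitlines():
        if not line.startswith(AUDIT_PREFIXES):
            continue
        filename = FILENAME_RE.search(line)
        result = RESULT_RE.search(line)
        if filename and result:
            outcomes.setdefault(filename.group(1), set()).add(result.group(1))
    return outcomes


def regressions(baseline_log, enforce_log):
    """Paths loaded in the baseline run but never loaded under enforcement."""
    baseline = results_by_path(baseline_log)
    enforced = results_by_path(enforce_log)
    return sorted(
        path for path, seen in baseline.items()
        if "loaded" in seen and "loaded" not in enforced.get(path, set())
    )
```

Calling regressions() with the contents of /tmp/dlopen-audit.log and /tmp/dlopen-enforce.log should return an empty list when the allowlist is complete.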

Step 4 — Promote to bundle manifest (production)

Once the derived list is stable, fold it into the workload bundle’s manifest.json — schema v1.5 introduces the dlopen_allowlist block. The bundle directory layout the loader expects is <bundle-dir>/manifest.json + <bundle-dir>/policies/…; autonomy bundle pack then tars the directory into a single .tar for --dlopen-allowlist-from-bundle / --policy consumption (see bundle/manifest.go for the schema struct and cmd/autonomy/commands/dlopen_allowlist.go for the loader):

{
  "kind": "bundle",
  "schema_version": "1.5",
  "name": "your-workload",
  "version": "0.1.0",
  "channel": "dev",
  "min_adk_version": "1.0",
  "dlopen_allowlist": {
    "paths": [
      "/lib/x86_64-linux-gnu/libc.so.6",
      "/lib/x86_64-linux-gnu/libdl.so.2",
      "/opt/ros/jazzy/lib/librcl.so"
    ]
  }
}

Each path must be absolute and end in .so or .so.<digits> (.so.1, .so.42.3); globs are not supported in v1.5 (every entry is exact-match).
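The path rule is simple enough to lint before packing a bundle. The Python sketch below checks only the shape stated above (absolute, ending in .so or .so.<digits>); it is not the loader in cmd/autonomy/commands/dlopen_allowlist.go, and the function name is hypothetical.

```python
import re

# Absolute path ending in ".so" or ".so" plus one or more ".<digits>" groups.
ALLOWLIST_PATH_RE = re.compile(r"/.*\.so(\.[0-9]+)*")


def invalid_allowlist_paths(paths):
    """Return the entries that do not satisfy the documented path shape."""
    return [p for p in paths if not ALLOWLIST_PATH_RE.fullmatch(p)]
```

Note that an entry such as /usr/lib/*.so passes this shape check, but because every entry is exact-match it would only ever match a file literally named *.so.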

Production runs then use the canonical autonomy ros2 run surface — --dlopen-allowlist-from-bundle (explicit) or --policy (auto-source from the same v1.5 manifest, Phase 2b-3 #980):

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --ld-preload /usr/local/lib/libautonomy_preload.so \
    --policy ./your-bundle.tar \
    launch your-package your-launch.py

Reading the audit trail

| result= | meaning | when investigating |
| --- | --- | --- |
| loaded | allowlist passed AND libc returned a non-NULL handle; the .so is in memory | Expected; include in derived allowlist |
| load_failed | allowlist passed BUT libc returned NULL (bad ELF, missing transitive dep, namespace conflict, ABI drift) | Investigate: the workload tried to load but the file couldn’t be brought in; correlate by filename + timestamp with the workload’s own dlerror() stderr |
| denied | allowlist rejected; libc never called | Expected during enforcement runs only; appears alongside the canonical autonomy-preload: dlopen(...) denied by ... line |

load_failed is the most operator-useful signal. The shim does not consume dlerror() (that would clear the error buffer and break the caller’s own diagnostic), so the workload’s own log line continues to carry the human-readable reason — correlate the audit line and the workload line by timestamp + filename to attribute the failure.

Common patterns to watch for

| Workload shape | Typical audit entries | Notes |
| --- | --- | --- |
| rclcpp_components container | 30-80 .so from /opt/ros/<distro>/lib/, /usr/lib/<arch>/ | Vendor packages (DDS implementations, message types) dominate |
| pluginlib-backed node | 10-30 .so; mix of lmid=0 and lmid=-1 | Custom pluginlib loaders may use dlmopen for ABI isolation |
| Plain C++ binary with vendor SDK | 5-15 .so; vendor lib + its transitive deps | Vendor SDKs often dlopen their own optional modules at runtime |
| MAVLink / GStreamer / OpenCV | 10-40 .so; many late-bound | Codec backends + protocol modules selected at runtime; capture across multiple operational phases |

Other post-#960 5/5 considerations for C++ workloads

  • execve / execveat realpath canonicalization (#984). C++ workloads that spawn helper processes (system(), popen(), boost::process::system) now have their target paths realpath-canonicalized before the allowlist check. Operators who allowlist a path that resolves through a symlink (common for /usr/bin/python3 → /usr/bin/python3.X) must allowlist the canonical target. The audit-equivalent flow for exec is grep '^autonomy-preload: execve(' on a baseline run.

  • IPv6 + CIDR in the connect allowlist (#983). C++ workloads that use IPv6 (DDS over IPv6, MAVLink-over-UDP6, REST clients against IPv6 hosts) need explicit entries in AUTONOMY_PRELOAD_CONNECT_ALLOWLIST post-#960 — Phase 3’s v1 IPv6 pass-through is gone. Use the bracketed form [<ipv6>]:<port> or [<ipv6>/<prefix>]:<port> for CIDR.

Failure modes and recovery

ErrHardeningRequiresContainer

ros2: --seccomp-profile / --cap-drop / --read-only-rootfs / --tmpfs /
--ld-preload / --preload-exec-allowlist / --preload-connect-allowlist /
bundle-sourced dlopen_allowlist require the container execution path
(they apply via docker flags / env injection and have no analog on the
native subprocess path). The resolved mode is native, either because
--force-native was set or because Docker is unavailable.

Cause: A hardening flag was set but the dispatch resolved to native (no docker available + no --image, or --force-native). The flag has no effect on native — silent acceptance would be a “looks hardened, isn’t” trap.

Recovery: drop the flag for native dispatch, OR install docker AND pass --image <tag> for container dispatch.

--no-seccomp refused on native

--no-seccomp cannot be used with native dispatch — the native
subprocess path has no seccomp profile to opt out of, so no
warning fires and no WAL audit marker would be emitted to
record the opt-out (the flag's documented contract). Silently
accepting --no-seccomp on native would create a "looks audited,
isn't" trap.

Cause: Same shape as the previous, but for the --no-seccomp opt-out. The flag’s documented contract is “warn + WAL audit” — neither can fire on native.

Recovery: drop --no-seccomp (the native path is already running without seccomp), OR ensure docker + --image so the container path can apply the default-on profile AND honor the opt-out with the audit frame.

Workload exits with EPERM but you don’t know why

Capture stderr and grep for the shim’s canonical prefix:

autonomy ros2 run ... 2>&1 | grep '^autonomy-preload:'

Every shim-mediated denial emits exactly one canonical line per denied call:

autonomy-preload: <op>(<arg>) denied by <env-var>
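Because the line shape is fixed, denials can be tallied per operation and env var to see which layer is doing the blocking. The Python sketch below parses only the denied by form shown above (the separate execveat resolution-failure line is deliberately not matched); the function name is hypothetical.

```python
import re
from collections import Counter

# autonomy-preload: <op>(<arg>) denied by <env-var>
DENIAL_RE = re.compile(r"^autonomy-preload: (\w+)\((.*)\) denied by (\w+)$")


def tally_denials(stderr_text):
    """Count shim denials keyed by (operation, env var)."""
    counts = Counter()
    for line in stderr_text.splitlines():
        match = DENIAL_RE.match(line.strip())
        if match:
            counts[(match.group(1), match.group(3))] += 1
    return counts
```

An empty tally on a run that still fails with EPERM points at the kernel-layer checks described next.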

If no autonomy-preload: lines appear but the workload still EPERMs, the denial came from seccomp or cap-drop (kernel-layer) — check the kernel audit log:

sudo dmesg | tail -50
# or
sudo journalctl --since "5 minutes ago" | grep -i audit

Library not loading + dlopen_allowlist is set

If the workload loads a .so that’s not in the bundle’s dlopen_allowlist.paths, the shim denies. The workload sees dlopen() return NULL. Common symptom in ROS 2: a launched node fails with Failed to import: <plugin>.

Recovery: Add the missing path to the bundle’s dlopen_allowlist block, repackage, push the new bundle. For dev-loop iteration without re-packaging, bypass autonomy ros2 run and inject directly via docker run:

docker run --rm \
    -e LD_PRELOAD=/usr/local/lib/libautonomy_preload.so \
    -e AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST="$(paste -sd: bundle-paths.txt)" \
    ghcr.io/autonomyops/adk-ros2-runtime:latest \
    ros2 launch demo_robot arm_demo.launch.py

To audit what dlopens the workload performs on a known-good run, enable the audit trail (AUTONOMY_PRELOAD_DLOPEN_AUDIT; post-#960 #985) and parse result=loaded lines from stderr; see the C++ workloads section above for the full workflow.

Cross-references