Tutorial — Container Hardening Hands-On

Objective: Walk through enabling each of the 5 hardening layers (--seccomp-profile, --cap-drop, --read-only-rootfs, LD_PRELOAD shim, --dlopen-allowlist-from-bundle) against a sample workload, observe what changes at each step, and run the hermetic bypass-resistance test that proves the layers compose.

Companion to: the Container Hardening runbook — this tutorial demonstrates the runbook’s procedures with concrete commands + expected output.

Time: 20–30 minutes hands-on plus the test run (~20 seconds).

Prerequisites:

  • A workstation with Docker on PATH (24+).

  • The autonomy CLI installed (bash scripts/install-ce.sh).

  • The adk-ros2-runtime image present:

    docker pull ghcr.io/autonomyops/adk-ros2-runtime:latest
    
  • A Go toolchain matching go.work (currently 1.25.11) — only needed for Step 6 (the bypass-resistance test).

Step 0 — Baseline (no hardening)

Run the demo workload with no hardening flags to confirm the baseline works:

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --no-seccomp \
    launch demo_robot arm_demo.launch.py

(We pass --no-seccomp for this step ONLY to demonstrate the unhardened baseline. You’ll see a loud stderr warning telling you the workload spawned without the default-on starter profile — that warning is itself the safety net we’ll re-enable in Step 1.)

Expected: the workload spawns, the arm-controller publishes on /cmd_vel, and the launched node keeps running until you stop it with Ctrl-C.

Confirm the opt-out warning on stderr:

WARNING: --no-seccomp is set — the workload will spawn without
the AutonomyOps starter seccomp profile. Container-escape primitives
(mount, setns, unshare, ptrace, etc.) are NOT denied at the kernel
layer. Use only for development or workloads that legitimately need
these syscalls. Audit event: seccomp-opt-out.

Operator log scrapers grep for Audit event: seccomp-opt-out to flag opt-outs. The seccomp-opt-out WAL marker is also recorded (viewable via autonomy wal inspect --kind PASS after the run).
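A scraper for this marker can be as small as a grep. The sketch below assumes the CLI's stderr was captured to a file; the function name count_seccomp_optouts and the log path in the usage note are hypothetical, not part of the autonomy CLI.

```shell
# Count seccomp opt-out audit markers in a captured stderr log.
# Assumption: the CLI's stderr was saved to the file passed as $1.
count_seccomp_optouts() {
    # -c prints the number of matching lines, -F treats the marker as a
    # fixed string. "|| true" keeps the exit status 0 when the count is 0.
    grep -cF 'Audit event: seccomp-opt-out' "$1" || true
}
```

For example, count_seccomp_optouts /var/log/autonomy/run.log prints the number of opt-out warnings in that (hypothetical) log, so a non-zero result is what an alert would key on.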

Step 1 — Turn seccomp back on (default)

Drop the --no-seccomp flag — the CLI now applies the embedded starter profile by default:

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    launch demo_robot arm_demo.launch.py

No new flag, no stderr warning. The workload spawns with the shipped 21-syscall deny-list applied via docker --security-opt seccomp=<embedded-starter-path>.

To prove the profile is on, look at the resolved docker spawn:

docker inspect $(docker ps --latest -q) \
  --format '{{json .HostConfig.SecurityOpt}}'
# ["seccomp=/tmp/autonomy-seccomp-starter-XXXXXXXX.json"]

The tempfile is materialized fresh per CLI invocation (the shipped profile is embedded in the binary via //go:embed — no sidecar JSON), and is removed when the CLI exits.
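If you want to script that check rather than eyeball it, a small helper can classify the SecurityOpt value. This is a sketch: the function name custom_seccomp_applied is hypothetical, and it only looks at the JSON string that docker inspect prints, treating Docker's own seccomp=unconfined value as "off".

```shell
# Succeeds when a SecurityOpt JSON array (as printed by docker inspect)
# carries a seccomp entry other than "unconfined".
custom_seccomp_applied() {
    case "$1" in
        *'"seccomp=unconfined"'*) return 1 ;;  # seccomp explicitly disabled
        *'"seccomp='*)            return 0 ;;  # a custom profile is set
        *)                        return 1 ;;  # no entry: Docker's built-in default, not a custom profile
    esac
}
```

Usage: custom_seccomp_applied "$(docker inspect "$(docker ps --latest -q)" --format '{{json .HostConfig.SecurityOpt}}')" && echo "custom profile on".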

To prove it actually denies, in another terminal try to mount inside the workload’s container (after picking up the container id with docker ps):

docker exec -it <container> sh -c 'mount -t tmpfs none /mnt'
# mount: /mnt: permission denied

mount is in the starter’s deny-list. EPERM.

Step 2 — Drop unneeded Linux capabilities

The demo workload doesn’t need raw sockets or ptrace. Drop them:

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --cap-drop NET_RAW,SYS_PTRACE \
    launch demo_robot arm_demo.launch.py

The CLI validates each capability name against a known-good list before rendering the docker spawn — typos fail loudly at the autonomy layer.
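To illustrate the kind of check this implies, here is a minimal sketch of validating a comma-separated capability list before it reaches docker. It is not the CLI's implementation: KNOWN_CAPS is an illustrative subset of the names in capabilities(7), and validate_caps is a hypothetical helper.

```shell
# Illustrative subset of Linux capability names (see capabilities(7)).
# The CLI's actual known-good list is not shown in this tutorial.
KNOWN_CAPS="CHOWN DAC_OVERRIDE KILL NET_ADMIN NET_BIND_SERVICE NET_RAW SETGID SETUID SYS_ADMIN SYS_PTRACE"

# Print the first unknown name in a comma-separated list and return 1,
# or return 0 when every name is known.
validate_caps() {
    old_ifs=$IFS
    IFS=,
    set -- $1              # split the list on commas
    IFS=$old_ifs
    for cap in "$@"; do
        case " $KNOWN_CAPS " in
            *" $cap "*) ;;                                   # known name
            *) echo "unknown capability: $cap"; return 1 ;;  # typo or unsupported
        esac
    done
    return 0
}
```

Failing before the spawn is the point: a misspelled name such as SYS_PTRACEE is reported by name instead of surfacing later as a docker error.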

To prove the cap is dropped, in another terminal:

docker exec -it <container> sh -c 'grep CapBnd /proc/self/status'
# CapBnd: 00000000a80455ff
# (down from 00000000a80c75ff: bit 13, NET_RAW, and bit 19, SYS_PTRACE, cleared)

A workload that calls socket(AF_PACKET, SOCK_RAW, ...) now sees EPERM.
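Reading a CapBnd mask by eye is error-prone, so it can help to test the bits directly. The sketch below uses the bit positions from <linux/capability.h> (CAP_NET_RAW is 13, CAP_SYS_PTRACE is 19); the helper names cap_is_set and clear_demo_caps are hypothetical.

```shell
# Bit positions from <linux/capability.h>.
CAP_NET_RAW=13
CAP_SYS_PTRACE=19

# Succeeds when bit $2 is set in the hex CapBnd mask $1.
cap_is_set() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Print mask $1 with the NET_RAW and SYS_PTRACE bits cleared.
clear_demo_caps() {
    printf '%016x\n' $(( 0x$1 & ~((1 << CAP_NET_RAW) | (1 << CAP_SYS_PTRACE)) ))
}
```

For instance, clear_demo_caps 00000000a80c75ff prints 00000000a80455ff, and cap_is_set 00000000a80455ff 13 fails, which is the state you want to see after the drop. The absolute mask depends on the capability set the container started with; only the two cleared bits matter here.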

Step 3 — Lock the rootfs read-only

The demo workload writes only to /tmp. Mount the rootfs RO and declare /tmp as the writable tmpfs:

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --cap-drop NET_RAW,SYS_PTRACE \
    --read-only-rootfs --tmpfs /tmp \
    launch demo_robot arm_demo.launch.py

Verify in another terminal:

# Write outside the tmpfs:
docker exec -it <container> sh -c 'echo x > /etc/foo'
# sh: 1: cannot create /etc/foo: Read-only file system

# Write inside the tmpfs:
docker exec -it <container> sh -c 'echo x > /tmp/foo && cat /tmp/foo'
# x

/tmp is the explicit writable exception. Everything else returns EROFS.

Choosing --tmpfs paths for real workloads. A workload that writes elsewhere (cache dirs, runtime sockets, etc.) needs additional --tmpfs <path> declarations. Audit what your workload writes via strace -f -e trace=openat ... 2>&1 | grep -E "O_WRONLY|O_RDWR|O_CREAT" in a known-good baseline run (O_RDWR opens can write too), then declare each path explicitly.
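The raw strace output is noisy, so a small filter that reduces it to a list of written paths is useful. This is a sketch under the assumption that the trace uses strace's usual openat(dirfd, "path", flags, ...) line layout; write_paths_from_strace is a hypothetical helper name.

```shell
# Read strace output on stdin and print the unique paths that were opened
# with a write-capable or creating flag (O_WRONLY, O_RDWR, O_CREAT).
write_paths_from_strace() {
    grep -E 'openat\(.*"[^"]+", [^)]*(O_WRONLY|O_RDWR|O_CREAT)' \
      | sed -E 's/.*openat\([^,]+, "([^"]+)".*/\1/' \
      | sort -u
}
```

Pipe a saved trace through it (write_paths_from_strace < trace.log) and group the resulting paths by directory to decide which --tmpfs mounts to declare.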

Step 4 — Bake in the LD_PRELOAD shim’s exec allowlist

The shim ships baked into adk-ros2-runtime:latest at /usr/local/lib/libautonomy_preload.so (custom images would need to bake it themselves — see runtime/preload/README.md). Turn it on + declare the allowlist:

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --cap-drop NET_RAW,SYS_PTRACE \
    --read-only-rootfs --tmpfs /tmp \
    --ld-preload /usr/local/lib/libautonomy_preload.so \
    --preload-exec-allowlist /usr/bin/ros2,/usr/bin/python3 \
    launch demo_robot arm_demo.launch.py

In another terminal, try to spawn a denied binary inside the workload:

docker exec -it <container> sh
# (sh itself is the denial test — sh isn't in the allowlist)

Expected: the docker exec fails — the shim’s execve() wrapper sees /bin/sh, doesn’t find it in the allowlist, denies with EPERM. On the workload’s stderr you’ll see:

autonomy-preload: execve(/bin/sh) denied by AUTONOMY_PRELOAD_EXEC_ALLOWLIST

The execve wrapper also covers execveat (the sibling syscall bypass — see the runbook’s Phase 4 for the bypass shape it closes).

Step 5 — Add the dlopen allowlist + connect allowlist

dlopen and connect are the same shape — env-driven allowlists the shim enforces. For dlopen, source from the workload bundle’s manifest:

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --cap-drop NET_RAW,SYS_PTRACE \
    --read-only-rootfs --tmpfs /tmp \
    --ld-preload /usr/local/lib/libautonomy_preload.so \
    --preload-exec-allowlist /usr/bin/ros2,/usr/bin/python3 \
    --dlopen-allowlist-from-bundle ./demo/bundles/ros2.tar \
    --preload-connect-allowlist 10.42.0.5:443 \
    launch demo_robot arm_demo.launch.py

Stderr lines you can grep for:

autonomy-preload: dlopen(/usr/lib/libnsl.so.1) denied by AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST
autonomy-preload: connect(8.8.8.8:53) denied by AUTONOMY_PRELOAD_CONNECT_ALLOWLIST

Loopback (127.0.0.0/8) is always allowed for connect — the /v1/tool policy callback lives there and would be bricked otherwise. The runbook documents this as a contract.
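As a mental model of that rule, the decision can be sketched in a few lines of shell. This is not the shim's code: it assumes IPv4 literals, a comma-separated allowlist, and exact ip:port matching, none of which this tutorial specifies, and connect_allowed is a hypothetical name.

```shell
# Rough model of the connect decision described above (NOT the shim's code).
# $1 is the destination "ip:port"; $2 is a comma-separated allowlist
# of "ip:port" entries (separator and exact-match semantics are assumptions).
connect_allowed() {
    case "$1" in
        127.*) return 0 ;;        # loopback 127.0.0.0/8 always passes
    esac
    case ",$2," in
        *",$1,"*) return 0 ;;     # destination is listed verbatim
        *)        return 1 ;;     # everything else is denied
    esac
}
```

Under this model connect_allowed 127.0.0.1:8080 "" succeeds even with an empty allowlist, while connect_allowed 8.8.8.8:53 10.42.0.5:443 fails.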

Step 6 — Run the bypass-resistance composition test

The hermetic integration test runtime/exec/composition_integration_test.go proves the 5 layers compose without conflict against a deliberately-malicious workload that attempts every shipped bypass.

From the repo root:

go test ./runtime/exec/ -run TestComposition_BypassResistance \
    -v -timeout 300s

Expected output (~20 seconds end-to-end on a warm-cache host):

=== RUN   TestComposition_BypassResistance_AllLayersEnforce
--- PASS: TestComposition_BypassResistance_AllLayersEnforce (20.09s)
PASS
ok      github.com/autonomyops/adk/runtime/exec    20.091s

Because -v is already on the command line above, the test log also shows the workload’s captured stderr:

WORKLOAD_START
autonomy-preload: execve(/bin/sh) denied by AUTONOMY_PRELOAD_EXEC_ALLOWLIST
VECTOR execve_unlisted: DENIED EPERM
autonomy-preload: dlopen(/etc/autonomy-shim-test/not-real.so) denied by AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST
VECTOR dlopen_unlisted: DENIED EPERM
autonomy-preload: connect(8.8.8.8:53) denied by AUTONOMY_PRELOAD_CONNECT_ALLOWLIST
VECTOR connect_unlisted: DENIED EPERM
VECTOR mount_seccomp: DENIED EPERM
VECTOR raw_socket_cap: DENIED EPERM
VECTOR root_write_rofs: DENIED EROFS
POSITIVE tmpfs_write: OK
POSITIVE loopback_passthrough: OK
WORKLOAD_END

Six attack vectors denied with the exact expected reason (EPERM from shim/seccomp/cap-drop, EROFS from rootfs-RO); two positive controls succeeded (proves the workload isn’t bricked); both liveness markers (WORKLOAD_START, WORKLOAD_END) present (proves the container actually ran).
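If you save that stderr to a file, the same three properties can be checked by hand. The sketch below only assumes the line shapes shown in the sample output above; check_composition_log is a hypothetical helper, separate from the assertions the Go test itself makes.

```shell
# Succeeds when a captured workload stderr log has both liveness markers,
# six DENIED vectors, and two passing positive controls.
check_composition_log() {
    log=$1
    grep -qx 'WORKLOAD_START' "$log" || { echo "missing WORKLOAD_START"; return 1; }
    grep -qx 'WORKLOAD_END'   "$log" || { echo "missing WORKLOAD_END"; return 1; }
    denied=$(grep -cE '^VECTOR [a-z_]+: DENIED (EPERM|EROFS)$' "$log")
    ok=$(grep -cE '^POSITIVE [a-z_]+: OK$' "$log")
    [ "$denied" -eq 6 ] || { echo "expected 6 denied vectors, got $denied"; return 1; }
    [ "$ok" -eq 2 ]     || { echo "expected 2 positive controls, got $ok"; return 1; }
    return 0
}
```

A truncated log (for example, one missing WORKLOAD_END) fails fast with a message naming what is absent, which distinguishes "the container never finished" from "a layer did not enforce".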

What you proved

Layer              | Vector                      | Denial mechanism      | Stderr signal
-------------------|-----------------------------|-----------------------|----------------------------------------------
Seccomp            | mount syscall               | kernel SCMP_ACT_ERRNO | EPERM (no shim log)
--cap-drop NET_RAW | socket(AF_PACKET, SOCK_RAW) | kernel CAP check      | EPERM (no shim log)
--read-only-rootfs | write to /root/x            | kernel rootfs RO      | EROFS (no shim log)
Shim (execve)      | execve("/bin/sh")           | shim wrapper          | autonomy-preload: execve(...) denied + EPERM
Shim (dlopen)      | dlopen(libnsl.so.1)         | shim wrapper          | autonomy-preload: dlopen(...) denied + EPERM
Shim (connect)     | connect(8.8.8.8:53)         | shim wrapper          | autonomy-preload: connect(...) denied + EPERM

Each layer closes a class of bypass that the others can’t statically discriminate. That’s the defense-in-depth claim, and the composition test is the executable evidence for it.

Step 7 — Derive a C++ dlopen allowlist from audit traffic

C++ workloads, especially ROS 2 rclcpp_components containers and pluginlib-backed nodes, can pull in dozens of .so files during composition. Writing the manifest’s dlopen_allowlist.paths by hand is a guess-then-debug loop. The post-#960 hardening 4/5 (#985) added an opt-in audit trail that surfaces every dlopen() / dlmopen() call with its actual libc outcome — use it to derive the allowlist from observed loads.

Capture a baseline

autonomy ros2 run doesn’t expose a generic env-passthrough flag, so the AUTONOMY_PRELOAD_DLOPEN_AUDIT env var (opt-in observability, no dedicated CLI flag) is set via docker run directly against the canonical adk-ros2-runtime image. The allowlist stays unset → the shim’s default-allow pass-through fires, but the audit trail logs every call:

docker run --rm \
    -e LD_PRELOAD=/usr/local/lib/libautonomy_preload.so \
    -e AUTONOMY_PRELOAD_DLOPEN_AUDIT=1 \
    ghcr.io/autonomyops/adk-ros2-runtime:latest \
    ros2 launch demo_robot arm_demo.launch.py \
    2>&1 | tee /tmp/dlopen-audit.log

After the workload runs through its operational phases (you can exit cleanly with Ctrl-C once the arm has done a cycle), the log contains one line per filesystem load:

autonomy-preload-audit: dlopen(filename="/lib/x86_64-linux-gnu/libc.so.6", flags=0x1, result=loaded)
autonomy-preload-audit: dlopen(filename="/opt/ros/jazzy/lib/librcl.so", flags=0x101, result=loaded)
autonomy-preload-audit: dlopen(filename="/opt/ros/jazzy/lib/librclcpp.so", flags=0x101, result=loaded)
autonomy-preload-audit: dlopen(filename="/opt/ros/jazzy/lib/librcutils.so", flags=0x1, result=loaded)
autonomy-preload-audit: dlmopen(lmid=-1, filename="/opt/ros/jazzy/lib/libdemo_controller.so", flags=0x1, result=loaded)
...

Build the allowlist

result=loaded is the only state worth including — load_failed entries didn’t end up in memory, and denied only appears in enforcement runs (not baseline). The runbook’s C++ workload section has the same one-liner with more context; the short form:

{
  grep '^autonomy-preload-audit: dlopen' /tmp/dlopen-audit.log
  grep '^autonomy-preload-audit: dlmopen' /tmp/dlopen-audit.log
} | grep 'result=loaded' \
  | sed -E 's/.*filename="([^"]+)".*/\1/' \
  | sort -u > /tmp/dlopen-allowlist.txt

wc -l /tmp/dlopen-allowlist.txt
# 47 /tmp/dlopen-allowlist.txt   ← typical rclcpp_components baseline

Inspect a sample:

head -10 /tmp/dlopen-allowlist.txt
# /lib/x86_64-linux-gnu/libc.so.6
# /lib/x86_64-linux-gnu/libdl.so.2
# /lib/x86_64-linux-gnu/libpthread.so.0
# /opt/ros/jazzy/lib/libdemo_controller.so
# /opt/ros/jazzy/lib/librcl.so
# /opt/ros/jazzy/lib/librclcpp.so
# ...

Enforce + verify (dev iteration)

For the dev derivation loop, inject the allowlist via docker run with the audit trail still on so you can confirm every previously loaded lib is still result=loaded:

docker run --rm \
    -e LD_PRELOAD=/usr/local/lib/libautonomy_preload.so \
    -e AUTONOMY_PRELOAD_DLOPEN_AUDIT=1 \
    -e AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST="$(paste -sd: /tmp/dlopen-allowlist.txt)" \
    ghcr.io/autonomyops/adk-ros2-runtime:latest \
    ros2 launch demo_robot arm_demo.launch.py \
    2>&1 | tee /tmp/dlopen-enforce.log

Verify:

# Every path that loaded in the baseline still says result=loaded:
grep 'result=loaded' /tmp/dlopen-enforce.log \
  | sed -E 's/.*filename="([^"]+)".*/\1/' | sort -u | wc -l
# 47   ← same unique-path count as the baseline (zero regressions)

# Zero shim denials — the allowlist covers the workload's
# stable plugin set:
grep 'denied by AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST' /tmp/dlopen-enforce.log
# (no output)

If denials appear, the workload added a load between the baseline + enforcement runs (late-bound plugin, new code path). The audit line for each denial says result=denied so you have the path verbatim — append it to the allowlist and re-run.
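Collecting those denied paths can be scripted. The sketch below assumes that denied audit lines use the same filename="..." layout as the result=loaded lines shown earlier (the tutorial does not print a denied audit line, so treat that as an assumption); denied_dlopen_paths is a hypothetical helper.

```shell
# Print the unique paths whose audit line ended in result=denied.
# Assumption: denied lines share the filename="..." layout of loaded lines.
denied_dlopen_paths() {
    grep -E '^autonomy-preload-audit: dl(m)?open\(.*result=denied\)' "$1" \
      | sed -E 's/.*filename="([^"]+)".*/\1/' \
      | sort -u
}
```

Review the output before appending it (denied_dlopen_paths /tmp/dlopen-enforce.log >> /tmp/dlopen-allowlist.txt, then sort -u -o /tmp/dlopen-allowlist.txt /tmp/dlopen-allowlist.txt): a denial can also be a load you do not want to allow.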

Promote to the bundle manifest (production)

Once the derived list is stable across a few runs, move it into the bundle’s manifest.json under the dlopen_allowlist.paths key (schema v1.5 introduces this block — see bundle/manifest.go for the schema struct). The bundle layout the loader expects is <bundle-dir>/manifest.json + <bundle-dir>/policies/…; autonomy bundle pack then tars the directory into the .tar that --dlopen-allowlist-from-bundle consumes:

{
  "kind": "bundle",
  "schema_version": "1.5",
  "name": "demo_robot",
  "version": "0.1.0",
  "channel": "dev",
  "min_adk_version": "1.0",
  "dlopen_allowlist": {
    "paths": [
      "/lib/x86_64-linux-gnu/libc.so.6",
      "/lib/x86_64-linux-gnu/libdl.so.2",
      "/lib/x86_64-linux-gnu/libpthread.so.0",
      "/opt/ros/jazzy/lib/libdemo_controller.so",
      "/opt/ros/jazzy/lib/librcl.so",
      "/opt/ros/jazzy/lib/librclcpp.so"
    ]
  }
}

Each path must be absolute and end in .so or .so.<digits>. v1.5 doesn’t support globs (every entry is exact-match); the audit-derivation loop above produces exact paths by construction, so the constraint costs nothing in this workflow.
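Before pasting a derived list into the manifest, it is cheap to check it against that rule. This is a sketch of the rule exactly as stated here (absolute path, ending in .so or .so.<digits>), not the loader's validator; invalid_allowlist_entries is a hypothetical helper.

```shell
# Print every line of the allowlist file that breaks the stated rule
# (absolute path ending in .so or .so.<digits>); succeed when none do.
invalid_allowlist_entries() {
    if grep -vE '^/.*\.so(\.[0-9]+)?$' "$1"; then
        return 1    # grep printed at least one offending entry
    fi
    return 0
}
```

Running invalid_allowlist_entries /tmp/dlopen-allowlist.txt before editing manifest.json surfaces relative paths or non-.so entries while they are still easy to trace back to an audit line.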

Re-bundle (autonomy bundle pack) and switch to the canonical production surface — the audit env var stays off (no derivation needed in prod), the allowlist sources from the manifest:

autonomy ros2 run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --ld-preload /usr/local/lib/libautonomy_preload.so \
    --dlopen-allowlist-from-bundle ./your-bundle.tar \
    launch demo_robot arm_demo.launch.py

Or equivalently with --policy <bundle> (Phase 2b-3, #980), which auto-sources the allowlist from the same v1.5 manifest the policy gate already loaded.

The runbook’s C++ workload section has the full operational reference: reading the audit trail, common patterns by workload shape (rclcpp_components, pluginlib, vendor SDKs), and dlmopen link-map namespace notes.

Cross-references

  • Container Hardening runbook — operator-facing reference for the same layers, with failure modes + recovery procedures.

  • runtime/preload/README.md — shim wrapper contracts + the 3 paths to get the shim into your own image (canonical adk-ros2-runtime, release-asset download, build from source).

  • runtime/seccomp/README.md — starter profile rationale + how to derive a workload-specific custom profile.

  • Issue #960 — the epic that filed this layer’s requirements + acceptance criteria.