Tutorial — Container Hardening Hands-On¶
Objective: Walk through enabling each of the 5 hardening
layers (--seccomp-profile, --cap-drop, --read-only-rootfs,
LD_PRELOAD shim, --dlopen-allowlist-from-bundle) against a
sample workload, observe what changes at each step, and run the
hermetic bypass-resistance test that proves the layers compose.
Companion to: the Container Hardening runbook. This tutorial demonstrates the runbook’s procedures with concrete commands and expected output.
Time: 20–30 minutes hands-on plus the test run (~20 seconds).
Prerequisites:
A workstation with Docker (24+) on PATH.
The autonomy CLI installed (bash scripts/install-ce.sh).
The adk-ros2-runtime image present: docker pull ghcr.io/autonomyops/adk-ros2-runtime:latest
A Go toolchain matching go.work (currently 1.25.11), only needed for Step 6 (the bypass-resistance test).
Step 0 — Baseline (no hardening)¶
Run the demo workload with no hardening flags to confirm the baseline works:
autonomy ros2 run \
--image ghcr.io/autonomyops/adk-ros2-runtime:latest \
--no-seccomp \
launch demo_robot arm_demo.launch.py
(We pass --no-seccomp for this step ONLY to demonstrate the
unhardened baseline. You’ll see a loud stderr warning telling you
the workload spawned without the default-on starter profile — that
warning is itself the safety net we’ll re-enable in Step 1.)
Expected: the workload spawns, the arm-controller publishes on
/cmd_vel, and the launched node runs to completion. Hit Ctrl-C
to stop.
Confirm that the opt-out warning appeared on stderr:
WARNING: --no-seccomp is set — the workload will spawn without
the AutonomyOps starter seccomp profile. Container-escape primitives
(mount, setns, unshare, ptrace, etc.) are NOT denied at the kernel
layer. Use only for development or workloads that legitimately need
these syscalls. Audit event: seccomp-opt-out.
Operator log scrapers grep for Audit event: seccomp-opt-out to
flag opt-outs. The seccomp-opt-out WAL marker is also recorded
(viewable via autonomy wal inspect --kind PASS after the run).
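A log scraper for the opt-out marker can be very small. The sketch below assumes the CLI's stderr has already been captured as text; the function name find_opt_outs is illustrative and not part of the autonomy CLI.

```python
# Marker text copied from the warning shown above.
OPT_OUT_MARKER = "Audit event: seccomp-opt-out"


def find_opt_outs(log_text: str) -> list[int]:
    """Return the 1-based line numbers on which the opt-out marker appears."""
    return [
        number
        for number, line in enumerate(log_text.splitlines(), start=1)
        if OPT_OUT_MARKER in line
    ]
```

An empty result means no opt-out was logged in the captured text; any other result points at the lines to review.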
Step 1 — Turn seccomp back on (default)¶
Drop the --no-seccomp flag — the CLI now applies the embedded
starter profile by default:
autonomy ros2 run \
--image ghcr.io/autonomyops/adk-ros2-runtime:latest \
launch demo_robot arm_demo.launch.py
No new flag, no stderr warning. The workload spawns with the
shipped 21-syscall deny-list applied via docker
--security-opt seccomp=<embedded-starter-path>.
To prove the profile is on, look at the resolved docker spawn:
docker inspect $(docker ps --latest -q) \
--format '{{json .HostConfig.SecurityOpt}}'
# ["seccomp=/tmp/autonomy-seccomp-starter-XXXXXXXX.json"]
The tempfile is materialized fresh per CLI invocation (the
shipped profile is embedded in the binary via //go:embed — no
sidecar JSON), and is removed when the CLI exits.
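If you want to check the inspect output in a script rather than by eye, the following sketch extracts the seccomp entry from the JSON array printed by the docker inspect command above. The function name is hypothetical. It returns whatever follows seccomp=, which may be a path as shown above or, depending on how the client passes the profile, the inlined profile contents.

```python
import json
from typing import Optional


def seccomp_profile(security_opt_json: str) -> Optional[str]:
    """Return the value of the seccomp= entry, or None if there is none."""
    # docker inspect prints "null" when no security options are set.
    options = json.loads(security_opt_json) or []
    for option in options:
        key, _, value = option.partition("=")
        if key == "seccomp":
            return value
    return None
```

A result of None, or the literal value unconfined, means the container is not running under a seccomp profile.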
To prove it actually denies, in another terminal try to mount
inside the workload’s container (after picking up the container
id with docker ps):
docker exec -it <container> sh -c 'mount -t tmpfs none /mnt'
# mount: /mnt: permission denied
mount is in the starter’s deny-list. EPERM.
Step 2 — Drop unneeded Linux capabilities¶
The demo workload doesn’t need raw sockets or ptrace. Drop them:
autonomy ros2 run \
--image ghcr.io/autonomyops/adk-ros2-runtime:latest \
--cap-drop NET_RAW,SYS_PTRACE \
launch demo_robot arm_demo.launch.py
The CLI validates each capability name against a known-good list before rendering the docker spawn — typos fail loudly at the autonomy layer.
To prove the cap is dropped, in another terminal:
docker exec -it <container> sh -c 'grep CapBnd /proc/self/status'
# CapBnd: 00000000a80455ff
# (down from 00000000a80c75ff: bit 13, NET_RAW, and bit 19, SYS_PTRACE, are cleared)
A workload that calls socket(AF_PACKET, SOCK_RAW, ...) now sees
EPERM.
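The CapBnd value is a hexadecimal bitmask, so it can be decoded rather than compared by eye. The sketch below uses the bit numbers from linux/capability.h (CAP_NET_RAW is bit 13, CAP_SYS_PTRACE is bit 19); the helper names are illustrative only.

```python
# Capability bit numbers as defined in <linux/capability.h>.
CAP_BITS = {"NET_RAW": 13, "SYS_PTRACE": 19}


def has_cap(capbnd_hex: str, name: str) -> bool:
    """True if the named capability is present in the bounding set."""
    return bool((int(capbnd_hex, 16) >> CAP_BITS[name]) & 1)


def drop_caps(capbnd_hex: str, names: list[str]) -> str:
    """Return the mask you should expect after dropping the named capabilities."""
    mask = int(capbnd_hex, 16)
    for name in names:
        mask &= ~(1 << CAP_BITS[name])
    return f"{mask:016x}"
```

Feed has_cap the CapBnd value read from /proc/self/status before and after adding --cap-drop to confirm that exactly the intended bits changed.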
Step 3 — Lock the rootfs read-only¶
The demo workload writes only to /tmp. Mount the rootfs RO and
declare /tmp as the writable tmpfs:
autonomy ros2 run \
--image ghcr.io/autonomyops/adk-ros2-runtime:latest \
--cap-drop NET_RAW,SYS_PTRACE \
--read-only-rootfs --tmpfs /tmp \
launch demo_robot arm_demo.launch.py
Verify in another terminal:
# Write outside the tmpfs:
docker exec -it <container> sh -c 'echo x > /etc/foo'
# sh: 1: cannot create /etc/foo: Read-only file system
# Write inside the tmpfs:
docker exec -it <container> sh -c 'echo x > /tmp/foo && cat /tmp/foo'
# x
/tmp is the explicit writable exception. Everything else returns
EROFS.
Picking --tmpfs paths for real workloads. A workload that writes elsewhere (cache dirs, runtime sockets, etc.) needs additional --tmpfs <path> declarations. Audit what your workload writes via strace -f -e openat ... 2>&1 | grep "O_WRONLY\|O_RDWR\|O_CREAT" in a known-good baseline run, then declare each path explicitly.
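To turn a captured strace log into a list of candidate paths, something like the following sketch can help. It assumes strace's default output format for openat lines and treats O_WRONLY, O_RDWR, and O_CREAT as write intent; the function name is hypothetical, and the result still needs a human review before any path becomes a --tmpfs declaration.

```python
import re

# Matches e.g.: [pid 42] openat(AT_FDCWD, "/tmp/foo", O_WRONLY|O_CREAT, 0666) = 4
OPENAT = re.compile(r'openat\([^,]+, "([^"]+)", ([A-Z_|]+)')
WRITE_FLAGS = {"O_WRONLY", "O_RDWR", "O_CREAT"}


def written_paths(strace_text: str) -> list[str]:
    """Return the sorted, de-duplicated paths opened with write intent."""
    paths = set()
    for line in strace_text.splitlines():
        match = OPENAT.search(line)
        if match and WRITE_FLAGS & set(match.group(2).split("|")):
            paths.add(match.group(1))
    return sorted(paths)
```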
Step 4 — Bake in the LD_PRELOAD shim’s exec allowlist¶
The shim ships baked into adk-ros2-runtime:latest at
/usr/local/lib/libautonomy_preload.so (custom images would need
to bake it themselves — see
runtime/preload/README.md).
Turn it on and declare the allowlist:
autonomy ros2 run \
--image ghcr.io/autonomyops/adk-ros2-runtime:latest \
--cap-drop NET_RAW,SYS_PTRACE \
--read-only-rootfs --tmpfs /tmp \
--ld-preload /usr/local/lib/libautonomy_preload.so \
--preload-exec-allowlist /usr/bin/ros2,/usr/bin/python3 \
launch demo_robot arm_demo.launch.py
In another terminal, try to spawn a denied binary inside the workload:
docker exec -it <container> sh
# (sh itself is the denial test — sh isn't in the allowlist)
Expected: the docker exec fails — the shim’s execve() wrapper
sees /bin/sh, doesn’t find it in the allowlist, denies with
EPERM. On the workload’s stderr you’ll see:
autonomy-preload: execve(/bin/sh) denied by AUTONOMY_PRELOAD_EXEC_ALLOWLIST
The execve wrapper also covers execveat, the sibling syscall that would otherwise bypass it; see the runbook’s Phase 4 for the bypass shape it closes.
Step 5 — Add the dlopen allowlist + connect allowlist¶
dlopen and connect are the same shape — env-driven allowlists
the shim enforces. For dlopen, source from the workload bundle’s
manifest:
autonomy ros2 run \
--image ghcr.io/autonomyops/adk-ros2-runtime:latest \
--cap-drop NET_RAW,SYS_PTRACE \
--read-only-rootfs --tmpfs /tmp \
--ld-preload /usr/local/lib/libautonomy_preload.so \
--preload-exec-allowlist /usr/bin/ros2,/usr/bin/python3 \
--dlopen-allowlist-from-bundle ./demo/bundles/ros2.tar \
--preload-connect-allowlist 10.42.0.5:443 \
launch demo_robot arm_demo.launch.py
Stderr lines you can grep for:
autonomy-preload: dlopen(/usr/lib/libnsl.so.1) denied by AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST
autonomy-preload: connect(8.8.8.8:53) denied by AUTONOMY_PRELOAD_CONNECT_ALLOWLIST
Loopback (127.0.0.0/8) is always allowed for connect: the /v1/tool policy callback lives there and would be bricked otherwise. The runbook documents this as a contract.
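The connect rule described here (loopback always passes, everything else must be listed) can be modeled in a few lines. This is only a model of the documented behavior, not the shim's code: it assumes IPv4 addresses, exact host:port matching, and a comma-separated list, none of which the tutorial specifies for the connect allowlist.

```python
import ipaddress

LOOPBACK = ipaddress.ip_network("127.0.0.0/8")


def connect_allowed(host: str, port: int, allowlist: str) -> bool:
    """Model of the documented rule; the separator and matching are assumptions."""
    # Loopback is always allowed so the policy callback keeps working.
    if ipaddress.ip_address(host) in LOOPBACK:
        return True
    entries = {entry.strip() for entry in allowlist.split(",") if entry.strip()}
    return f"{host}:{port}" in entries
```

With the allowlist from the command above, 10.42.0.5:443 and any 127.x.x.x destination pass, while 8.8.8.8:53 is rejected, matching the stderr line shown.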
Step 6 — Run the bypass-resistance composition test¶
The hermetic integration test
runtime/exec/composition_integration_test.go
proves the 5 layers compose without conflict against a
deliberately-malicious workload that attempts every shipped
bypass.
From the repo root:
go test ./runtime/exec/ -run TestComposition_BypassResistance \
-v -timeout 300s
Expected output (~20 seconds end-to-end on a warm-cache host):
=== RUN TestComposition_BypassResistance_AllLayersEnforce
--- PASS: TestComposition_BypassResistance_AllLayersEnforce (20.09s)
PASS
ok github.com/autonomyops/adk/runtime/exec 20.091s
If you add -test.v you’ll see the workload’s stderr captured:
WORKLOAD_START
autonomy-preload: execve(/bin/sh) denied by AUTONOMY_PRELOAD_EXEC_ALLOWLIST
VECTOR execve_unlisted: DENIED EPERM
autonomy-preload: dlopen(/etc/autonomy-shim-test/not-real.so) denied by AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST
VECTOR dlopen_unlisted: DENIED EPERM
autonomy-preload: connect(8.8.8.8:53) denied by AUTONOMY_PRELOAD_CONNECT_ALLOWLIST
VECTOR connect_unlisted: DENIED EPERM
VECTOR mount_seccomp: DENIED EPERM
VECTOR raw_socket_cap: DENIED EPERM
VECTOR root_write_rofs: DENIED EROFS
POSITIVE tmpfs_write: OK
POSITIVE loopback_passthrough: OK
WORKLOAD_END
Six attack vectors denied with the exact expected reason
(EPERM from shim/seccomp/cap-drop, EROFS from rootfs-RO); two
positive controls succeeded (proves the workload isn’t bricked);
both liveness markers (WORKLOAD_START, WORKLOAD_END) present
(proves the container actually ran).
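If you want to check captured workload stderr outside the Go test, the line shapes shown above are easy to parse. The sketch below is an illustrative helper, not part of the test suite; it relies only on the VECTOR, POSITIVE, and WORKLOAD_* line formats from the sample output.

```python
import re

VECTOR = re.compile(r"^VECTOR (\w+): DENIED (\w+)$")
POSITIVE = re.compile(r"^POSITIVE (\w+): OK$")


def summarize(stderr_text: str):
    """Return (denied vectors with errno, positive controls, liveness flag)."""
    lines = [line.strip() for line in stderr_text.splitlines()]
    denied = {}
    positives = []
    for line in lines:
        vector = VECTOR.match(line)
        if vector:
            denied[vector.group(1)] = vector.group(2)
        positive = POSITIVE.match(line)
        if positive:
            positives.append(positive.group(1))
    # Both markers must be present to show the container actually ran.
    live = "WORKLOAD_START" in lines and "WORKLOAD_END" in lines
    return denied, positives, live
```

For the output above, a healthy run yields six denied vectors, two positive controls, and a true liveness flag.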
What you proved¶
| Layer | Vector | Denial mechanism | Stderr signal |
|---|---|---|---|
| Seccomp | | kernel | |
| | | kernel CAP check | |
| | write to | kernel rootfs RO | |
| Shim: execve | | shim wrapper | |
| Shim: dlopen | | shim wrapper | |
| Shim: connect | | shim wrapper | |
Each layer closes a class of bypass that the others can’t statically discriminate. That is the defense-in-depth claim, and the composition test is the executable evidence for it.
Step 7 — Derive a C++ dlopen allowlist from audit traffic¶
C++ workloads, especially ROS 2 rclcpp_components containers
and pluginlib-backed nodes, can pull in dozens of .so files
during composition. Writing the manifest’s
dlopen_allowlist.paths by hand is a guess-then-debug loop.
The post-#960 hardening 4/5 (#985) added an
opt-in audit trail that surfaces every dlopen() /
dlmopen() call with its actual libc outcome. Use it to
derive the allowlist from observed loads.
Capture a baseline¶
autonomy ros2 run doesn’t expose a generic env-passthrough
flag, so the AUTONOMY_PRELOAD_DLOPEN_AUDIT env var (opt-in
observability, no dedicated CLI flag) is set via docker run
directly against the canonical adk-ros2-runtime image. The
allowlist stays unset → the shim’s default-allow pass-through
fires, but the audit trail logs every call:
docker run --rm \
-e LD_PRELOAD=/usr/local/lib/libautonomy_preload.so \
-e AUTONOMY_PRELOAD_DLOPEN_AUDIT=1 \
ghcr.io/autonomyops/adk-ros2-runtime:latest \
ros2 launch demo_robot arm_demo.launch.py \
2>&1 | tee /tmp/dlopen-audit.log
After the workload runs through its operational phases (you can
exit cleanly with Ctrl-C once the arm has done a cycle), the
log contains one line per filesystem load:
autonomy-preload-audit: dlopen(filename="/lib/x86_64-linux-gnu/libc.so.6", flags=0x1, result=loaded)
autonomy-preload-audit: dlopen(filename="/opt/ros/jazzy/lib/librcl.so", flags=0x101, result=loaded)
autonomy-preload-audit: dlopen(filename="/opt/ros/jazzy/lib/librclcpp.so", flags=0x101, result=loaded)
autonomy-preload-audit: dlopen(filename="/opt/ros/jazzy/lib/librcutils.so", flags=0x1, result=loaded)
autonomy-preload-audit: dlmopen(lmid=-1, filename="/opt/ros/jazzy/lib/libdemo_controller.so", flags=0x1, result=loaded)
...
Build the allowlist¶
result=loaded is the only state worth including — load_failed
entries didn’t end up in memory, and denied only appears in
enforcement runs (not baseline). The
runbook’s C++ workload section
has the same one-liner with more context; the short form:
{
grep '^autonomy-preload-audit: dlopen' /tmp/dlopen-audit.log
grep '^autonomy-preload-audit: dlmopen' /tmp/dlopen-audit.log
} | grep 'result=loaded' \
| sed -E 's/.*filename="([^"]+)".*/\1/' \
| sort -u > /tmp/dlopen-allowlist.txt
wc -l /tmp/dlopen-allowlist.txt
# 47 /tmp/dlopen-allowlist.txt ← typical rclcpp_components baseline
Inspect a sample:
head -10 /tmp/dlopen-allowlist.txt
# /lib/x86_64-linux-gnu/libc.so.6
# /lib/x86_64-linux-gnu/libdl.so.2
# /lib/x86_64-linux-gnu/libpthread.so.0
# /opt/ros/jazzy/lib/libdemo_controller.so
# /opt/ros/jazzy/lib/librcl.so
# /opt/ros/jazzy/lib/librclcpp.so
# ...
Enforce + verify (dev iteration)¶
For the dev derivation loop, inject the allowlist via docker run
with the audit trail still on so you can confirm every previously-
loaded lib is still result=loaded:
docker run --rm \
-e LD_PRELOAD=/usr/local/lib/libautonomy_preload.so \
-e AUTONOMY_PRELOAD_DLOPEN_AUDIT=1 \
-e AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST="$(paste -sd: /tmp/dlopen-allowlist.txt)" \
ghcr.io/autonomyops/adk-ros2-runtime:latest \
ros2 launch demo_robot arm_demo.launch.py \
2>&1 | tee /tmp/dlopen-enforce.log
Verify:
# Every load that succeeded in the baseline still says result=loaded:
grep 'result=loaded' /tmp/dlopen-enforce.log | wc -l
# 47 ← same as baseline (zero regressions)
# Zero shim denials — the allowlist covers the workload's
# stable plugin set:
grep 'denied by AUTONOMY_PRELOAD_DLOPEN_ALLOWLIST' /tmp/dlopen-enforce.log
# (no output)
If denials appear, the workload added a load between the
baseline and enforcement runs (late-bound plugin, new code path).
The audit line for each denial says result=denied, so you have
the path verbatim. Append it to the allowlist and re-run.
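Comparing the two logs by count alone can hide a swap (one library lost, another gained). A sketch of a path-level comparison follows; it assumes that denied audit lines have the same shape as the loaded lines shown earlier, differing only in the result field, and the helper names are illustrative.

```python
import re

# Matches the audit lines for both dlopen and dlmopen.
AUDIT = re.compile(
    r'^autonomy-preload-audit: dlm?open\(.*filename="([^"]+)".*result=(\w+)\)$'
)


def paths_with_result(log_text: str, result: str) -> set[str]:
    """Collect the filenames whose audit line carries the given result."""
    found = set()
    for line in log_text.splitlines():
        match = AUDIT.match(line.strip())
        if match and match.group(2) == result:
            found.add(match.group(1))
    return found


def compare_runs(baseline_log: str, enforce_log: str):
    """Return (paths lost since baseline, paths denied under enforcement)."""
    baseline = paths_with_result(baseline_log, "loaded")
    still_loaded = paths_with_result(enforce_log, "loaded")
    denied = paths_with_result(enforce_log, "denied")
    return sorted(baseline - still_loaded), sorted(denied)
```

Two empty lists mean the enforcement run loaded everything the baseline did and denied nothing; anything in the second list is a candidate to append to the allowlist.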
Promote to the bundle manifest (production)¶
Once the derived list is stable across a few runs, move it into
the bundle’s manifest.json under the dlopen_allowlist.paths
key (schema v1.5 introduces this block — see bundle/manifest.go
for the schema struct). The bundle layout the loader expects is
<bundle-dir>/manifest.json + <bundle-dir>/policies/…;
autonomy bundle pack then tars the directory into the .tar
that --dlopen-allowlist-from-bundle consumes:
{
"kind": "bundle",
"schema_version": "1.5",
"name": "demo_robot",
"version": "0.1.0",
"channel": "dev",
"min_adk_version": "1.0",
"dlopen_allowlist": {
"paths": [
"/lib/x86_64-linux-gnu/libc.so.6",
"/lib/x86_64-linux-gnu/libdl.so.2",
"/lib/x86_64-linux-gnu/libpthread.so.0",
"/opt/ros/jazzy/lib/libdemo_controller.so",
"/opt/ros/jazzy/lib/librcl.so",
"/opt/ros/jazzy/lib/librclcpp.so"
]
}
}
Each path must be absolute and end in .so or .so.<digits>;
v1.5 doesn’t support globs (every entry is exact-match) — the
audit-derivation loop above produces exact paths by construction
so this is a no-op constraint for the workflow.
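Before pasting the derived list into manifest.json, it can be worth checking it against the rule just stated. The sketch below reads that rule literally (absolute path, ending in .so or .so.<digits>, no glob characters) and builds only the dlopen_allowlist block; it is not the loader's validator, and the function names are hypothetical.

```python
import re

# Literal reading of the rule: the path ends in ".so" or ".so.<digits>".
SO_SUFFIX = re.compile(r"\.so(\.\d+)?$")
GLOB_CHARS = set("*?[")


def invalid_paths(paths: list[str]) -> list[str]:
    """Return the entries that break the rule as stated in this tutorial."""
    return [
        path
        for path in paths
        if not path.startswith("/")
        or not SO_SUFFIX.search(path)
        or GLOB_CHARS & set(path)
    ]


def dlopen_block(paths: list[str]) -> dict:
    """Build the dlopen_allowlist block, refusing invalid entries."""
    bad = invalid_paths(paths)
    if bad:
        raise ValueError(f"invalid allowlist entries: {bad}")
    return {"dlopen_allowlist": {"paths": sorted(set(paths))}}
```

Passing the result through json.dumps(..., indent=2) gives a fragment you can merge into the manifest by hand.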
Re-bundle (autonomy bundle pack) and switch to the canonical
production surface — the audit env var stays off (no derivation
needed in prod), the allowlist sources from the manifest:
autonomy ros2 run \
--image ghcr.io/autonomyops/adk-ros2-runtime:latest \
--ld-preload /usr/local/lib/libautonomy_preload.so \
--dlopen-allowlist-from-bundle ./your-bundle.tar \
launch demo_robot arm_demo.launch.py
Or, equivalently, use --policy <bundle> (Phase 2b-3, #980), which
auto-sources the allowlist from the same v1.5 manifest the policy
gate already loaded.
The runbook’s C++ workload section has the full operational reference: reading the audit trail, common patterns by workload shape (rclcpp_components, pluginlib, vendor SDKs), and dlmopen link-map namespace notes.
Cross-references¶
Container Hardening runbook: operator-facing reference for the same layers, with failure modes and recovery procedures.
runtime/preload/README.md: shim wrapper contracts and the 3 paths to get the shim into your own image (canonical adk-ros2-runtime, release-asset download, build from source).
runtime/seccomp/README.md: starter profile rationale and how to derive a workload-specific custom profile.
Issue #960: the epic that filed this layer’s requirements and acceptance criteria.