ROS 2 Governed Bridge¶
Audience: operators turning on, observing, or recovering the long-lived
governed_ros2_bridge process — the runtime-owned C++ rclcpp bridge that
subscribes on a separate “agent” DDS domain, POSTs every message to the
AutonomyOps /v1/tool runtime for policy evaluation, and republishes
allowed messages on the “real” DDS domain. This is the per-message
counterpart to launch-level governance (the
ROS 2 Governance reference covers the
launch path).
The bridge is opt-in via --governed-bridge on either
autonomy ros2 run (paid) or autonomy run ros2.launch (CE); default
off preserves prior AutoRuntime behavior. This page tells you what to do
when you turn it on, what to look for, and how to get out of trouble.
Walking through the demo first? Start at ROS 2 Governed Bridge Quickstart; it runs
autonomy demo ros2-bridge end-to-end with allow + deny evidence. The runbook below assumes you already have a workload to govern.
Prerequisites¶
- docker on PATH. The bridge runs in a container even when the workload runs natively: runtime/ros2bridge.BridgeProcess enforces NetworkMode=host and IPCMode=host, both of which are dispatched by runtime/exec via Docker.
- ghcr.io/autonomyops/adk-ros2-runtime:<version> present locally. Pull with:
    docker pull ghcr.io/autonomyops/adk-ros2-runtime:latest
  or build from source via:
    docker build -t ghcr.io/autonomyops/adk-ros2-runtime:local -f demo/ros2-runtime/Dockerfile .
- A policy bundle that allows the topics the bridge will republish. The embedded policy embedded:ros2-bridge-demo allows /cmd_vel (and /cmd_vel/*) and denies /disable_safety. Production fleets stage a custom bundle via autonomy bundle pull <ref> and pass --policy <ref> to the launch command.
Procedure¶
1. Pick two ROS_DOMAIN_IDs¶
The bridge subscribes on one domain and republishes on another. They
must differ, or the bridge collapses into a loopback that defeats
governance entirely (BridgeProcess.Run returns ErrSameDomain
immediately).
Conventions used across the demo + docs:
| Role | Default | Meaning |
|---|---|---|
| Agent domain (--agent-domain) | | Where the launched workload publishes: the “untrusted” side. Any DDS participant on this domain is intercepted. |
| Real domain (--real-domain) | | Where allowed messages get republished: the “real robot” side. Production subscribers (motor controller, perception stack) live here. |
If your fleet already uses a particular ROS_DOMAIN_ID for production
traffic, pin it to --real-domain and pick any unused integer in
0..101 for --agent-domain. The runbook below assumes 99 / 42.
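The two constraints above (distinct IDs, each within 0..101) can be checked before launching. The following is a minimal sketch, assuming a hypothetical helper named check_domains that is not part of the autonomy CLI. Applying the 0..101 range to both IDs is an assumption; the text states it only for --agent-domain. The bridge still enforces the distinct-ID rule itself via ErrSameDomain.

```shell
# check_domains AGENT REAL
# Hypothetical pre-flight helper (not an autonomy subcommand).
# Returns 0 only if both IDs are integers in 0..101 and differ.
# The function body is a subshell, so `exit` only ends the function.
check_domains() (
  agent="$1"
  real="$2"
  for id in "$agent" "$real"; do
    case "$id" in
      '' | *[!0-9]*)
        echo "check_domains: not a non-negative integer: '$id'" >&2
        exit 1
        ;;
    esac
    if [ "$id" -gt 101 ]; then
      echo "check_domains: $id is outside 0..101" >&2
      exit 1
    fi
  done
  if [ "$agent" -eq "$real" ]; then
    echo "check_domains: agent and real domains must differ" >&2
    exit 1
  fi
)
```

One way to use it is as a guard in a wrapper script, for example running check_domains 99 42 and launching only if it succeeds.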
2. Enable the bridge on the launch¶
You must pass --bridge-topics to tell the bridge which workload
topics to intercept. Without it the bridge falls back to a compiled-in
default (/agent_chat typed std_msgs/msg/String), so the workload’s
publishes on /cmd_vel, sensor topics, and so on are not intercepted.
The runner prints a WARN to stderr for this combination, but the
launch still proceeds. The bridge’s fail-closed posture means that no
agent publish reaches the real domain ungoverned in that state; those
publishes simply never reach the real domain at all.
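Because a missing or malformed topic list fails quietly, it can help to sanity-check the spec string before passing it to --bridge-topics. The sketch below assumes a hypothetical helper named parse_bridge_topics (not part of the autonomy CLI) and assumes every entry has the shape /topic:pkg/msg/Name, as in the examples on this page. The bridge's own parser may accept or reject other forms.

```shell
# parse_bridge_topics SPEC
# Hypothetical pre-flight helper (not an autonomy subcommand).
# Prints one "topic<TAB>type" line per entry and returns non-zero on
# the first malformed entry. The function body is a subshell, so the
# IFS and `set -f` changes do not leak into the caller.
parse_bridge_topics() (
  spec="$1"
  if [ -z "$spec" ]; then
    echo "parse_bridge_topics: empty spec" >&2
    exit 1
  fi
  set -f      # no glob expansion while splitting
  IFS=','     # split on commas only
  for entry in $spec; do
    case "$entry" in
      /*:*/msg/*) ;;   # assumed shape: /topic:pkg/msg/Name
      *)
        echo "parse_bridge_topics: malformed entry: '$entry'" >&2
        exit 1
        ;;
    esac
    topic="${entry%%:*}"   # text before the first ':'
    type="${entry#*:}"     # text after the first ':'
    printf '%s\t%s\n' "$topic" "$type"
  done
)
```

Running it on the spec used in this runbook prints two lines, one for /cmd_vel and one for /disable_safety, which makes it easy to confirm that every topic you expect to govern is actually listed.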
CE (no orchestrator required):
autonomy run \
--image ghcr.io/autonomyops/adk-ros2-runtime:latest \
--governed-bridge \
--agent-domain 99 \
--real-domain 42 \
--bridge-topics '/cmd_vel:geometry_msgs/msg/Twist,/disable_safety:std_msgs/msg/Bool' \
ros2.launch launch demo_robot arm_demo.launch.py
Paid tier (same flags, paid-tier surface):
autonomy ros2 run \
--image ghcr.io/autonomyops/adk-ros2-runtime:latest \
--governed-bridge \
--agent-domain 99 \
--real-domain 42 \
--bridge-topics '/cmd_vel:geometry_msgs/msg/Twist,/disable_safety:std_msgs/msg/Bool' \
launch demo_robot arm_demo.launch.py
What happens, in order:
1. The runtime binds the in-process /v1/tool server to a random 127.0.0.1:<port>. The URL is injected into both the bridge container and the launched workload container as AUTONOMY_RUNTIME_URL.
2. The bridge container is spawned (--network host --ipc host, subscribing on ROS_DOMAIN_ID=99).
3. The launch waits for the bridge to print governed_ros2_bridge: ready agent_domain=99 real_domain=42 on stdout. The readiness wait is bounded by --bridge-ready-timeout (default 30s); see Step 5 if it times out.
4. The launched workload starts with ROS_DOMAIN_ID=99 and --ipc=host injected, so its publishes land on the bridge’s subscription domain and share /dev/shm with the bridge container for FastDDS SHM transport.
5. Every message the workload publishes on a bridged topic flows:
   workload → bridge.subscribe → POST /v1/tool → policy → (allow) → bridge.republish → real domain.
3. Confirm the loop is closed¶
In a second terminal, inspect decision frames the bridge has emitted
so far (re-run after each publish — autonomy wal inspect reads the
WAL file end-to-end on each invocation; there is no streaming wal
subcommand):
# Preferred: first-class --bridge-only filter (#939 4-E.a).
autonomy wal inspect --kind autonomy.decision --bridge-only --json
# Equivalent jq form (still works; use this if you need richer projection).
autonomy wal inspect --kind autonomy.decision --json \
| jq 'select(.event.attrs.bridge_origin == "governed_ros2_bridge")'
You should see one frame per bridged publish, each carrying:
- tool=tool.ros2.topic.publish
- outcome=allow (or deny if the policy rejected it)
- bridge_origin=governed_ros2_bridge (#939 4-E.a marker; absent on direct node POSTs, present on bridge-routed POSTs; pinned by bridgeOriginFromRequest in runtime/server.go)
- policy_ref matching the bundle’s manifest.policy_ref

If the frames for publishes you expected to be bridge-routed lack the
marker, the bridge is not in fact mediating those publishes: the
workload is publishing directly on the real domain. Re-check that
--agent-domain and --real-domain differ and that the workload’s
ROS_DOMAIN_ID env was actually overridden (see Step 5).
4. Subscribe-side sanity check¶
On the real domain, confirm allowed messages are arriving:
ROS_DOMAIN_ID=42 ros2 topic echo /cmd_vel
Expected: one echoed message per allow decision in the WAL. If the WAL shows decisions but nothing is echoed, the bridge is denying every message; check Step 6 and the policy.
Common operator situations¶
5. Troubleshooting the readiness gap¶
Symptom: the launch fails with
ros2: governed bridge did not signal ready within 30s
or
ros2: governed bridge exited before signaling ready: <wrapped error>
The first case is a soft timeout (the bridge is alive but slow); the second case is a hard exit (#940 fix — pre-fix this used to fall through to the soft timeout and silently corrupt the run). Both abort the launch — the runtime will not start the workload without a ready bridge.
Triage:
- Was the image pull cold? Cold pulls of adk-ros2-runtime regularly exceed 30s on slow networks. Pre-pull:
    docker pull ghcr.io/autonomyops/adk-ros2-runtime:latest
  then re-run. For pinned-bandwidth environments, raise the timeout on the launch: --bridge-ready-timeout 2m.
- Is the binary actually in the image?
    docker run --rm --entrypoint /bin/bash \
      ghcr.io/autonomyops/adk-ros2-runtime:latest \
      -c 'which governed_ros2_bridge && governed_ros2_bridge --version'
  If the binary is missing, your local image was built before the governed_ros2_bridge colcon target was added to demo/ros2-runtime/Dockerfile. Rebuild:
    docker build -t ghcr.io/autonomyops/adk-ros2-runtime:local -f demo/ros2-runtime/Dockerfile .
    autonomy run --image ghcr.io/autonomyops/adk-ros2-runtime:local --governed-bridge ...
- Is the bridge container exiting on a config error? Check the wrapped error in the abort message; ErrSameDomain and ErrRuntimeURLRequired both surface here. ErrSameDomain means you passed --agent-domain == --real-domain; pick different integers.
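The difference between the two failure modes in this step (soft timeout versus hard exit) can be reproduced by hand when diagnosing a bridge outside the runner. The sketch below is an illustration, not the runtime's implementation: wait_ready is a hypothetical helper, it assumes GNU coreutils timeout is installed, and it treats end-of-output as a proxy for "the bridge exited".

```shell
# wait_ready TIMEOUT_SECONDS
# Hypothetical helper: reads a bridge's stdout on stdin.
# Returns 0 when the ready line appears, 1 if the stream ends first
# (used here as a proxy for "bridge exited"), 2 on timeout.
# Assumes GNU coreutils `timeout` is available.
wait_ready() {
  rc=0
  # grep -q exits as soon as the first matching line is seen.
  timeout "$1" grep -q '^governed_ros2_bridge: ready ' || rc=$?
  case "$rc" in
    0) return 0 ;;
    124)
      # 124 is the exit status `timeout` uses when the limit is hit.
      echo "bridge did not signal ready within ${1}s" >&2
      return 2
      ;;
    *)
      echo "bridge output ended before the ready line" >&2
      return 1
      ;;
  esac
}
```

For example, piping docker logs -f <container-id> into wait_ready 30 distinguishes a bridge that is alive but slow (status 2) from one whose output stopped without a ready line (status 1).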
6. Recovering from a stuck bridge container¶
Symptom: the launch process has been killed (Ctrl-C, SIGKILL, host
reboot) but docker ps shows the bridge container still running. Or:
a fresh launch fails with a “port already in use” / “FastDDS already
bound” stderr line.
The bridge is spawned as docker run --rm, so a clean shutdown
removes the container. A killed launch process may leak it if Docker
didn’t get the SIGTERM cascade in time.
1. List bridge containers:
     docker ps --filter ancestor=ghcr.io/autonomyops/adk-ros2-runtime:latest --format 'table {{.ID}}\t{{.Status}}\t{{.Command}}'
2. Stop with grace (lets the bridge flush its last decisions):
     docker stop <container-id>
3. If stop hangs >10s, force-remove:
     docker rm -f <container-id>
4. Re-launch. The runtime starts a fresh bridge on a fresh 127.0.0.1:<random> port; no shared state with the prior process.
If your shell history shows the bridge launched with --keep, the
WAL directory under /tmp/autonomyops-demo-wal-* will still be on
disk — that’s intended (see Step 7
for how to read it).
7. Inspecting the WAL after the fact¶
Every bridge-mediated publish writes one autonomy.decision frame to
the WAL with the bridge_origin=governed_ros2_bridge marker. To pull
the per-run audit trail:
# Preferred: first-class flag, no jq needed (#939 4-E.a).
autonomy wal inspect --kind autonomy.decision --bridge-only --json
# Equivalent older form (still supported).
autonomy wal inspect --kind autonomy.decision --json \
| jq 'select(.event.attrs.bridge_origin == "governed_ros2_bridge")'
To distinguish bridge-routed decisions from direct-node POSTs (e.g. a
node inside the launched container that calls /v1/tool itself
without going through the bridge):
# Bridge-routed (first-class):
autonomy wal inspect --kind autonomy.decision --bridge-only --json \
| jq '{tool, outcome, attrs: .event.attrs}'
# Direct node-POSTs (no marker — invert via jq; the negative case is
# rarer than the positive case, so it stays in jq).
autonomy wal inspect --kind autonomy.decision --json \
| jq 'select(.event.attrs.tool == "tool.ros2.topic.publish" and .event.attrs.bridge_origin == null) | {tool, outcome, attrs: .event.attrs}'
The runtime sets bridge_origin only when the inbound POST’s
params._bridge_origin field is the canonical sentinel
governed_ros2_bridge AND the request kind is on the closed
BridgeRoutableKinds set (#941 fix — pre-fix the marker could be
spoofed by a node calling tool.echo with the marker in params).
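The two-condition rule can be modeled in a few lines. This is an illustrative sketch only, not the runtime's Go code: bridge_origin_for is a hypothetical name, and the closed set is assumed to contain just tool.ros2.topic.publish, since that is the only bridge-routed tool named on this page.

```shell
# bridge_origin_for KIND CLAIMED_ORIGIN
# Illustrative model of the rule, not the runtime's implementation.
# Prints the marker only when both conditions hold; prints nothing
# otherwise (so the decision frame would carry no bridge_origin).
bridge_origin_for() {
  kind="$1"
  claimed="$2"
  # Condition 1: the claimed origin is the canonical sentinel.
  [ "$claimed" = "governed_ros2_bridge" ] || return 0
  # Condition 2: the kind is in the closed set. Only
  # tool.ros2.topic.publish is listed here; the actual contents of
  # BridgeRoutableKinds are an assumption.
  case "$kind" in
    tool.ros2.topic.publish) printf '%s\n' "$claimed" ;;
  esac
}
```

Under this model, a tool.echo request carrying the sentinel in its params yields no marker, which is the spoofing case the #941 fix closes.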
8. Disabling the bridge cleanly¶
Just drop --governed-bridge from the launch. Without that flag the
runtime falls back to:
- ExecBridge (the runtime is the publisher of every node-level POST), matching the prior AutoRuntime behavior unchanged.
- No bridge container is spawned, no ROS_DOMAIN_ID injection, no IPCMode=host on the workload.
Bridge containers that were already running won’t be terminated — see Step 6.
Multi-topic + generic-type interception (#939 4-A)¶
The bridge accepts arbitrary DDS message types on any number of topics
in a single process via rclcpp::GenericSubscription +
rclcpp::GenericPublisher. Operator configuration:
- --bridge-topics (CLI, on both autonomy ros2 run and autonomy run): comma-separated topic:type pairs OR repeated flags. The runner forwards these to RunOptions.BridgeTopics, which runtime/ros2.defaultStartGovernedBridge sets on BridgeProcess.Topics, which becomes the GOVERNED_BRIDGE_TOPICS env on the bridge container/native binary.
- GOVERNED_BRIDGE_TOPICS env (direct, when invoking the bridge binary outside the runner, e.g. via docker run): same comma-separated topic:type pairs. Each entry creates one subscription on the agent domain + one publisher on the real domain, typed by the operator-supplied type.
- GOVERNED_BRIDGE_TOPIC (singular, back-compat): one topic; the C++ side hard-defaults its type to std_msgs/msg/String (the pre-4-A behavior). Preferred for single-topic legacy wiring; new callers should use GOVERNED_BRIDGE_TOPICS with an explicit type.
- Neither set: falls back to /agent_chat + std_msgs/msg/String, the compiled-in default.
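The fallback chain above can be written out as a small resolver, which is handy when checking what a manually started bridge will subscribe to. This is a sketch under one assumption the text does not state: that GOVERNED_BRIDGE_TOPICS takes precedence when both variables are set. The function name effective_bridge_topics is hypothetical.

```shell
# effective_bridge_topics
# Hypothetical helper: prints the topic:type spec the bridge would
# use, following the fallback order described above.
# Assumption: GOVERNED_BRIDGE_TOPICS wins if both variables are set.
effective_bridge_topics() {
  if [ -n "${GOVERNED_BRIDGE_TOPICS:-}" ]; then
    printf '%s\n' "$GOVERNED_BRIDGE_TOPICS"
  elif [ -n "${GOVERNED_BRIDGE_TOPIC:-}" ]; then
    # The singular form always implies std_msgs/msg/String.
    printf '%s:std_msgs/msg/String\n' "$GOVERNED_BRIDGE_TOPIC"
  else
    # Compiled-in default when neither variable is set.
    printf '/agent_chat:std_msgs/msg/String\n'
  fi
}
```

With neither variable set, it prints /agent_chat:std_msgs/msg/String, which is the state in which workload topics such as /cmd_vel are not intercepted.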
Wire format addition. Every bridge-routed POST now carries
params.payload_b64 = base64 of the message’s serialized CDR bytes,
alongside params.type and the existing params.topic. The
params.data field is still emitted for std_msgs/msg/String only
(back-compat with the canonical wire-shape contract test); other
types ship the bytes via payload_b64 alone. The runtime currently
keys policy on topic + kind; field-level typed-policy via
rosidl_typesupport_introspection_cpp decoding of payload_b64
lands in a follow-up.
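To make the payload_b64 encoding concrete, the sketch below builds a params object from raw bytes on stdin. It is illustrative only: bridge_params is a hypothetical helper, the surrounding request envelope and the field order are assumptions, and the sample bytes used with it need not be valid CDR.

```shell
# bridge_params TOPIC TYPE
# Hypothetical helper: reads raw message bytes on stdin and prints a
# JSON object with the three params fields named above. TOPIC and
# TYPE are not JSON-escaped, so keep them to plain ROS names.
bridge_params() {
  topic="$1"
  type="$2"
  # base64 may wrap long output across lines; strip the newlines.
  payload_b64="$(base64 | tr -d '\n')"
  printf '{"topic":"%s","type":"%s","payload_b64":"%s"}\n' \
    "$topic" "$type" "$payload_b64"
}
```

Piping a few bytes through it, for example printf 'hello' | bridge_params /cmd_vel geometry_msgs/msg/Twist, yields a payload_b64 of aGVsbG8=, which decodes back to the original bytes with base64 -d.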
Native + container dual-path is validated. The same C++ source
compiles under both apt install ros-humble-ros-base natively and
the adk-ros2-runtime docker image build. The bridge can run either
way (Go-side: BridgeProcess.Image empty → native, set → container).
End-to-end subscribe → POST → republish is smoke-validated on both
paths against 3 different types.
Production hardening¶
- Use the bridge for the topics you intend to govern, not for all of them. Direct-node POSTs to /v1/tool (no bridge) are also governed and can carry typed envelopes; split your fleet’s topics between the two paths according to typed-policy needs.
- Stage the bridge behind a non-default policy bundle pinned via --policy <ref>. The embedded policy embedded:ros2-bridge-demo is for demos.
- Layer SROS 2 / DDS-Security as defense-in-depth via --bridge-keystore + --bridge-enclave + --workload-enclave (the three flags are all-or-nothing, enforced before any side effect). Provision the keystore with autonomy ros2 keystore init/mint/permissions. The end-to-end procedure and bypass-resistance verification are in the SROS 2 runbook and SROS 2 quickstart.
- Monitor the bridge container’s stderr for the rate-limited “POST failed” lines (#942 4-E.c: one line per topic per second, not per message). A sustained burst means the runtime listener died or the bridge can’t reach 127.0.0.1; correlate with autonomy wal status.
- DDS interop note. Container-to-container DDS via --network host --ipc host works reliably (it’s what autonomy demo ros2-bridge uses); host-process-to-container DDS can fail to deliver under FastDDS even with shared /dev/shm because of how FastDDS announces locator addresses. Keep publishers and the bridge on the same side of the container/native boundary in production.
Reference¶
- runtime/ros2/runner.go: RunGoverned, RunOptions.GovernedBridge, and the ErrGovernedBridgeNeedsAutoRuntime / ErrSameDomain / ErrBridgeExitedBeforeReady / ErrRuntimeURLRequired sentinels.
- runtime/ros2bridge/bridge_process.go: BridgeProcess.Run, env contract, IPCMode=host / NetworkMode=host wiring.
- runtime/ros2bridge/bridge.go: BridgeOriginRouter, BridgeRoutableKinds, the spoof-resistance contract.
- demo/ros2-runtime/ros2_ws/src/governed_ros2_bridge/: C++ rclcpp source.
- Tutorial walkthrough: ROS 2 Governed Bridge Quickstart.
- Launch-level governance reference: ROS 2 Governance.