ROS 2 SROS 2 / DDS-Security Quickstart

This tutorial walks through the full SROS 2 provisioning and launch flow that layers DDS-Security on top of the governed bridge: the autonomy ros2 keystore init, mint, and permissions commands; the bridge under Strategy=Enforce; and the in-tree regression test that proves the load-bearing bypass-resistance claim. By the end you’ll have seen the secured bridge start cleanly on a dual-domain setup and a rogue, uncredentialed publisher get rejected at the DDS-Security discovery layer.

For production, use the operator-facing ROS 2 SROS 2 / DDS-Security runbook. This tutorial assumes you’ve already completed the Governed Bridge Quickstart: SROS 2 is defense-in-depth on the bridge, not a standalone substitute.

What this proves

  1. Identity-gated participation. Every DDS participant on the secured domains presents a per-identity certificate that chains to the keystore CA. rcl + DDS-Security reject participants without a valid cert at discovery time — before any message is exchanged.

  2. Signed permissions are load-bearing. Each enclave’s permissions.xml (signed via PKCS#7) declares which topics the identity may publish/subscribe on which domains. A participant without a matching grant is rejected at check_create_datawriter / check_create_datareader.

  3. Strategy=Enforce makes rejections terminal, not advisory. Permissive would log+allow; Enforce is the default the runner pins.

  4. Bypass-resistance (the load-bearing security claim of #938): even a participant on the right ROS_DOMAIN_ID cannot publish to a secured subscriber without keystore credentials. Pinned in CI by TestBypassResistance_RogueCannotPublishToSecuredSubscriber.

Prerequisites

  • ros-humble-ros-base installed on the operator’s host (provides ros2 security create_*). Keystore provisioning is a host operation.

  • openssl on PATH (for the multi-domain governance/permissions re-sign path). Most Linux distributions ship it by default; minimal container images may not, so confirm with openssl version.

  • autonomy binary on PATH.

  • Familiarity with the application-layer bridge: agent vs real domain, --governed-bridge, the policy bundle. The Governed Bridge Quickstart covers this; do that one first.

Step 1 — Provision the keystore

Three commands cover the whole keystore. Order matters: init creates the CA, mint creates per-identity enclaves under the CA, permissions generates the signed permissions.xml per enclave.

KEYSTORE=/tmp/sros2-quickstart-ks
mkdir -p "$KEYSTORE"

# 1. Keystore + governance.xml covering the bridge's TWO domains.
#    --domain rewrites the default sros2 governance.xml (which only
#    covers domain 0) to cover both 42 + 99 and re-signs the .p7s
#    via openssl. Without this, the bridge participant gets rejected
#    with "Could not find domain 42 in governance (code: 141)".
autonomy ros2 keystore init "$KEYSTORE" --domain 42 --domain 99

# 2. Mint enclaves. The bridge has ONE enclave that covers both of its
#    rclcpp::Contexts (via ROS_SECURITY_ENCLAVE_OVERRIDE). The workload
#    has a DELIBERATELY SEPARATE enclave so a compromised workload
#    can't impersonate the bridge to publish on real.
autonomy ros2 keystore mint --keystore "$KEYSTORE" /governed_ros2_bridge_real
autonomy ros2 keystore mint --keystore "$KEYSTORE" /demo_robot/arm_controller

# 3. Synthesize permissions XML per enclave.
#    Bridge: BOTH domains, BOTH directions (subs on agent, pubs on real).
autonomy ros2 keystore permissions /governed_ros2_bridge_real \
    --keystore "$KEYSTORE" \
    --domain 42 --domain 99 \
    --publish  /cmd_vel,/cmd_vel/* \
    --subscribe /cmd_vel,/cmd_vel/*
#    Workload: agent domain only, narrow topic surface.
autonomy ros2 keystore permissions /demo_robot/arm_controller \
    --keystore "$KEYSTORE" \
    --domain 99 \
    --publish  /cmd_vel \
    --subscribe /cmd_vel
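
One likely reason the bridge grant lists both /cmd_vel and /cmd_vel/* is that a trailing wildcard does not cover the bare topic name. The sketch below illustrates this with shell glob matching. It rests on two assumptions not stated in this tutorial: that DDS-Security compares topic names against grant expressions fnmatch-style, and that the rt/ prefix seen in rt/cmd_vel is applied to every ROS 2 topic before matching. topic_allowed is a hypothetical helper, not part of autonomy or sros2.

```shell
# Hypothetical helper (not part of autonomy or sros2): succeed when a ROS 2
# topic matches at least one grant expression.
# Assumption: fnmatch-style matching on the rt/-prefixed DDS topic name.
topic_allowed() {
    local dds_topic expr
    dds_topic="rt$1"          # /cmd_vel -> rt/cmd_vel
    shift
    for expr in "$@"; do
        # An unquoted expansion in a case pattern is matched as a glob.
        case $dds_topic in
            rt$expr) return 0 ;;
        esac
    done
    return 1
}
```

With the bridge grant, topic_allowed /cmd_vel/left /cmd_vel '/cmd_vel/*' succeeds and topic_allowed /odom /cmd_vel '/cmd_vel/*' fails. Dropping the bare /cmd_vel expression makes topic_allowed /cmd_vel '/cmd_vel/*' fail as well.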

Preferred when the bundle is the source of truth (#938 3-C.1): if the bundle’s manifest.json is at schema_version 1.4+ and carries a ros2_topics:{publish,subscribe} block (the demo ros2-bridge bundle does), point --from-bundle at it instead of re-typing the lists. The command reads the topic surface out of the manifest, so the permissions stay in sync with whatever the bundle declares:

# Same bridge permissions as above, resolved from the demo bundle.
autonomy ros2 keystore permissions /governed_ros2_bridge_real \
    --keystore "$KEYSTORE" \
    --domain 42 --domain 99 \
    --from-bundle demo/bundles/ros2-bridge.tar

--from-bundle is mutually exclusive with --publish/--subscribe. It accepts either a .tar file or a directory containing manifest.json. See the runbook (../runbooks/ros2-sros2-bridge.md) for the bundle-manifest schema details.

Expected output (all three commands print ok on success):

ros2 keystore init: rewrote governance.xml to cover domains [42 99] + re-signed governance.p7s
ros2 keystore init: ok — keystore root at /tmp/sros2-quickstart-ks
...
ros2 keystore mint: ok — enclave at /tmp/sros2-quickstart-ks/enclaves/governed_ros2_bridge_real
...
ros2 keystore permissions: ok — wrote
  /tmp/sros2-quickstart-ks/enclaves/governed_ros2_bridge_real/permissions.xml
  /tmp/sros2-quickstart-ks/enclaves/governed_ros2_bridge_real/permissions.p7s

Step 2 — Verify the keystore structure

$ ls "$KEYSTORE/enclaves"
demo_robot   governance.p7s   governance.xml   governed_ros2_bridge_real
$ grep '<id>' "$KEYSTORE/enclaves/governance.xml"
              <id>42</id>
              <id>99</id>
$ grep '<id>' "$KEYSTORE/enclaves/governed_ros2_bridge_real/permissions.xml" | sort -u
          <id>42</id>
          <id>99</id>
$ openssl smime -verify \
    -in "$KEYSTORE/enclaves/governance.p7s" \
    -CAfile "$KEYSTORE/public/permissions_ca.cert.pem" 2>&1 | tail -1
Verification successful

Both domain IDs appear in governance.xml and in the bridge enclave’s permissions.xml, and the governance signature verifies against the keystore’s permissions CA.
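
The two grep checks can be folded into one reusable assertion. The sketch below assumes each domain appears as a literal <id>N</id> element, as in the output above; a file that expresses domains as ranges would need a different check. has_domains is a hypothetical helper.

```shell
# Hypothetical helper: succeed only if every listed domain ID appears in the
# given governance or permissions XML file as <id>N</id>.
has_domains() {
    local file id
    file=$1
    shift
    for id in "$@"; do
        # The closing tag is part of the pattern, so 4 does not match 42.
        if ! grep -q "<id>$id</id>" "$file"; then
            echo "domain $id not found in $file" >&2
            return 1
        fi
    done
    return 0
}
```

For example: has_domains "$KEYSTORE/enclaves/governance.xml" 42 99.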

Step 3 — Launch the bridge with SROS 2 wired

autonomy run \
    --image ghcr.io/autonomyops/adk-ros2-runtime:latest \
    --governed-bridge \
    --agent-domain 99 --real-domain 42 \
    --bridge-topics '/cmd_vel:std_msgs/msg/String' \
    --bridge-keystore "$KEYSTORE" \
    --bridge-enclave  /governed_ros2_bridge_real \
    --workload-enclave /demo_robot/arm_controller \
    ros2.launch launch demo_robot arm_demo.launch.py

Expected — in the bridge container’s logs (watch with docker logs -f):

[INFO] [rcl]: Found security directory: /var/lib/.../enclaves/governed_ros2_bridge_real
[INFO] [rcl]: Found security directory: /var/lib/.../enclaves/governed_ros2_bridge_real
governed_ros2_bridge: ready  agent_domain=99  real_domain=42  topics=/cmd_vel:std_msgs/msg/String  runtime_url=...

Two “Found security directory” lines (one per rclcpp::Context: agent + real) are normal. The ready line means both contexts:

  1. Found the keystore via the bind-mount + ROS_SECURITY_KEYSTORE env

  2. Loaded permissions.xml + verified the signature

  3. Created their participant on the correct DDS domain (allowed by governance.xml)

  4. Created the rt/cmd_vel subscription (agent) + publisher (real) — both gated by permissions.xml

From here, the application-layer governance loop runs exactly as it does without SROS 2 — bridge intercepts each message, POSTs to /v1/tool, policy decides, republishes on allow. SROS 2 has only added one extra condition: every participant must be credentialed.
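
Outside the runner, the participant-side wiring is expressed through the standard ROS 2 security environment variables. The sketch below shows what a manually launched secured process would need. Whether autonomy run sets exactly this set inside the bridge container is an assumption, and sros2_env is a hypothetical helper.

```shell
# Hypothetical helper: export the standard ROS 2 security variables.
# $1: keystore root, $2: enclave path such as /governed_ros2_bridge_real
sros2_env() {
    export ROS_SECURITY_KEYSTORE="$1"
    export ROS_SECURITY_ENCLAVE_OVERRIDE="$2"
    export ROS_SECURITY_ENABLE=true
    export ROS_SECURITY_STRATEGY=Enforce   # Permissive would log and allow
}
```

For example: sros2_env "$KEYSTORE" /governed_ros2_bridge_real. A process started without these variables joins the domain as an unsecured participant, which is the rogue case exercised in Step 4.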

Step 4 — Verify the bypass-resistance claim (in-tree regression test)

The most rigorous validation is the in-tree Go test that runs the full keystore + bridge + rogue publisher end-to-end:

source /opt/ros/humble/setup.bash
cd <repo-root>
go test ./cmd/autonomy/commands/... \
    -run TestBypassResistance_RogueCannotPublishToSecuredSubscriber -v

Expected:

=== RUN   TestBypassResistance_RogueCannotPublishToSecuredSubscriber
[INFO] [rcl]: Found security directory: /tmp/.../enclaves/n
publishing #1: std_msgs.msg.String(data='GOLDEN-LEGIT-MUST-BE-DELIVERED')
publishing #2: std_msgs.msg.String(data='GOLDEN-LEGIT-MUST-BE-DELIVERED')
publishing #3: std_msgs.msg.String(data='GOLDEN-LEGIT-MUST-BE-DELIVERED')
Waiting for at least 1 matching subscription(s)...
Waiting for at least 1 matching subscription(s)...
Waiting for at least 1 matching subscription(s)...
--- PASS: TestBypassResistance_RogueCannotPublishToSecuredSubscriber (10.09s)

Reading the trace:

  • CREDENTIALED publisher (positive control) published 3× successfully; the secured subscriber received GOLDEN-LEGIT-MUST-BE-DELIVERED — proves the secured side is a functioning DDS-Security participant.

  • ROGUE publisher (no ROS_SECURITY_* env) saw “Waiting for at least 1 matching subscription(s)” the whole time — DDS-Security at the secured side rejected its discovery announcements; it never matched, never delivered. The secured subscriber’s log contains zero rogue payloads.

  • Three liveness gates (signal(0) probes at phase boundaries) verified that the secured subscriber stayed alive throughout. Without them, “no rogue payload” could simply mean “dead subscriber”, the false-pass mode that the #953 reviewer caught and #953-fix closed.
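
The liveness gates guard against a vacuous pass. The in-tree test is written in Go; the shell sketch below shows the same idea with kill -0, which delivers no signal and only reports whether the process exists and can be signalled. liveness_gate and the phase labels are hypothetical names, not taken from the test.

```shell
# Hypothetical helper: fail if the subscriber process is no longer running.
# $1: subscriber PID, $2: phase label used in the failure message
liveness_gate() {
    if ! kill -0 "$1" 2>/dev/null; then
        echo "subscriber $1 not running at phase: $2" >&2
        return 1
    fi
    return 0
}
```

A harness would call it at each phase boundary, for example liveness_gate "$SUB_PID" after-rogue, and treat a failure as a test error, not as evidence that the rogue was rejected.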

If this test passes on your host, your autonomy ros2 keystore pipeline produces a keystore that the runtime actually enforces against rogue traffic, not just one that passes its own structural assertions.

Troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| Could not find domain X in governance | governance.xml only covers other domains | Re-run autonomy ros2 keystore init --domain X (idempotent on the CA) |
| Not found a rule allowing to use the domain_id | permissions.xml doesn’t cover the participant’s domain | Re-run autonomy ros2 keystore permissions <enclave> --domain X |
| rt/<topic> topic not found in allow rule | permissions.xml doesn’t grant pub/sub on that topic | Re-run permissions with --publish <topic> and/or --subscribe <topic> (bridge needs BOTH directions) |
| participant denied by default rule (code: 145) | Enclave name mismatch between cert CN and grant | Verify --bridge-enclave matches the name passed to keystore mint exactly |
| ErrSecurityIncomplete | Half-configured triple | Pass all of --bridge-keystore + --bridge-enclave + --workload-enclave, or none |
| ErrSecurityNeedsGovernedBridge | SROS 2 flags without --governed-bridge | Add --governed-bridge; SROS 2 is defense-in-depth ON the bridge, not standalone |
| xmlrpc.client.Fault: !rclpy.ok() from ros2 topic echo | Separate Python rclpy SROS 2 issue not affecting the bridge | Use C++ subscriber or bridge log directly; bridge itself is unaffected |

See the full troubleshooting section in the runbook for the operator-under-fire version of these.
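
The symptoms above can also be matched mechanically against a captured log. This is a minimal sketch, assuming the symptom strings appear verbatim in the log; triage_log is a hypothetical helper and its hints paraphrase the fixes listed above.

```shell
# Hypothetical helper: print a hint for each known symptom found in a log.
# Returns 0 if at least one symptom matched, 1 otherwise.
triage_log() {
    local log pattern hint found
    log=$1
    found=1
    # Each line below is "symptom substring|hint".
    while IFS='|' read -r pattern hint; do
        if grep -qF -- "$pattern" "$log"; then
            printf '%s\n' "$hint"
            found=0
        fi
    done <<'EOF'
Could not find domain|governance.xml lacks the domain: re-run keystore init with --domain
Not found a rule allowing to use the domain_id|permissions.xml lacks the domain: re-run keystore permissions with --domain
topic not found in allow rule|permissions.xml lacks the topic: re-run keystore permissions with --publish and/or --subscribe
participant denied by default rule|enclave name mismatch: compare --bridge-enclave with the name passed to keystore mint
EOF
    return "$found"
}
```

For example, save the bridge container’s log to a file and run triage_log on that file. No output and a non-zero status mean none of the listed symptoms were found.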

What this tutorial does NOT cover

  • Identity rotation / revocation. SROS 2 doesn’t ship a built-in revocation flow; a compromised key requires reissuing certs across the fleet. Out of scope for this quickstart.

  • Bundle-driven permissions synthesis from Rego (auto-extracting the topic list from the policy bundle’s tool.ros2.* Rego rules). Filed as a #938 3-C.1 follow-up; until it lands, the topic surface comes either from the manifest’s ros2_topics block via --from-bundle or from --publish / --subscribe lists that operators keep in sync with their bundle by hand.

What’s next

  • SROS 2 runbook — operator-under-fire procedures for the production launch, recovery, and rollout flows.

  • Governed Bridge runbook — the application-layer policy mediation SROS 2 layers on top of.

  • The autonomy ros2 keystore commands’ own --help — every flag documented with the same constraints + examples as above.