Bandwidth Troubleshooting
Audience: operators managing edge node relay delivery bandwidth.
Background
Each edge node’s relay subsystem enforces two optional bandwidth controls:
- Rate limit (bytes_per_second): A token-bucket rate limiter that refills at the configured rate. Segments are throttled if the bucket is empty.
- Daily quota (daily_quota_bytes): A rolling 24-hour byte budget. Segments are dropped from the delivery queue if the daily quota is exhausted.
Both controls default to unlimited (0 = unlimited). Throttled relays transition to the
Failed state and are retried with backoff; throttling alone does not permanently
dead-letter them.
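The rate limiter's refill-and-spend behaviour can be approximated with integer arithmetic. This is a minimal sketch, not the edged implementation: it assumes the bucket capacity equals bytes_per_second (one second of burst) and that a segment is throttled when the bucket holds fewer tokens than the segment's size. The function name bucket_step is hypothetical.

```shell
#!/bin/sh
# Illustrative token-bucket step (assumption: capacity == rate, i.e. 1 s of burst).
# Usage: bucket_step TOKENS RATE ELAPSED_SECONDS SEGMENT_BYTES
# Prints "<decision> <tokens_after>" where decision is "send" or "throttle".
bucket_step() {
    tokens=$1
    rate=$2
    elapsed=$3
    segment=$4

    # Refill at the configured rate, capped at the bucket capacity.
    tokens=$(( tokens + rate * elapsed ))
    if [ "$tokens" -gt "$rate" ]; then
        tokens=$rate
    fi

    # Spend tokens if the segment fits; otherwise throttle and keep the tokens.
    if [ "$tokens" -ge "$segment" ]; then
        echo "send $(( tokens - segment ))"
    else
        echo "throttle $tokens"
    fi
}

bucket_step 524288 1048576 0 262144   # prints: send 262144
bucket_step 100000 1048576 0 262144   # prints: throttle 100000
bucket_step 0 1048576 1 262144        # prints: send 786432
```

In this model a throttled segment consumes no tokens, so it can succeed on a later retry once the bucket has refilled.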
1. Check current bandwidth status
edgectl relay config get [--socket /run/edged/rpc.sock]
Example output (unlimited):
Relay Configuration
Enabled: true
Workers: 4
Success Condition: ack
Dial Timeout: 10s
Ack Timeout: 30s
Bandwidth
Rate Limit: unlimited
Daily Quota: unlimited
Available Tokens: (n/a — unlimited)
Daily Used: (n/a — unlimited)
Example output (rate-limited, quota set):
Relay Configuration
Enabled: true
Workers: 4
Success Condition: ack
Bandwidth
Rate Limit: 1,048,576 bytes/sec (1 MiB/s)
Daily Quota: 10,737,418,240 bytes (10 GiB)
Available Tokens: 524,288 (50% of burst)
Daily Used: 3,221,225,472 (3 GiB / 10 GiB, 30% consumed)
Throttle Count: 14
Quota Drops: 0
Key fields
| Field | Meaning |
|---|---|
| Available Tokens | Current token-bucket depth. If near 0, relays are being throttled. |
| Daily Used | Bytes delivered in the current 24-hour quota window. Compare to Daily Quota. |
| Throttle Count | Total relay attempts blocked by the rate limit since last restart. |
| Quota Drops | Total relay attempts dropped because the daily quota was exhausted. |
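Comparing Daily Used to Daily Quota can be scripted. The sketch below assumes the live output uses the same "Daily Quota:" and "Daily Used:" line format as the example output above; quota_percent is a hypothetical helper name, and the parsing may need adjusting if the real format differs.

```shell
#!/bin/sh
# Reads `edgectl relay config get` output on stdin and prints the whole-number
# percentage of the daily quota consumed, or "unlimited" if no quota is set.
# Assumes the line format shown in the example output (value in the 3rd field).
quota_percent() {
    awk '
        /Daily Quota:/ { q = $3; gsub(/,/, "", q) }
        /Daily Used:/  { u = $3; gsub(/,/, "", u) }
        END {
            if (q !~ /^[0-9]+$/ || q + 0 == 0) { print "unlimited"; exit }
            printf "%d\n", (u * 100) / q
        }
    '
}

# Live usage (needs a running edged):
#   edgectl relay config get | quota_percent
quota_percent <<'EOF'
  Daily Quota: 10,737,418,240 bytes (10 GiB)
  Daily Used: 3,221,225,472 (3 GiB / 10 GiB, 30% consumed)
EOF
# prints: 30
```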
2. Diagnose throttle events
Symptom: relay queue growing, inflight not clearing
watch -n 5 'edgectl relay status --socket /run/edged/rpc.sock 2>&1 | grep -E "Scheduled|Inflight|Failed|Throttle"'
If the Failed count is growing and Throttle Count in the output of edgectl relay config get
is incrementing, the rate limit is the cause. Throttled relays move to Failed and re-enter
Scheduled after their backoff expires, so the queue keeps growing for as long as the rate
limit is too low for the delivery volume.
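To confirm that Throttle Count is actually incrementing, two samples can be compared. This is a sketch under the assumption that the live output contains a "Throttle Count: N" line as in the example output; throttle_count and throttle_per_minute are hypothetical helper names.

```shell
#!/bin/sh
# Extracts the Throttle Count value from `edgectl relay config get` output on
# stdin. Prints 0 if the line is absent (for example, when no limit is set).
throttle_count() {
    awk '/Throttle Count:/ { gsub(/,/, "", $3); print $3; found = 1 }
         END { if (!found) print 0 }'
}

# Prints throttle events per minute between two samples.
# Usage: throttle_per_minute BEFORE AFTER INTERVAL_SECONDS
throttle_per_minute() {
    echo $(( ($2 - $1) * 60 / $3 ))
}

# Live usage (needs a running edged):
#   before=$(edgectl relay config get | throttle_count)
#   sleep 30
#   after=$(edgectl relay config get | throttle_count)
#   throttle_per_minute "$before" "$after" 30
throttle_per_minute 14 29 30   # prints: 30
```

Because the counter is cumulative since the last restart, a negative result indicates that edged restarted between the two samples.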
Symptom: deliveries stop completely during a window
If Quota Drops is non-zero, the 24-hour rolling quota has been exhausted. No new
delivery attempts will succeed until the quota window resets (24h after the first byte
was counted in the current window).
Check the audit log for quota events:
# Look for relay.bandwidth.quota_exceeded in the slog output
autonomy audit query --audit-dir "$AUTONOMY_AUDIT_DIR" --category relay --limit 20
3. Adjust bandwidth limits
Remove rate limit (set to unlimited)
edgectl relay config set-bandwidth \
--bytes-per-second 0 \
[--socket /run/edged/rpc.sock]
Set a rate limit (e.g. 2 MiB/s)
edgectl relay config set-bandwidth \
--bytes-per-second 2097152 \
[--socket /run/edged/rpc.sock]
Set a daily quota (e.g. 20 GiB)
edgectl relay config set-bandwidth \
--daily-quota 21474836480 \
[--socket /run/edged/rpc.sock]
Set both
edgectl relay config set-bandwidth \
--bytes-per-second 1048576 \
--daily-quota 10737418240 \
[--socket /run/edged/rpc.sock]
Expected output:
Bandwidth updated
Rate Limit: 1,048,576 bytes/sec
Daily Quota: 10,737,418,240 bytes
Applied: immediately
Configuration changes take effect immediately on the running edged process without
restart. The daily quota accumulator is preserved across config updates (a config update
does not reset the daily counter).
Validation: Both --bytes-per-second and --daily-quota must be ≥ 0.
Negative values are rejected with an error.
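Both flags take raw byte counts, which are easy to mistype. The helper below is a hypothetical convenience, not part of edgectl: it converts binary-unit sizes such as 2MiB to bytes and rejects negative or malformed values before they reach the CLI.

```shell
#!/bin/sh
# Converts a size such as "2MiB" or "20GiB" to bytes (binary units only).
# A bare non-negative integer is passed through. Returns non-zero on bad input.
to_bytes() {
    case $1 in
        ""|*[!0-9A-Za-z]*) echo "invalid size: $1" >&2; return 1 ;;
    esac
    num=${1%%[A-Za-z]*}      # leading digits
    unit=${1#"$num"}         # trailing unit, possibly empty
    case $num in
        ""|*[!0-9]*|0[0-9]*) echo "invalid size: $1" >&2; return 1 ;;
    esac
    case $unit in
        "")  echo "$num" ;;
        KiB) echo $(( $num * 1024 )) ;;
        MiB) echo $(( $num * 1024 * 1024 )) ;;
        GiB) echo $(( $num * 1024 * 1024 * 1024 )) ;;
        *)   echo "unknown unit: $unit" >&2; return 1 ;;
    esac
}

# Live usage (needs a running edged):
#   edgectl relay config set-bandwidth --bytes-per-second "$(to_bytes 2MiB)"
to_bytes 2MiB    # prints: 2097152
to_bytes 20GiB   # prints: 21474836480
```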
4. Verify the change
edgectl relay config get [--socket /run/edged/rpc.sock]
Confirm the new limits are reflected. Then watch relay status to verify delivery resumes:
watch -n 5 'edgectl relay status --socket /run/edged/rpc.sock 2>&1 | grep -E "Scheduled|Inflight|Acked|Throttle"'
5. Reference: bandwidth sizing
| Scenario | Recommended setting |
|---|---|
| Low-bandwidth link (LTE, satellite) | 100–500 KiB/s rate limit; 1–5 GiB daily quota |
| Standard broadband edge | 1–5 MiB/s rate limit; 10–50 GiB daily quota |
| High-throughput data center | Unlimited or very high limit |
| Contested-connectivity (shared link) | Rate limit + daily quota; monitor |
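When choosing a rate limit and a daily quota together, it helps to check how quickly sustained full-rate delivery would consume the quota. The following sketch is plain arithmetic on the two configured values; minutes_to_exhaust is a hypothetical helper name.

```shell
#!/bin/sh
# Prints how many whole minutes of sustained full-rate delivery it takes to
# consume the daily quota. Both arguments are in bytes; 0 means unlimited.
# Usage: minutes_to_exhaust BYTES_PER_SECOND DAILY_QUOTA_BYTES
minutes_to_exhaust() {
    if [ "$2" -eq 0 ]; then
        echo "never"   # no quota configured
        return 0
    fi
    if [ "$1" -eq 0 ]; then
        echo "n/a"     # unlimited rate: no lower bound on the time
        return 0
    fi
    echo $(( $2 / $1 / 60 ))
}

minutes_to_exhaust 1048576 10737418240   # prints: 170
minutes_to_exhaust 2097152 21474836480   # prints: 170
minutes_to_exhaust 1048576 0             # prints: never
```

For example, a 1 MiB/s limit combined with a 10 GiB quota can be exhausted in under three hours of continuous delivery, so the quota rather than the rate limit becomes the binding constraint on a busy node.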
Known gaps
- Rate limit change resets token bucket to full: When UpdateConfig is called, the token bucket is reset to the new rate limit value. This is intentional to prevent starvation after a limit increase but means a sudden increase allows a burst equal to the new rate limit.
- Daily quota window is rolling 24h from first byte: There is no midnight-reset option. The window starts when the first byte is delivered and rolls forward from there.
- No per-peer bandwidth controls: The current implementation applies rate and quota limits globally across all peers on the node. Per-peer quotas are a follow-on item.
- Bandwidth metrics not in Prometheus: Throttle Count and Quota Drops are available via edgectl relay config get only. Prometheus metric export is a follow-on item.
- No config file persistence: Bandwidth limits set via edgectl relay config set-bandwidth take effect immediately but are not persisted to the config file. After edged restart, the limits revert to the values in the config file. Update the config file (relay.bandwidth_bytes_per_second, relay.bandwidth_daily_quota_bytes) to persist the change across restarts.