Deadletter Inspection and Retry Workflow

Audience: operators managing edge node relay delivery.

What is a deadletter entry?

A relay deadletter entry is an outbound segment delivery record that has exhausted its retry budget (MaxRetries) or whose segment is missing from local storage at delivery time (SEGMENT_MISSING outcome). The entry transitions to StateDeadletter (terminal) and no further automatic retry attempts are made.

The relay ledger is bound by INV-12: terminal states (Acked, Deadletter) are never exited by the normal delivery path. Manual operator intervention via retry is required to re-queue a deadletter entry. Manual purge permanently removes it.


1. Check relay and deadletter status

Overall relay health

edgectl relay status [--socket /run/edged/rpc.sock]

Example output:

Relay Status
  Enabled:           true
  Workers:           4
  Success Condition: ack

Queue Depth
  Scheduled:         12
  Inflight:          2
  Acked:             847
  Failed:            3
  Deadletter:        5
  Total:             869

Bandwidth
  Rate Limit:        unlimited
  Daily Quota:       unlimited

If Deadletter count is non-zero, proceed to list and inspect.
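
For scripted health checks, the count can be read straight from the status output. The following is a minimal sketch, assuming the "Deadletter:" line keeps the layout shown in the example output above; deadletter_count is a hypothetical helper name, not an edgectl feature.

```shell
# Read `edgectl relay status` output on stdin and print the Deadletter count.
# Assumption: the count appears on a line of the form "  Deadletter:   <n>".
# Exits non-zero if no such line is found.
deadletter_count() {
  awk '$1 == "Deadletter:" { print $2; found = 1 }
       END { if (!found) exit 1 }'
}
```

Usage: edgectl relay status --socket /run/edged/rpc.sock | deadletter_count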

List deadletter entries

edgectl relay deadletter list [--limit 50] [--socket /run/edged/rpc.sock]

Example output:

SEGMENT-ID              PEER-ID         ATTEMPTS  FIRST-QUEUED              LAST-UPDATED
seg-0a1b2c3d4e5f6789    peer-alpha       8         2026-03-18T06:00:00Z      2026-03-18T09:30:00Z
seg-1a2b3c4d5e6f7890    peer-beta        5         2026-03-18T07:15:00Z      2026-03-18T10:00:00Z
...

5 deadletter entries (showing 5 of 5)

2. Inspect a specific deadletter entry

edgectl relay deadletter inspect seg-0a1b2c3d4e5f6789 peer-alpha \
  [--socket /run/edged/rpc.sock]

Example output:

Deadletter Entry
  Segment ID:      seg-0a1b2c3d4e5f6789
  Peer ID:         peer-alpha
  State:           deadletter
  Attempt Count:   8
  First Queued:    2026-03-18T06:00:00Z
  Last Updated:    2026-03-18T09:30:00Z

Attempt History
  #1  started: 2026-03-18T06:00:05Z  completed: 2026-03-18T06:00:08Z  outcome: FAILED        error: connection refused
  #2  started: 2026-03-18T06:01:20Z  completed: 2026-03-18T06:01:25Z  outcome: FAILED        error: connection refused
  #3  started: (none — ABANDONED)    completed: 2026-03-18T07:15:00Z  outcome: ABANDONED
  #4  started: 2026-03-18T07:15:30Z  completed: 2026-03-18T07:15:31Z  outcome: FAILED        error: dial timeout
  #5  started: 2026-03-18T07:45:00Z  completed: 2026-03-18T07:45:01Z  outcome: SEGMENT_MISSING  error: segment not in local store
  ...
  #8  started: 2026-03-18T09:30:00Z  completed: 2026-03-18T09:30:01Z  outcome: FAILED        error: peer unreachable

Key outcomes to look for:

Outcome          Cause                                 Action
FAILED           Peer unreachable, connection refused  Verify peer connectivity; if peer is restored, retry
SEGMENT_MISSING  Segment evicted from local storage    Do not retry; purge instead, and re-deploy if needed
ABANDONED        edged process crashed mid-attempt     Normal on restart; retry is safe


3. Decision tree

What is the most recent failure outcome?
│
├─ SEGMENT_MISSING → The segment is gone from local storage.
│   Retrying will immediately fail again with SEGMENT_MISSING.
│   → Purge this entry (§5). Re-deploy the artifact if needed.
│
├─ FAILED / ABANDONED → Network or peer connectivity issue (FAILED), or an attempt interrupted by an edged crash (ABANDONED).
│   Is the peer now reachable?
│   ├─ YES → Retry (§4). The re-queued entry will attempt delivery normally.
│   └─ NO  → Do not retry yet. Fix the peer or network issue first.
│             If the entry is stale and the data is no longer needed → Purge (§5).
│
└─ Error text such as "connection refused" or "dial timeout" → These are FAILED outcomes; follow the FAILED branch above.
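
The first step of the tree can be sketched in shell for triage scripts. This assumes the Attempt History layout shown in the inspect example (rows beginning with "#N" and containing "outcome: <VALUE>"). The helper names last_outcome and suggest_action, and the action labels they print, are hypothetical. Peer reachability still has to be checked separately.

```shell
# Print the outcome of the last attempt found in
# `edgectl relay deadletter inspect` output read from stdin.
# Assumption: history rows start with "#<n>" and carry "outcome: <VALUE>".
last_outcome() {
  awk '$1 ~ /^#[0-9]+$/ {
         for (i = 1; i < NF; i++) if ($i == "outcome:") last = $(i + 1)
       }
       END { if (last == "") exit 1; print last }'
}

# Map an outcome to the action the decision tree recommends.
suggest_action() {
  case "$1" in
    SEGMENT_MISSING)  echo "purge" ;;
    FAILED|ABANDONED) echo "retry-if-peer-reachable" ;;
    *)                echo "inspect-manually" ;;
  esac
}
```

Usage: suggest_action "$(edgectl relay deadletter inspect seg-0a1b2c3d4e5f6789 peer-alpha | last_outcome)"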

4. Retry a deadletter entry

retry transitions the entry from StateDeadletter to StateScheduled. AttemptCount is preserved (the retry budget is not reset), and delivery is attempted again on the normal backoff schedule.

edgectl relay deadletter retry seg-0a1b2c3d4e5f6789 peer-alpha \
  [--socket /run/edged/rpc.sock]

Expected output:

Retried: seg-0a1b2c3d4e5f6789 / peer-alpha → scheduled for re-delivery

Monitor the result:

# Watch the deadletter count drop (or the entry move to Acked)
watch -n 5 'edgectl relay status --socket /run/edged/rpc.sock 2>&1 | grep -E "Deadletter|Acked"'

5. Purge deadletter entries

purge permanently removes the outbound ledger record, its attempt history, and its segment index entry. The original segment data may still exist in local storage; purge only removes the delivery tracking record.

--force is required to execute a purge. Without it, purge runs as a dry-run and shows a count of entries that would be removed.

Purge a single entry

edgectl relay deadletter purge \
  --segment-id seg-0a1b2c3d4e5f6789 \
  --peer-id peer-alpha \
  --force \
  [--socket /run/edged/rpc.sock]

Purge all entries for a segment (all peers)

edgectl relay deadletter purge \
  --segment-id seg-0a1b2c3d4e5f6789 \
  --force \
  [--socket /run/edged/rpc.sock]

Purge entries older than N seconds

# Dry-run first to see what would be removed
edgectl relay deadletter purge --older-than 86400 [--socket /run/edged/rpc.sock]

# Execute
edgectl relay deadletter purge --older-than 86400 --force [--socket /run/edged/rpc.sock]

Expected output:

Purged: 3 deadletter entries removed
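
The dry-run-then-force sequence for age-based purges can be wrapped so the dry-run is never skipped. This is a sketch only: days_to_seconds and purge_older_than_days are hypothetical helper names, and the confirmation prompt is an addition of this sketch, not edgectl behavior. It relies only on the --older-than and --force flags shown above.

```shell
# Convert whole days to the seconds value used with --older-than.
days_to_seconds() {
  case "$1" in
    ''|*[!0-9]*) echo "days must be a non-negative integer" >&2; return 1 ;;
  esac
  echo $(( $1 * 86400 ))
}

# Run the dry-run, then re-run with --force only after the operator types "y".
purge_older_than_days() {
  secs=$(days_to_seconds "$1") || return 1
  edgectl relay deadletter purge --older-than "$secs" || return 1
  printf 'Run again with --force? [y/N] '
  read -r answer
  [ "$answer" = "y" ] || return 0
  edgectl relay deadletter purge --older-than "$secs" --force
}
```

Usage: purge_older_than_days 1 performs the same two steps as the 86400-second example above, with a confirmation in between.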

6. Post-action verification

edgectl relay status [--socket /run/edged/rpc.sock]
# Deadletter count should reflect the retried/purged entries

edgectl relay deadletter list [--socket /run/edged/rpc.sock]
# Remaining entries should only be those not yet actioned

Known gaps

  • Retry does not reset AttemptCount: After retry, the entry resumes from its current AttemptCount. If the entry had already consumed most of its retry budget before it was deadlettered, it may reach MaxRetries again after only a few attempts. A --reset-retries flag is a follow-on item.

  • No bulk retry by filter: Retrying all deadletter entries for a specific peer requires individual retry calls per segment. Bulk retry by peer filter is a follow-on item.

  • No automatic retention: Deadletter entries are not automatically purged after a configurable age. A retention policy with auto-purge is a follow-on item.

  • Audit trail: Deadletter retry and purge events are logged via slog only. Full audit event persistence to the audit store is a follow-on item.
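
Until bulk retry by peer filter exists, the individual retry calls can be scripted around the list and retry commands shown earlier. This is a rough sketch under stated assumptions: data rows in the list output have exactly five whitespace-separated columns, as in the example, and the list is not truncated by --limit. The helper names entries_for_peer and retry_peer are hypothetical.

```shell
# Read `edgectl relay deadletter list` output on stdin and print
# "SEGMENT-ID PEER-ID" for rows whose peer matches $1.
# Assumption: data rows have exactly 5 columns; header, "..." and
# the summary footer are skipped.
entries_for_peer() {
  awk -v peer="$1" '$1 != "SEGMENT-ID" && NF == 5 && $2 == peer { print $1, $2 }'
}

# Issue one retry call per deadletter entry for the given peer.
retry_peer() {
  edgectl relay deadletter list | entries_for_peer "$1" |
  while read -r seg peer; do
    # </dev/null keeps the retry call from consuming the loop's input.
    edgectl relay deadletter retry "$seg" "$peer" </dev/null ||
      echo "retry failed: $seg $peer" >&2
  done
}
```

Usage: retry_peer peer-alpha. Inspect entries first; per the decision tree, entries whose last outcome is SEGMENT_MISSING should be purged, not retried.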