Polite-msg: load-bearing report-backs can still silently dead-letter (box_not_empty), + no way to clear the graveyard #544

@dotdevdotdev

Context — what surfaced

agentwire doctor now reports dead-lettered report-backs sitting in session inboxes (~/.agentwire/inbox/<session>/dead/). At time of writing there are 48 corpses across 6 recipients (agentwire 27, fragmentz 15, orchestrator 2, agentwire-dev 2, playchek 1, documentscribe 1): 47 are kind: done report-backs and 1 is a note, spanning 06-20 → 06-28.

Root-cause finding (investigated)

The acute bug is already fixed by #524, and these corpses are a pre-fix graveyard that #524's own new doctor-surfacing exposed for the first time:

So: not a live regression. But the investigation exposes three real residuals.

Residual A — box_not_empty still silently dead-letters load-bearing kinds

inbox._bump_attempts: when the box is parseable but non-empty (box_not_empty), attempts increment and the message dead-letters after MAX_ATTEMPTS = 40 (~40 min at the ~60s drain tick). For a note that's fine. For a done/escalation report-back it's a silent loss — surfaced only by doctor / msg dead (pull), never pushed. 47/48 corpses were done, so the reliability impact was real pre-fix and the path still exists.

box_not_empty fires when the recipient's input box holds text for 40 consecutive minutes — a genuine human draft (correct to defer, don't clobber) or a Claude Code queued-input placeholder (see Residual B).
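
To make the aging behavior concrete, here is a minimal sketch of the counter described above. It is not the actual inbox._bump_attempts code: the Message dataclass, the bump_attempts signature, the `>=` comparison, and the DRAIN_TICK_SECONDS name are assumptions for illustration. Only MAX_ATTEMPTS = 40, the ~60s tick, and the box_not_empty / target_busy reasons come from this issue.

```python
from dataclasses import dataclass

MAX_ATTEMPTS = 40         # value quoted in this issue
DRAIN_TICK_SECONDS = 60   # approximate drain interval quoted in this issue


@dataclass
class Message:
    """Hypothetical stand-in for a queued inbox message."""
    kind: str             # e.g. "done", "note", "escalation"
    attempts: int = 0
    dead: bool = False


def bump_attempts(msg: Message, reason: str) -> Message:
    """Illustrative model of the deferral accounting, not the real function."""
    if reason == "target_busy":
        return msg        # stays pending, no attempt is counted
    if reason == "box_not_empty":
        msg.attempts += 1
        if msg.attempts >= MAX_ATTEMPTS:
            # The kind is never consulted, so a "done" dies like a "note".
            msg.dead = True
    return msg


def minutes_until_dead_letter() -> float:
    """Time a box must stay occupied before a message dead-letters."""
    return MAX_ATTEMPTS * DRAIN_TICK_SECONDS / 60
```

In this model a done report-back that hits 40 consecutive box_not_empty ticks is marked dead with no kind check anywhere on the path, which is the silent-loss case this residual describes.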

Options:

  1. On dead-letter of a load-bearing kind (done/request/escalation), escalate to the human out-of-band (email/quo/desktop toast) instead of silently dropping to dead/. The drop becomes non-silent at the human level.
  2. Deliver at the turn boundary: the idle/Notification hook already fires when a session goes idle (empty box) — drain the inbox on that event so report-backs land at the next safe moment rather than racing a 60s poll against a busy box.
  3. Keep box_not_empty pending-forever like target_busy only when it's not a genuine human draft (see B), so a load-bearing message is never aged out by transient box occupancy.

(The #524 review already called escalation-into-the-pane an "accepted tradeoff" — this is the inverse: escalate out of the pane to the human, no clobber.)
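
Option 1 could look roughly like the sketch below. The on_dead_letter function and the notify_human callback are hypothetical names, and the transport behind the callback (email, desktop toast, etc.) is left open. The set of load-bearing kinds is the one listed in option 1.

```python
from typing import Callable

# Kinds named as load-bearing in option 1 of this issue.
LOAD_BEARING_KINDS = {"done", "request", "escalation"}


def on_dead_letter(kind: str, session: str,
                   notify_human: Callable[[str], None]) -> str:
    """Hypothetical hook run when a message is about to dead-letter.

    Returns "escalated" if the human was notified out-of-band,
    otherwise "dropped" (today's behavior for every kind).
    """
    if kind in LOAD_BEARING_KINDS:
        notify_human(
            f"{kind} report-back for session '{session}' was dead-lettered"
        )
        return "escalated"
    return "dropped"
```

Passing the notifier in as a callback keeps the policy (which kinds escalate) separate from the channel, so the same check works whichever out-of-band path is chosen.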

Residual B — queued-input placeholder may be misread as a human draft

prompt_router.prompt_is_empty is conservative: any non-empty box content → defer. Its docstring explicitly calls out the Claude Code "Press up to edit queued messages" placeholder as a non-empty case. That placeholder is not a human draft — clobber risk is nil — yet it's classified box_not_empty and burns attempts toward dead-letter. If a recipient sits with queued input for 40 min, a report-back dies for no good reason. Consider detecting that placeholder and treating it like target_busy (keep pending, don't penalize) rather than like a human draft.
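
A rough sketch of the three-way classification suggested here, assuming the placeholder can be recognized by the substring quoted above. BoxState, classify_box, and counts_toward_dead_letter are hypothetical names, not part of prompt_router, and the exact placeholder text may differ between Claude Code versions.

```python
from enum import Enum

# Substring quoted in this issue; treat the exact wording as an assumption.
QUEUED_PLACEHOLDER = "Press up to edit queued messages"


class BoxState(Enum):
    EMPTY = "empty"
    QUEUED_PLACEHOLDER = "queued_placeholder"
    HUMAN_DRAFT = "human_draft"


def classify_box(text: str) -> BoxState:
    """Classify input-box content (hypothetical helper)."""
    stripped = text.strip()
    if not stripped:
        return BoxState.EMPTY
    if QUEUED_PLACEHOLDER in stripped:
        return BoxState.QUEUED_PLACEHOLDER
    # Anything else is conservatively treated as a human draft.
    return BoxState.HUMAN_DRAFT


def counts_toward_dead_letter(state: BoxState) -> bool:
    """Only a genuine human draft should burn attempts."""
    return state is BoxState.HUMAN_DRAFT
```

The substring match is deliberately simple. A human draft that happens to contain the placeholder text would be misclassified, so a real implementation would likely need a stricter match on the whole box content.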

Residual C — no way to clear the graveyard

doctor surfaces dead-letters but there's no command to purge them, so stale pre-fix corpses accumulate visibly forever. Add agentwire msg dead --purge [-s <session>] [--older-than <dur>] (MCP msg_dead purge flag), and/or auto-age-out corpses older than N days. (The current 48 are all pre-#524 and obsolete — a one-time clear is needed regardless.)
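
The core of the proposed purge could be sketched as below. This assumes dead-letters are plain files directly under <inbox_root>/<session>/dead/ and that file mtime is an acceptable proxy for age. Neither is confirmed here, and purge_dead and its parameters are hypothetical names.

```python
import time
from pathlib import Path
from typing import Optional


def purge_dead(inbox_root: Path, session: Optional[str] = None,
               older_than_seconds: float = 0.0,
               now: Optional[float] = None) -> list[Path]:
    """Delete dead-lettered files and return the paths removed.

    session=None purges every session; older_than_seconds=0 purges all ages.
    """
    now = time.time() if now is None else now
    if session is not None:
        dead_dirs = [inbox_root / session / "dead"]
    else:
        dead_dirs = sorted(inbox_root.glob("*/dead"))

    removed: list[Path] = []
    for dead_dir in dead_dirs:
        if not dead_dir.is_dir():
            continue
        for path in sorted(dead_dir.iterdir()):
            # Age is judged by mtime (assumption); subdirectories are skipped.
            if path.is_file() and now - path.stat().st_mtime >= older_than_seconds:
                path.unlink()
                removed.append(path)
    return removed
```

For example, purge_dead(Path.home() / ".agentwire" / "inbox", session="agentwire", older_than_seconds=7 * 86400) would correspond to something like --purge -s agentwire --older-than 7d, and the same function could back an auto-age-out pass.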

Pointers

Built by dotdev.dev
