Outcome
Operators can safely return transient verifier dead-letter jobs to pending after a cooldown without introducing SQS, LocalStack, Kafka-style offsets, outbox storage, new tables, or new GSIs.
Scope
- Split dead-letter classification into terminal reason and root error class.
- Add bounded dead-letter auto-requeue CLI behavior with dry-run by default and explicit apply mode.
- Restrict automatic recovery to transient verifier failure classes.
- Persist an auto-requeue counter and enforce one automatic return per job.
- Add guarded NoSQL conditional recovery behavior.
- Add conservative personal systemd service/timer units, without enabling them by default.
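The eligibility rules in this scope can be sketched as a single predicate. This is a minimal illustration, not the actual implementation: the `DeadLetterJob` fields, the class names in `TRANSIENT_ROOT_ERROR_CLASSES`, and the function name are all hypothetical, since the real transient verifier classes are not listed here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical class names: the real transient verifier classes are not given here.
TRANSIENT_ROOT_ERROR_CLASSES = frozenset({"verifier_timeout", "verifier_unavailable"})
MAX_AUTO_REQUEUES = 1  # one automatic return per job


@dataclass(frozen=True)
class DeadLetterJob:
    job_id: str
    job_type: str
    terminal_reason: str    # why the job stopped, e.g. "retry_budget_exhausted"
    root_error_class: str   # what actually failed, e.g. "verifier_timeout"
    dead_lettered_at: datetime
    auto_requeue_count: int = 0


def is_auto_requeue_eligible(
    job: DeadLetterJob,
    now: datetime,
    cooldown: timedelta,
    job_type: str = "verify",
) -> bool:
    """Return True only when every guard passes."""
    if job.job_type != job_type:
        return False
    # The root error class decides eligibility; the terminal reason alone never does.
    if job.root_error_class not in TRANSIENT_ROOT_ERROR_CLASSES:
        return False
    if job.auto_requeue_count >= MAX_AUTO_REQUEUES:
        return False
    return now - job.dead_lettered_at >= cooldown
```

Keeping the terminal reason and the root error class as separate fields is what lets a job that merely exhausted its retry budget be rejected unless the underlying failure was a transient verifier error.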
Acceptance Criteria
- Dry-run reports candidates without mutating queue state.
- Apply mode requires an explicit flag and only moves eligible jobs back to pending.
- Default job type is verify.
- Non-transient, malformed, unknown, scanner-runtime, and retry-budget-only failures are not automatically requeued.
- Cooldown, limit, page cap, compare-and-set checks, and one-shot auto-requeue guard are enforced.
- Runtime, CLI, storage adapter, and systemd unit tests cover the behavior.
- Operational proof covers a synthetic dead-letter verify job and a one-shot systemd service run without enabling the timer.
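The dry-run, limit, compare-and-set, and one-shot criteria above can be illustrated with an in-memory stand-in for the storage adapter. Everything named here is an assumption for the sketch: `InMemoryQueue`, `conditional_requeue`, `auto_requeue`, the `dead_letter` status string, and the item fields. A real NoSQL adapter would express the same check as a conditional write rather than a Python comparison.

```python
from typing import Dict, List


class InMemoryQueue:
    """Hypothetical stand-in for the storage adapter."""

    def __init__(self, items: List[Dict]):
        self.items = {item["job_id"]: dict(item) for item in items}

    def conditional_requeue(self, job_id: str, expected_status: str, expected_count: int) -> bool:
        """Compare-and-set: mutate only if status and counter still match."""
        item = self.items[job_id]
        if item["status"] != expected_status or item["auto_requeue_count"] != expected_count:
            return False  # someone else changed the job; leave it alone
        item["status"] = "pending"
        item["auto_requeue_count"] = expected_count + 1
        return True


def auto_requeue(queue: InMemoryQueue, candidate_ids: List[str], limit: int, apply: bool = False) -> Dict:
    """Report candidates; mutate only when apply is explicitly True."""
    selected = candidate_ids[:limit]
    if not apply:
        return {"mode": "dry-run", "candidates": selected, "requeued": []}
    # Expecting a counter of 0 is what enforces the one-shot guard.
    requeued = [
        job_id
        for job_id in selected
        if queue.conditional_requeue(job_id, "dead_letter", 0)
    ]
    return {"mode": "apply", "candidates": selected, "requeued": requeued}
```

In this sketch a second apply run over the same candidates requeues nothing, because the status and counter no longer match the expected values.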