task: Add dead-letter auto-requeue #103

Description

@pureliture

Outcome

Operators can safely return transient verifier dead-letter jobs to pending after a cooldown without introducing SQS, LocalStack, Kafka-style offsets, outbox storage, new tables, or new GSIs.

Scope

  • Split dead-letter classification into terminal reason and root error class.
  • Add bounded dead-letter auto-requeue CLI behavior with dry-run by default and explicit apply mode.
  • Restrict automatic recovery to transient verifier failure classes.
  • Persist an auto-requeue counter and enforce one automatic return per job.
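  • Guard recovery with NoSQL conditional writes (compare-and-set), so a job returns to pending only if its stored state still matches what the requeue run read.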
  • Add conservative personal systemd service/timer units, without enabling them by default.
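
A minimal sketch of how the eligibility rules in this scope could fit together. Every name here is hypothetical: the `DeadLetterJob` fields, the entries in `TRANSIENT_ROOT_ERRORS`, and the 30-minute default cooldown are illustrative assumptions, not the project's actual identifiers or values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed transient verifier failure classes; the issue does not enumerate them.
TRANSIENT_ROOT_ERRORS = frozenset({"verifier_timeout", "verifier_unavailable"})


@dataclass
class DeadLetterJob:
    job_id: str
    job_type: str
    terminal_reason: str             # why the job was dead-lettered
    root_error_class: Optional[str]  # underlying failure; None if unknown
    dead_lettered_at: datetime
    auto_requeue_count: int = 0      # persisted counter for the one-shot guard


def is_eligible(
    job: DeadLetterJob,
    now: datetime,
    cooldown: timedelta = timedelta(minutes=30),  # assumed default
    job_type: str = "verify",
) -> bool:
    """Return True only for transient verifier failures past the cooldown."""
    if job.job_type != job_type:
        return False
    # Unknown, malformed, or retry-budget-only failures carry no transient
    # root error class, so they fall out here.
    if job.root_error_class not in TRANSIENT_ROOT_ERRORS:
        return False
    # One automatic return per job.
    if job.auto_requeue_count >= 1:
        return False
    return now - job.dead_lettered_at >= cooldown
```

Keeping the terminal reason separate from the root error class means a job that merely exhausted its retry budget, with no transient root cause recorded, is never treated as recoverable.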

Acceptance Criteria

  • Dry-run reports candidates without mutating queue state.
  • Apply mode requires an explicit flag and only moves eligible jobs back to pending.
  • Default job type is verify.
  • Non-transient, malformed, unknown, scanner-runtime, and retry-budget-only failures are not automatically requeued.
  • The cooldown, limit, page cap, compare-and-set checks, and one-shot auto-requeue guard are all enforced.
  • Runtime, CLI, storage adapter, and systemd unit tests cover the behavior.
  • Operational proof covers a synthetic dead-letter verify job and a one-shot systemd service run without enabling the timer.
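
The dry-run and apply criteria above can be sketched against an in-memory stand-in for the storage adapter. This is an illustration under stated assumptions, not the real adapter or CLI: `InMemoryQueue`, `conditional_requeue`, `auto_requeue`, and the status strings are hypothetical names, and the page cap and cooldown filtering are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Job:
    job_id: str
    status: str                  # assumed values: "dead_letter" or "pending"
    auto_requeue_count: int = 0


class InMemoryQueue:
    """Hypothetical stand-in for a NoSQL table with conditional writes."""

    def __init__(self, jobs: Iterable[Job]) -> None:
        self.jobs = {job.job_id: job for job in jobs}

    def conditional_requeue(self, job_id: str, expected_count: int) -> bool:
        """Compare-and-set: mutate only if the stored state matches."""
        job = self.jobs.get(job_id)
        if job is None:
            return False
        if job.status != "dead_letter" or job.auto_requeue_count != expected_count:
            return False
        job.status = "pending"
        job.auto_requeue_count += 1
        return True


def auto_requeue(
    queue: InMemoryQueue,
    candidate_ids: List[str],
    apply: bool = False,   # dry-run unless explicitly enabled
    limit: int = 10,       # assumed default bound on jobs per run
) -> List[Tuple[str, str]]:
    """Report candidates; move them to pending only in apply mode."""
    report = []
    for job_id in candidate_ids[:limit]:
        if not apply:
            report.append((job_id, "would_requeue"))
            continue
        moved = queue.conditional_requeue(job_id, expected_count=0)
        report.append((job_id, "requeued" if moved else "skipped"))
    return report
```

Because the counter is checked and incremented in the same conditional write, a second apply run over the same job is skipped rather than requeued again, which is one way to realize the one-shot guard.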
