[DEBT] Move review-finish label reconciliation into wf with thin fallback (Lever F2) #140

## Context

The `code-review` body asks the model to hand-execute a deterministic
procedure in Step 10/10b: remove stale state labels, apply exactly the
verdict label, then read back the labels to verify and create-if-missing.
That is pure logic expressed as prose: it costs tokens on every run and is
the source of a recurring "model did the label dance slightly wrong" bug
class.

The repo already proved the better pattern: `wf` (pick, review-next,
update-next, post-merge, config) lifts deterministic workflow logic out of
the prompt into tested Python. This extends it to the review finish step.

See `docs/context-optimization-plan.md` → Lever F2. **Gated:** do this only
after Lever F1 (and ideally F3) have shown `wf` reliably carrying more of
the workflow, so we act on observed reliability, not faith.

## Requirements (acceptance criteria)

- [ ] New `wf review-finish --verdict <approved|changes-requested|needs-discussion>`
  subcommand performs the Step 10 label reconciliation + Step 10b readback
  verify (resolving label names by purpose key, guarded create-if-missing,
  no `--force`).
- [ ] `wf.py` change is covered by the offline decision-logic tests
  (label-set-in → label-set-out for each verdict, including the
  create-if-missing path).
- [ ] The body calls `wf review-finish` on the happy path and keeps a
  **thin inline fallback** ("apply the verdict label, remove the others" —
  just `gh` calls) for when `wf` errors or Python is absent. The verbose
  Step 10/10b prose is removed from the body.
- [ ] Python is **not** made a hard dependency; the thin fallback preserves
  graceful degradation.
- [ ] One real dry run confirms both the `wf review-finish` path and the
  inline fallback produce the correct label state.
- [ ] Quality gate green; `_shared-skills/` synced if applicable; versions
  bumped.
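
The decision logic behind the first two criteria can be sketched as a pure function, which is what makes the label-set-in → label-set-out tests runnable offline. This is a minimal sketch, not the `wf.py` implementation: the purpose-key table, the label names, and the names `plan_review_finish`, `apply_plan`, and `readback_ok` are all hypothetical, and the real command would resolve label names from config and execute the plan through `gh`.

```python
from dataclasses import dataclass

# ASSUMPTION: hypothetical purpose-key -> label-name table. The real names
# are resolved by purpose key from the repo's config, not hard-coded.
STATE_LABELS_BY_KEY = {
    "pending": "needs-review",
    "approved": "review-approved",
    "changes-requested": "changes-requested",
    "needs-discussion": "needs-discussion",
}
VERDICTS = ("approved", "changes-requested", "needs-discussion")


@dataclass(frozen=True)
class LabelPlan:
    add: frozenset     # labels to attach to the PR
    remove: frozenset  # stale state labels to detach
    create: frozenset  # labels missing from the repo (guarded create-if-missing)


def plan_review_finish(verdict, current_labels, repo_labels):
    """Compute the label changes for a verdict without touching the network."""
    if verdict not in VERDICTS:
        raise ValueError(f"unknown verdict: {verdict!r}")
    target = STATE_LABELS_BY_KEY[verdict]
    state_labels = set(STATE_LABELS_BY_KEY.values())
    current = set(current_labels)
    return LabelPlan(
        add=frozenset({target} - current),
        # Only state labels are removed; unrelated labels are left alone.
        remove=frozenset((current & state_labels) - {target}),
        # Create only when the label does not exist yet; never overwrite.
        create=frozenset({target} - set(repo_labels)),
    )


def apply_plan(current_labels, plan):
    """Label set expected on the PR after the plan has run."""
    return (set(current_labels) - plan.remove) | plan.add


def readback_ok(labels_after, verdict):
    """Step 10b-style check: exactly the verdict's state label is present."""
    state_labels = set(STATE_LABELS_BY_KEY.values())
    return set(labels_after) & state_labels == {STATE_LABELS_BY_KEY[verdict]}
```

Under these assumptions, a PR labelled `needs-review`, `changes-requested`, and `bug` that receives the `approved` verdict yields a plan that removes the two stale state labels, adds `review-approved`, and leaves `bug` untouched. Keeping the plan separate from its execution is one way to let the offline tests cover every verdict, including the create-if-missing path, without calling `gh`.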

## Savings goal

**Eliminate** (not defer) the ~30–40 lines of Step 10/10b label-reconcile
prose from the code-review body **and remove the associated bug class** —
label state becomes a tested code path. Goal: the model never hand-executes
the label dance on the happy path.

## Notes

Blocked by, and sequenced after, the Lever F1 story (#TBD), per the plan's
"earn wf-reliability evidence before pushing more logic down" principle.

