RFC: deepen milestones.py — split into a package + surgical eligibility test seam

## Problem

`src/codebugs/milestones.py` is the leakiest module in the codebase: **1718 lines, 18 MCP tools, 7 CLI commands, 4 tables** — a wide, shallow interface whose complexity rivals its implementation. Inside it sit **four bounded contexts that barely reference each other** except through a thin shared spine:

- **Foundation** — milestone/item/audit CRUD (`create_milestone`, `add_milestone_item`, `query_audit`, `_audit`, …)
- **Triage** — `triage_inbox/dismiss/promote` + the `_auto_route_finding` post-add hook
- **Capacity/Pull** — `_capacity_for`/`_eligibility_failure`/`_candidates`/`pull_next`/`release_item`/`get_wip_status`
- **Close-gate/branch** — `mark_branch_only`/`mark_integrated`/`milestone_defer`/`milestone_close`

**The concrete testability friction** (the thing this RFC fixes): two eligibility branches can't be tested in isolation today.
- `_eligibility_failure`'s `linked_frs` loop does raw `SELECT 1 FROM requirements`, forcing a `requirements` fixture for every matrix case.
- `_has_active_blocker` wraps `blockers.query_blockers` in `try/except: return False`, which **conflates "no active blocker" with "no blockers schema"** — the positive-blocker-ineligible case is unreachable without standing up the blockers schema.

So the highest-risk rule in the module (what's eligible to be pulled by a parallel agent) is buried behind I/O into sibling domains.

## Proposed Interface

Convert `milestones.py` → a `milestones/` package, **star topology**, public surface unchanged:

```
src/codebugs/milestones/
  __init__.py   # facade: imports contexts, re-export manifest, then the 4 register_* calls VERBATIM
  _schema.py    # MILESTONES_SCHEMA, SEED_MILESTONES, constants, ensure_schema  (ONE schema owner)
  _spine.py     # _audit, _validate_item_ref, _get_item_by_ref, _milestone_exists,
                #   _get_milestone, _row_to_milestone/_item/_audit   (stdlib + codebugs.types only)
  foundation.py # milestone + item + audit CRUD
  triage.py     # triage_*, _auto_route_finding (post-add hook)
  capacity.py   # THE DEEP ENGINE: capacity math, eligibility, _candidates, pull_next, release_item, wip
  closegate.py  # mark_branch_only, mark_integrated, milestone_defer, milestone_close
```

**Plus a surgical 2-accessor test seam** in the eligibility engine (the only injection — NOT full ports & adapters):

```python
# capacity.py — real 5-positional signature preserved; only the two foreign reads injectable.
def _eligibility_failure(conn, item, milestone, capacity, held, *,
                         has_active_blocker=_real_has_active_blocker,
                         requirement_exists=_real_requirement_exists) -> str | None:
    ...
```

- **Production:** `pull_next(...)` calls `_eligibility_failure(conn, item, milestone, capacity, held)` — defaults bind to the real (late-imported) reads. Capacity-floor math stays inline.
- **Tests:** inject `has_active_blocker=lambda ref: True/False`, `requirement_exists=lambda id: True/False` — drive the full eligibility matrix (incl. the positive-blocker case) with **zero findings/reqs/blockers schemas**.

**Hidden internally:** the 4-context structure, the `BEGIN IMMEDIATE` atomic-claim dance, capacity ledger arithmetic, the `_candidates` priority cascade, close-gate aggregation, audit-string formats.

## Dependency Strategy

**In-process** (SQLite, shared `conn`) for the package internals — merged directly; star topology with `_spine` as the one shared leaf, no context imports another, no new top-level `codebugs.db`/sibling imports (preserves today's late-import discipline). The two cross-domain eligibility reads (blockers, requirements) get a **minimal injected-accessor seam** (the 20% of ports-&-adapters that yields 80% of the win) so the engine can be tested with in-memory fakes; everything else stays plain in-process.

## Testing Strategy

- **New boundary tests:** eligibility matrix via `_eligibility_failure` with fake accessors (no sibling schemas), incl. positive-blocker-ineligible; capacity math via `pull_next`/`release_item`/`get_wip_status` on the 4 milestones tables only; close-gate matrix via `milestone_close`.
- **Frozen-surface regression test (load-bearing, lands FIRST):** asserts the exact 18 MCP tool names (10 explicit `name=` + **8 derived from inner `def` names** — `milestone_create/update/list/status/add_item/move_item/set_status/audit_query`) and 7 argparse subcommand strings. This is the only guard that catches a silent rename of a bare-`@mcp.tool()` foundation tool.
- **Re-export guard test:** imports every name (incl. `_get_item_by_ref`, `AUTO_ROUTER_ACTOR`) from `codebugs.milestones` so a missing `__init__` re-export fails at collection.
- **Old tests to replace:** `tests/test_milestones.py` cases that stand up findings+reqs+blockers *solely* to exercise eligibility → fake-driven engine tests.

## Implementation Recommendations

Durable guidance (not coupled to current line numbers):

- **Owns:** the milestones *domain* — one `register_schema("milestones", …, depends_on=("findings","reqs","blockers"))`, one `MILESTONES_SCHEMA` with its FKs + seed rows. Four files, one schema owner — sub-modules are one domain, not four.
- **Hides:** the four-context split, capacity/eligibility/atomic-claim mechanics, audit formats. External callers (`server.py`, `cli.py`, `db._ensure_modules_loaded`, the post-add hook, autosorter) see exactly today's surface.
- **Exposes (FROZEN):** byte-identical MCP tool names + CLI subcommand strings; `ensure_schema`, `register_tools`, `register_cli`, `_auto_route_finding`; and a complete `__init__` re-export manifest of every symbol the test suite touches.
- **Migration — value/risk order: regression-gate → seam → file moves** (strangler-fig, blame-preserving):
  0. Land the frozen-surface regression test (hard gate) before any file move.
  1. Introduce the 2-accessor seam in the still-single-file module + its fake-driven tests (most of the win, no move risk).
  2. `git mv milestones.py milestones/__init__.py` (pure move, preserves blame).
  3–5. Carve `_schema` / `_spine` / context files one commit each.
  - **Invariant:** every move commit re-imports the moved symbol into `__init__` *in the same commit*; no intermediate red commits; `uv run python -m pytest tests/ -v` green at every step.

## Provenance

Full design + the rejected alternatives (A pure-facade / B max-flexibility / C WorkDispatcher-class / D full ports-&-adapters) and the **3-agent adversarial review** (7/10, mandatory fixes applied — Adversarial Review Corrections appendix) live in:
`docs/superpowers/specs/2026-06-18-milestones-split-recommendation.md`

Candidate #3 from an `improve-codebase-architecture` pass over the codebase.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: deepen milestones.py — split into a package + surgical eligibility test seam #4

Problem

Proposed Interface

Dependency Strategy

Testing Strategy

Implementation Recommendations

Provenance

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

RFC: deepen milestones.py — split into a package + surgical eligibility test seam #4

Description

Problem

Proposed Interface

Dependency Strategy

Testing Strategy

Implementation Recommendations

Provenance

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions