Skip to content

RFC: deepen milestones.py — split into a package + surgical eligibility test seam #4

@faxik

Description

@faxik

Problem

src/codebugs/milestones.py is the leakiest module in the codebase: 1718 lines, 18 MCP tools, 7 CLI commands, 4 tables — a wide, shallow interface whose complexity rivals its implementation. Inside it sit four bounded contexts that barely reference each other except through a thin shared spine:

  • Foundation — milestone/item/audit CRUD (create_milestone, add_milestone_item, query_audit, _audit, …)
  • Triagetriage_inbox/dismiss/promote + the _auto_route_finding post-add hook
  • Capacity/Pull_capacity_for/_eligibility_failure/_candidates/pull_next/release_item/get_wip_status
  • Close-gate/branchmark_branch_only/mark_integrated/milestone_defer/milestone_close

The concrete testability friction (the thing this RFC fixes): two eligibility branches can't be tested in isolation today.

  • _eligibility_failure's linked_frs loop does raw SELECT 1 FROM requirements, forcing a requirements fixture for every matrix case.
  • _has_active_blocker wraps blockers.query_blockers in try/except: return False, which conflates "no active blocker" with "no blockers schema" — the positive-blocker-ineligible case is unreachable without standing up the blockers schema.

So the highest-risk rule in the module (what's eligible to be pulled by a parallel agent) is buried behind I/O into sibling domains.

Proposed Interface

Convert milestones.py → a milestones/ package, star topology, public surface unchanged:

src/codebugs/milestones/
  __init__.py   # facade: imports contexts, re-export manifest, then the 4 register_* calls VERBATIM
  _schema.py    # MILESTONES_SCHEMA, SEED_MILESTONES, constants, ensure_schema  (ONE schema owner)
  _spine.py     # _audit, _validate_item_ref, _get_item_by_ref, _milestone_exists,
                #   _get_milestone, _row_to_milestone/_item/_audit   (stdlib + codebugs.types only)
  foundation.py # milestone + item + audit CRUD
  triage.py     # triage_*, _auto_route_finding (post-add hook)
  capacity.py   # THE DEEP ENGINE: capacity math, eligibility, _candidates, pull_next, release_item, wip
  closegate.py  # mark_branch_only, mark_integrated, milestone_defer, milestone_close

Plus a surgical 2-accessor test seam in the eligibility engine (the only injection — NOT full ports & adapters):

# capacity.py — real 5-positional signature preserved; only the two foreign reads injectable.
def _eligibility_failure(conn, item, milestone, capacity, held, *,
                         has_active_blocker=_real_has_active_blocker,
                         requirement_exists=_real_requirement_exists) -> str | None:
    ...
  • Production: pull_next(...) calls _eligibility_failure(conn, item, milestone, capacity, held) — defaults bind to the real (late-imported) reads. Capacity-floor math stays inline.
  • Tests: inject has_active_blocker=lambda ref: True/False, requirement_exists=lambda id: True/False — drive the full eligibility matrix (incl. the positive-blocker case) with zero findings/reqs/blockers schemas.

Hidden internally: the 4-context structure, the BEGIN IMMEDIATE atomic-claim dance, capacity ledger arithmetic, the _candidates priority cascade, close-gate aggregation, audit-string formats.

Dependency Strategy

In-process (SQLite, shared conn) for the package internals — merged directly; star topology with _spine as the one shared leaf, no context imports another, no new top-level codebugs.db/sibling imports (preserves today's late-import discipline). The two cross-domain eligibility reads (blockers, requirements) get a minimal injected-accessor seam (the 20% of ports-&-adapters that yields 80% of the win) so the engine can be tested with in-memory fakes; everything else stays plain in-process.

Testing Strategy

  • New boundary tests: eligibility matrix via _eligibility_failure with fake accessors (no sibling schemas), incl. positive-blocker-ineligible; capacity math via pull_next/release_item/get_wip_status on the 4 milestones tables only; close-gate matrix via milestone_close.
  • Frozen-surface regression test (load-bearing, lands FIRST): asserts the exact 18 MCP tool names (10 explicit name= + 8 derived from inner def namesmilestone_create/update/list/status/add_item/move_item/set_status/audit_query) and 7 argparse subcommand strings. This is the only guard that catches a silent rename of a bare-@mcp.tool() foundation tool.
  • Re-export guard test: imports every name (incl. _get_item_by_ref, AUTO_ROUTER_ACTOR) from codebugs.milestones so a missing __init__ re-export fails at collection.
  • Old tests to replace: tests/test_milestones.py cases that stand up findings+reqs+blockers solely to exercise eligibility → fake-driven engine tests.

Implementation Recommendations

Durable guidance (not coupled to current line numbers):

  • Owns: the milestones domain — one register_schema("milestones", …, depends_on=("findings","reqs","blockers")), one MILESTONES_SCHEMA with its FKs + seed rows. Four files, one schema owner — sub-modules are one domain, not four.
  • Hides: the four-context split, capacity/eligibility/atomic-claim mechanics, audit formats. External callers (server.py, cli.py, db._ensure_modules_loaded, the post-add hook, autosorter) see exactly today's surface.
  • Exposes (FROZEN): byte-identical MCP tool names + CLI subcommand strings; ensure_schema, register_tools, register_cli, _auto_route_finding; and a complete __init__ re-export manifest of every symbol the test suite touches.
  • Migration — value/risk order: regression-gate → seam → file moves (strangler-fig, blame-preserving):
    0. Land the frozen-surface regression test (hard gate) before any file move.
    1. Introduce the 2-accessor seam in the still-single-file module + its fake-driven tests (most of the win, no move risk).
    2. git mv milestones.py milestones/__init__.py (pure move, preserves blame).
      3–5. Carve _schema / _spine / context files one commit each.
    • Invariant: every move commit re-imports the moved symbol into __init__ in the same commit; no intermediate red commits; uv run python -m pytest tests/ -v green at every step.

Provenance

Full design + the rejected alternatives (A pure-facade / B max-flexibility / C WorkDispatcher-class / D full ports-&-adapters) and the 3-agent adversarial review (7/10, mandatory fixes applied — Adversarial Review Corrections appendix) live in:
docs/superpowers/specs/2026-06-18-milestones-split-recommendation.md

Candidate #3 from an improve-codebase-architecture pass over the codebase.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions