🧹 Retire l7-benchmark + BENCHMARK.md from plugin repo; move benchmarks to the benchmarks repo #595

@ZaxShen

Why

Follow-up to #593/#594 (branch fix/retire-sonnet-model-default-delete-contradicted-benchmark-claim, trajectory issue #37). The plugin repo still carries a full flawed-era benchmark narrative that our newer campaign contradicts, and that narrative is intertwined with the rest of the docs: deleting docs/contributing/BENCHMARK.md alone breaks links in 9 files. Decision: retire it from the plugin repo entirely and make the separate benchmarks repo canonical for methodology and receipts.

Scope

  • Delete docs/contributing/BENCHMARK.md.
  • Delete / retire tests/l7-benchmark/ (the whole tree — README.md, RESULTS.md, tasks/, run-l7*.sh, lib/). Its canonical home is the benchmarks repo.
  • Move any still-useful methodology or per-task receipts (RESULTS.md) into the benchmarks repo first, so nothing measured is lost (honesty bar: receipts are preserved, not discarded).
  • Remove all now-dead links to BENCHMARK.md / tests/l7-benchmark: docs/reference/REFERENCE.md:17, and any others surfaced by tests/l1-lint/link-check.sh.
  • Strip the contradicted benchmark bullet in CHANGELOG.md (v0.7.0-rc.3 'Measured' section, line ~149: '~61% token reduction … 8/8 … beats raw 6/8 · $10.31 … See docs/contributing/BENCHMARK.md') — it repeats numbers the project has retracted and points at a deleted file.
  • Update the world-model / docs/ index summaries that mention contributing/BENCHMARK (e.g. docs/ README table).
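To size the link cleanup before deleting anything, the references listed above can be enumerated with a small scan. This is only a sketch and not a replacement for tests/l1-lint/link-check.sh, whose behavior is not described here. The function name `find_retired_references`, the substring matching, and the restriction to `.md` files are assumptions made for illustration.

```python
from pathlib import Path

# Substrings that indicate a reference to retired material (taken from the scope list).
RETIRED_NEEDLES = ("BENCHMARK.md", "tests/l7-benchmark")
# The retired files themselves are skipped: they are about to be deleted anyway.
RETIRED_FILES = ("docs/contributing/BENCHMARK.md",)
RETIRED_TREES = ("tests/l7-benchmark/",)
SKIP_DIRS = {".git"}


def find_retired_references(repo_root, needles=RETIRED_NEEDLES, suffixes=(".md",)):
    """Return (relative_path, line_number, stripped_line) for each matching line.

    Hypothetical helper: plain substring matching, so it can over-report
    (for example, a sentence that merely names the file without linking to it).
    """
    root = Path(repo_root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        rel = path.relative_to(root)
        rel_posix = rel.as_posix()
        if any(part in SKIP_DIRS for part in rel.parts):
            continue
        if rel_posix in RETIRED_FILES or rel_posix.startswith(RETIRED_TREES):
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if any(needle in line for needle in needles):
                hits.append((rel_posix, number, line.strip()))
    return hits
```

Calling `find_retired_references(".")` from the plugin repo root and printing each tuple as `path:line: text` gives a checklist of edits to make before the link check is rerun.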

Acceptance

  • bash tests/l1-lint/link-check.sh passes (no dangling links).
  • No flawed-era benchmark numbers remain in the plugin repo.
  • benchmarks repo holds the migrated receipts/methodology.
  • Editing CHANGELOG history is acceptable here because the figures are formally retracted; a human reviews the PR.
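The "no flawed-era benchmark numbers remain" criterion can be spot-checked mechanically. The sketch below searches Markdown files for two fragments quoted in the CHANGELOG bullet above. Whether those two fragments cover every retracted figure is an assumption, and `files_with_retracted_figures` is a hypothetical name, so treat a clean result as a hint and not as proof.

```python
from pathlib import Path

# Fragments quoted from the retracted CHANGELOG bullet. Assumption: these are
# distinctive enough to flag leftovers; the full list of retracted figures
# may be longer.
RETRACTED_FRAGMENTS = ("~61% token reduction", "$10.31")


def files_with_retracted_figures(repo_root, fragments=RETRACTED_FRAGMENTS):
    """Map each Markdown file (relative path) to the retracted fragments it contains."""
    root = Path(repo_root)
    offenders = {}
    for path in sorted(root.rglob("*.md")):
        rel = path.relative_to(root)
        if ".git" in rel.parts or not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        found = [fragment for fragment in fragments if fragment in text]
        if found:
            offenders[rel.as_posix()] = found
    return offenders
```

An empty dictionary from `files_with_retracted_figures(".")` after the cleanup is consistent with the acceptance criterion; any entry names a file that still needs editing.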

Notes

Keep the tests/l7-benchmark/ runner only if there's a reason to keep an L7 layer in-repo; current decision is full move to the benchmarks repo. Confirm at PR time.

Labels: Docs (Documentation-only changes)
