diff --git a/CURRENT.md b/CURRENT.md
index 1854562..60de7c0 100644
--- a/CURRENT.md
+++ b/CURRENT.md
@@ -4,7 +4,7 @@
 
 - Project: `security-scanner`
 - Merge mode: `guarded-auto-merge`
-- Active goal: `personal-prod-deploy`
+- Active goal: `ghas-quality-vuln-parity`
 - Last auto merge: `ledger:20260617T003405Z-autopilot-3236f4`
 - Ledger entries: `4`
 - Ledger index hash: `sha256:e1893a649a1101b74a087b5eaaa275813a85708c5bb46c4ae70c24e10a111050`
diff --git a/docs/workbench/agentic-workflows/2026-06-21-ghas-quality-vuln-parity-goal.md b/docs/workbench/agentic-workflows/2026-06-21-ghas-quality-vuln-parity-goal.md
new file mode 100644
index 0000000..3dba8ef
--- /dev/null
+++ b/docs/workbench/agentic-workflows/2026-06-21-ghas-quality-vuln-parity-goal.md
@@ -0,0 +1,157 @@
+# Agentic Workflow: GHAS-grade vuln/SAST detection quality (CodeQL parity SLO)
+
+**Status:** Ready for long single-goal execution
+**Date:** 2026-06-21
+**Goal ID:** `ghas-quality-vuln-parity`
+**Spec:** `docs/workbench/specs/ghas-quality-vuln-subtrack/{requirements,design,review}.md`
+**Merge flow:** pull request
+
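+Execution packet for a long-running single goal. It builds a measurement harness and an FP-suppression quality
+machine that align vuln/SAST detection with the **GHAS code-scanning (CodeQL) parity SLO**. The proven two-layer structure of the secret subtrack (PR #58) carries over 1:1, except that,
+**specific to vuln, durable disposition moves out of the autonomous layer into the H-track** (`VulnerabilityFinding` is not loaded into the durable
+store, and `set_finding_disposition` raises ValueError when FINDING_STATE is missing, which triggers the storage-projection
+stop-condition). Real code-scanning live-fetch is a stop-condition; commits are synthetic-or-redacted-only.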
+
+## Goal
+
+Complete, via TDD on synthetic fixtures, a per-repo 1:1 CodeQL parity measurement harness for vuln/SAST, an inline FP-suppression tier, an enforced synthetic regression gate, and
+the wiring of a report-only parity gate, then close it out through PR/CI/merge.
+
+**Completion criteria (autonomous goal done = M3):**
+
+- **M1**: code-scanning domain model `CodeScanAlertRecord` (redacted) + matcher
+  `compare_codescan_alerts_with_findings` (CWE intersection, 3 tiers: matched-by-cwe/by-rule-token/unmatched) +
+  adversarial fixtures. **Precision/recall reuses `core/vulnerability/evaluation.py` (zero lines of new computation code)**,
+  reached only through a `CodeScanAlertRecord→VulnerabilityEvaluationKey` adapter. The line-window is a true `|alert−finding|≤N`;
+  the recall denominator counts open+fixed alerts only, and the precision penalty for dismissed alerts is tallied separately. Zero network access.
+- **M2**: inline cheap tier (scan-vuln post-processing: code_flow_count, severity floor, low-confidence rule suppression). Only the parts that are deterministic,
+  metadata-only, and guaranteed by suppression-rate regression are default-on; new suppression that changes behavior is gated. Expand the synthetic corpus
+  to five classes (SQLi/XSS/path-traversal/command-injection/SSRF) and apply rule-class normalization. Existing scan-vuln
+  default output is unchanged (canary TPs preserved).
+- **M3 (autonomous done)**: enforce the synthetic regression gate (evaluate precision≥0.90/recall≥0.99) and wire the report-only parity
+  gate `governance.vuln_parity_slo --check` (threshold yml absent → report-only, measured against a frozen synthetic snapshot,
+  age > threshold → stale-degraded). Prove deterministic reproduction without a real snapshot.
+- The existing Gitleaks-first secret path and the existing vuln scan/import/report/gate default paths are unchanged.
+- No GHAS trigger/upload/alert mutation/**live-fetch**. Architecture review (pre/post-M2/post-M3/final)
+  has zero blocking findings. PR CI and the local governance gate pass.
+
+**H-track (outside the autonomous loop, stop-condition PRs):** H1 acquire a real code-scanning snapshot → H2 baseline +
+fixture-vs-real divergence → H3 finalize targets + parity enforce → **H4 wire durable disposition for vuln verdicts (storage
+projection)**.
+
+## Execution Contract
+
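+- Run M1~M3 to completion as one long-running goal. No intermediate approvals. Humans intervene only on a stop-condition.
+- Use subagents actively (implementation worker gpt-5.5/high; auxiliaries per repo policy). Open a PR and take it to mergeable after CI passes.
+- Never commit a real endpoint/host/credential/private path/real SARIF/real code-scanning export/real finding.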
+
+## Fixed Decisions
+
+- Scope: vuln autonomous M1~M3 (synthetic-only). Real fetch, baseline, enforce, and durable disposition belong to the H-track.
+- Measurement: CodeQL code-scanning alert oracle, per-repo 1:1, snapshot = ground truth (frozen synthetic). Computation
+  reuses `core/vulnerability/evaluation.py` (no fourth engine). Synthetic evaluate and the parity matcher share the same computation core.
+- Matching: rule-class normalization + line-window apply with identical semantics to both the synthetic gate and parity (VFR8 alignment).
+- Inline tier: deterministic, metadata-only parts are default-on; behavior changes are gated. There is no validity-check analog.
+- **No durable disposition (autonomous)**: in v1 autonomous, vuln verdicts stay in the existing throwaway JSONL. Durable
+  persistence needs a storage projection → `storage-projection-or-schema-migration-required` stop → H4.
+- Snapshot: commit only synthetic redacted fixtures (`source: synthetic` marker required; fail-closed if missing). Real
+  snapshots are double-blocked by `.gitignore` and exclusion from allowed_writes.
+- **No autonomous edits to core governance**: allowed_writes covers only `governance/vuln_parity_slo.py` (a separate file from the secret track's
+  `governance/parity_slo.py`). If `autopilot_goal.yml`, `autopilot_gate.py`, or `public_safety.py`
+  needs a change, stop (scope-expansion) → human PR.
+- Slot: autonomous code merges without the active_goal slot (on merge, the three governance files take main (theirs)). Switching the actual slot is the user's decision.
+
+## Required Architecture Review Gate
+
+Mandatory blocking. pre-implementation / post-M2 / post-M3 / final. Stop only when a SoT change, scope expansion, unsafe
+data, or a change to an existing default is required; otherwise fix within the goal.
+
+## Multi-agent Execution Model
+
+Assign subagents disjoint responsibilities (Worker A: matcher/model; Worker B: inline tier; Worker C: synthetic gate + parity_slo;
+architecture/security reviewer, read-only; code_simplifier). The main agent integrates and makes the final call.
+
+## Allowed Write Surface
+
+The `allowed_writes` in `governance/autopilot_goal.yml` is authoritative. Summary: the promoted spec, this workflow document,
+src/tests/eval/examples, `governance/vuln_parity_slo.py` (new gate only), ledger, CURRENT.md. **Not a `governance/**`
+wildcard**: any other governance change is a scope-expansion stop.
+
+## Suggested Work Plan
+
+### Readiness (M0 = goal-setup, already performed by the orchestrator)
+The orchestrator has completed goal-setup (spec promotion + autopilot_goal.yml goal_id + current.yml active_goal + CURRENT.md in one atomic commit).
+You start from the pre-implementation architecture review.
+
+### M1 Measurement substrate
+1. red-first: matcher scoring for CWE/rule-token/line-window/dismissed; on adversarial fixtures (missing CWE/line drift/
+   differing CodeQL↔Semgrep rule.id/dismissed), a missing normalization, window, or filter goes red; precision/recall is
+   computed by `core/vulnerability/evaluation.py`; denominators are state-aware.
+2. Implement: CodeScanAlertRecord, the adapter, the matcher (zero lines of new precision/recall computation). Fix line-window N via fixtures.
+
+### M2 Inline tier + synthetic hardening
+1. red-first: FP suppression on safe code + recall retained on vulnerable code (evaluate gate), default-on does not break recall≥0.99,
+   existing default output unchanged, regression on independent adversarial pairs.
+2. Implement: inline gating (default-on/gated boundary), five-class synthetic corpus + rule-class normalization. post-M2 review.
+
+### M3 Synthetic gate + parity_slo (autonomous done)
+1. red-first: synthetic regression gate enforce; `governance/vuln_parity_slo.py` report-only (threshold absent),
+   measured against a frozen synthetic snapshot, stale-degraded.
+2. Implement: vuln_parity_slo.py. final review → PR. Record "parity SLO enforce pending H-track" in CURRENT.md.
+
+## Required Local Checks
+
+```bash
+uv run pytest
+uv run python -m governance.render --validate
+uv run python -m governance.render --check
+uv run python -m governance.rebuild_ledger_index --check
+uv run python -m governance.render_github_ruleset --output governance/main_ruleset.json --check
+uv run python -m governance.public_safety --diff origin/main...HEAD
+uv run python -m governance.public_safety --path docs/workbench/specs/ghas-quality-vuln-subtrack
+uv run python -m governance.vuln_parity_slo --check
+uv run python -m governance.autopilot_gate --base origin/main
+```
+
+## Stop Conditions
+
+`stop_conditions` in `governance/autopilot_goal.yml` (canonical set of 16). Key ones: ghas-live-fetch-or-mutation-required
+(H1 real fetch), **storage-projection-or-schema-migration-required** (durable disposition, durable snapshot →
+H4), existing-secret-default-behavior-change, architecture-review-blocking-finding, public-safety-hit,
+scope-expansion (edits to core governance files), same-blocker-three-times, break-glass.
+
+## Resume Prompt
+
+```text
+Goal: complete `ghas-quality-vuln-parity` in the security-scanner repo through a PR.
+
+Read first:
+- AGENTS.md
+- governance/autopilot_goal.yml
+- docs/workbench/agentic-workflows/2026-06-21-ghas-quality-vuln-parity-goal.md
+- docs/workbench/specs/ghas-quality-vuln-subtrack/{requirements,design,review}.md
+- src/security_scanner/core/vulnerability/{evaluation,model}.py
+- src/security_scanner/baseline/ghas_api/__init__.py
+- src/security_scanner/runtime/vulnerability_verify_artifact.py
+- src/security_scanner/cli/commands (import-sarif/scan-vuln/report/gate/evaluate)
+
+Implement M1~M3 (autonomous, synthetic fixtures only, no real GHAS/code-scanning):
+M1 CodeScanAlertRecord + compare_codescan_alerts_with_findings matcher (CWE 3-tier) + adversarial
+   fixtures. Reuse core/vulnerability/evaluation.py (zero new precision/recall code). True |line|<=N
+   window, state-aware denominators.
+M2 inline cheap tier (metadata-only default-on / gated for behavior change), synthetic corpus 5 CWE
+   classes + rule-class normalization. Existing scan-vuln default output unchanged.
+M3 synthetic regression gate enforce + report-only parity gate governance.vuln_parity_slo --check.
+
+Do NOT: durable-persist vuln verdict (storage projection -> H4 human-gated), call/fetch GHAS code-
+scanning, commit real SARIF/findings, modify governance/autopilot_goal.yml | autopilot_gate.py |
+public_safety.py (allowed_writes = governance/vuln_parity_slo.py only), change existing secret/vuln
+scan defaults. Real snapshot fetch, baseline, enforce, durable disposition are human-gated H1~H4.
+Use multi-agent. Mandatory architecture gates: pre-implementation, post-M2, post-M3, final. Finish
+by opening a PR, waiting for CI, merge when green. Autonomous done = M3; record "parity SLO enforce
+pending H-track" in CURRENT.md.
+
+Required checks: pytest; render --validate/--check; rebuild_ledger_index --check;
+render_github_ruleset --check; public_safety --diff and --path docs/workbench/specs/ghas-quality-vuln-
+subtrack; vuln_parity_slo --check; autopilot_gate --base origin/main.
+```
diff --git a/docs/workbench/specs/ghas-quality-vuln-subtrack/design.md b/docs/workbench/specs/ghas-quality-vuln-subtrack/design.md
new file mode 100644
index 0000000..e840ba3
--- /dev/null
+++ b/docs/workbench/specs/ghas-quality-vuln-subtrack/design.md
@@ -0,0 +1,426 @@
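+# GHAS-grade detection quality track: VULN/SAST subtrack Design (v2, review incorporated)
+
+> SoT: `requirements.md` (same dir as this file, locked). This document is designed so that **the autopilot sequences
+> the work autonomously as a single goal** and meets the autonomous-layer spec. Written 2026-06-21.
+> v2: incorporates the multi-agent review (31 findings: 6 blocker, 13 major, 8 minor, 4 nit). See `review.md`.
+> Parent/sibling: `.claude/specs/20260620-ghas-quality-track/` (secret subtrack design v2 + review). Nearly every
+> blocker/major in this track is the **vuln parallel** of an item the secret track (PR #58) caught and resolved in review, so
+> the skeleton of v2 is a 1:1 carry-over of the secret design v2 structure.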
+
+## 0. Single Goal (autopilot north star)
+
+> **Bring vuln/SAST detection quality up to the GHAS code-scanning (CodeQL) parity SLO.** Execution is split into **two layers**
+> (carried over from the proven secret-track structure, review AP-01):
+>
+> - **Autonomous goal done (autopilot single goal, M1~M3)**: build and prove the parity matcher + inline cheap tier + synthetic regression gate
+>   via TDD on **synthetic/fixture data only**, up through **wiring the report-only parity gate on synthetic
+>   fixtures**. No contact with real GHAS. Done on PR merge.
+> - **Human-gated operations layer (H1~H4, stop-condition PRs)**: acquire a real code-scanning snapshot → measure the baseline
+>   → finalize measure-first targets → switch parity to enforce + **wire durable disposition for vuln verdicts**. Outside the
+>   autonomous loop.
+
+**Done definition clarified (review `report-only-enforce-unreachable`/AP-01/AP-08)**: autonomous goal done = **M3**
+(matcher + inline tier + synthetic regression gate enforce + report-only parity gate wiring, proven on synthetic data, PR
+merge). The v1 done of requirements V-Q9 (baseline measured + targets met + real parity enforce) holds **only after H1~H4
+complete**. On PR merge, state "parity SLO enforce pending H-track" in CURRENT.md.
+
+**Vuln-specific aggravating factor (review AP-02/ARCH-VULN-01)**: in the secret track, disposition was already durably wired, so M3 (LLM
+tier disposition) sat in the autonomous layer. Vuln findings, however, are **never loaded** into the durable store (zero references to
+`VulnerabilityFinding` in store.py), so creating a durable disposition requires a new storage projection, which is the
+`storage-projection-or-schema-migration-required` stop-condition. Therefore **durable disposition leaves the
+autonomous layer and is deferred to the H-track** → the vuln autonomous scope is narrower than the secret track's.
+
+## 1. Architecture overview (current assets → target)
+
+```
+                        ┌────────────────── Autonomous layer (autopilot single goal, M1~M3) ───────────────────┐
+  scan-vuln             │ [parity matcher] CodeScanAlertRecord ↔ VulnerabilityFinding                          │
+  (Semgrep-compat)      │  rule-class (CWE) + line-window matching → **reuses                                  │
+  VulnerabilityFinding[]│  core/vulnerability/evaluation.py** (zero new precision/recall code), adapter only   │
+                        │ [inline cheap tier] gate.py extension (severity/precision/code_flow floor)           │
+                        │  default-on deterministic part kept separate from the gated new part                 │
+  synthetic             │ [synthetic regression gate] evaluate --category code-vuln (exists, to be hardened)   │
+  code-vuln corpus      │  rule-class + line-window EvaluationKey semantics (shared with parity)               │
+  + fixture ───────────►│ [CI gate] governance.vuln_parity_slo --check: threshold absent → report-only         │
+                        │  (measure + report on synthetic fixtures only); present → enforce (after H-track)    │
+                        │  snapshot age > threshold → stale-degraded                                           │
+                        └──────────────────────────────────────────────────────────────────────────────────────┘
+                        ┌────────────── Human-gated operations layer (H1~H4, stop-condition PR) ───────────────┐
+  real GHAS             │ baseline/ghas_api (GET-only) → code-scanning fetch (NEW) → real redacted             │
+  code-scanning         │ frozen snapshot (local, uncommitted) → baseline measurement + fixture-vs-real        │
+  (human-PR)            │ divergence report → finalize targets → switch parity to enforce                      │
+                        │ + vuln verdict **durable disposition wiring** (new storage projection)               │
+                        └──────────────────────────────────────────────────────────────────────────────────────┘
+```
+
+**Invariants (shared with the secret track):** snapshot = ground truth, per-repo 1:1, measure-first, human-PR fetch
+gate, public-safety redaction ([[vuln-redaction-design]]), GhApiRunner GET-only contract.
+
+**New vuln-specific components:** code-scanning fetch (VFR2, H-track), rule-class matcher (VFR1/V-Q2, autonomous),
+vuln durable disposition (VFR5/V-Q6, **deferred to the H-track**, review C), dismissed_reason scoring (VFR6/V-Q4).
+
+## A. Autopilot Execution Shape: reflected into `governance/autopilot_goal.yml` at goal-setup (incorporates review blockers AP-03/vuln-sot-path/vuln-governance-wildcard)
+
+> **Directive: do not copy the items below verbatim; take the current `phase-2a` goal.yml as the base template and layer only the diff on top**
+> (carried over from secret major `acceptance-checks-drift`). This prevents dropped gates.
+
+- `goal_id`: `ghas-quality-vuln-parity`
+- `execution_mode`: `long-single-goal` / human_gate: `stop-conditions-only` / merge_flow: `pull-request`
+- **SoT location decision (blocker `vuln-sot-path-gitignored-gate-blind`/VD-01)**: **promote (migrate)** the reviewed spec
+  to **`docs/workbench/specs/ghas-quality-vuln-subtrack/`** and track it in git. The current
+  `.claude/specs/` falls under `.gitignore:72` (`.claude/*`, with only skills excepted), so the gate blocks it as `outside allowed_writes`
+  and public_safety cannot scan the SoT (confirmed with `git check-ignore`). Keep the grill originals in `.claude/specs/`
+  and promote only the committed copy. **State this promotion explicitly as an M0/goal-setup deliverable.**
+- `allowed_writes` (whitelist): `docs/workbench/specs/ghas-quality-vuln-subtrack/**`,
+  `docs/workbench/agentic-workflows/2026-06-21-ghas-quality-vuln-parity-goal.md`,
+  `src/security_scanner/**`, `tests/**`, `eval/**`, `ledger/**`, `CURRENT.md`,
+  **`governance/vuln_parity_slo.py` (new gate only)**.
+  **No `governance/**` wildcard (blocker `vuln-governance-wildcard-self-modify`)**: autonomous edits to `autopilot_goal.yml`,
+  `autopilot_gate.py`, and `public_safety.py` are forbidden (Fixed decision); use a human PR if needed. The secret track's
+  `governance/parity_slo.py` and the vuln `governance/vuln_parity_slo.py` are **separate files** (not shared; pinned in §5).
+- `acceptance_checks` (aligned 1:1 with phase-2a, diff only): architecture-review **pre/post-M2/post-M3/final**
+  + `pytest` + `render --validate/--check` + `render_github_ruleset --check` +
+  `rebuild_ledger_index --check` + `public_safety --diff origin/main...HEAD` +
+  **`public_safety --path docs/workbench/specs/ghas-quality-vuln-subtrack`** +
+  `autopilot_gate --base origin/main` + **new `governance.vuln_parity_slo --check`** (report-only→enforce).
+- `stop_conditions`: **start from the current canonical set of 16** and list the ones in force for this track:
+  `ghas-live-fetch-or-mutation-required` (M4/H1 real fetch), `storage-projection-or-schema-migration-required` (**vuln durable
+  disposition and the durable snapshot load path; review AP-02/VD-06**), `existing-secret-default-behavior-change`,
+  `architecture-review-blocking-finding`, `public-safety-hit`, `scope-expansion`,
+  `same-blocker-three-times`, `break-glass`, and so on. Arbitrary subsets are not allowed.
+- **goal-setup atomicity (review AP-03/AP-04, deadlock lesson from memory)**: the goal-setup commit must update the `autopilot_goal.yml`
+  goal_id, the `current.yml` active_goal, and `CURRENT.md` **together in a single commit** (`render.py` fails
+  validation when `active_goal != goal_id`, and `current.yml` is outside allowed_writes, so the autopilot cannot resolve it
+  on its own). Fixed decision.
+- **Slot strategy (review major AP-04/`vuln-active-goal-slot-eviction`)**: the `current.yml` active_goal is currently
+  occupied by `personal-prod-deploy` (confirmed). The vuln **autonomous code (M1 matcher, M2 inline tier, synthetic corpus, M3 report-only
+  gate) can, following the secret-track pattern, merge without the active_goal slot** as ordinary PRs (claude/* branches are exempt from autopilot-gate),
+  and **the three governance files (autopilot_goal.yml, current.yml, CURRENT.md) take main (theirs) and
+  stay byte-identical** (avoiding self-modification). Whether to actually switch the slot over to the vuln goal is **the user's
+  decision (stop/escalate)**: only after personal-prod-deploy completes or with the user's approval. Only H1~H4 (real fetch, durable
+  projection) are split out as slices that need the slot/human-gate (consistent with the two-layer split in B).
+
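+## B. Autonomous layer / H-track two-layer split (review blocker AP-01)
+
+The two-layer split of secret design v2, autonomous (M0~M5) / human-gated (H1~H3), carries over 1:1, except that, specific to vuln, **M3 durable
+disposition moves out of the autonomous layer into the H-track** (review AP-02) → the autonomous scope is narrower than the secret track's.
+
+- **Autonomous layer (M1~M3)**: zero network, synthetic/fixture only, default-off/synthetic-first. Parity matcher + inline cheap
+  tier + synthetic regression gate + **report-only parity gate wiring** (deterministic reproduction proven on synthetic fixtures).
+  No contact with real GHAS. Done on PR merge.
+- **H-track (H1~H4)**: acquire a real code-scanning snapshot (H1) → baseline measurement + divergence report (H2) →
+  finalize targets + switch parity to enforce (H3) → **wire durable disposition for vuln verdicts (H4, storage projection)**.
+  Outside the autonomous loop, stop-condition PRs.
+
+**Autonomous goal done, redefined**: the synthetic regression gate is **enforced** and the report-only parity gate is reproduced deterministically on synthetic
+fixtures (wiring proven without a real snapshot). Reaching the real parity SLO comes after the H-track completes. On PR merge,
+state "parity SLO enforce pending H-track" in CURRENT.md.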
+
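+## 2. Verifiable milestones (autonomous layer M1~M3 + H-track H1~H4)
+
+Each milestone has an **independently verifiable definition of done**. **M1~M3 proceed autonomously against frozen fixtures/synthetic data**.
+**H1~H4 involve real GHAS, the baseline, and durable projection, so they are isolated behind human-PR/stop-condition gates**.
+
+### M1. Measurement substrate: code-scanning domain model + matcher + adversarial fixtures (autonomous)
+
+Purpose: a pure-logic layer that represents GHAS code-scanning alerts in redacted form and matches them against our findings. No
+network, so it is fully covered by synthetic data/fixtures.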
+
+Tasks:
+- `CodeScanAlertRecord` (new, in `storage/base.py` or `core/vulnerability/`): redacted fields only:
+  `repository`, `alert_number`, `rule_id`, `security_severity_level`, `cwe_ids` (rule.tags→CWE),
+  `state`, `dismissed_reason`, `location_path`, `location_start_line`, `location_end_line`,
+  `fetched_at`, `source_tool="ghas-code-scanning"`. (Parallel to the secret `GhasAlertRecord`.)
+- `CodeScanComparisonKey`: `(repository, file_path, line_window, normalized_rule_class)`. See §4.2.
+- Matcher `compare_codescan_alerts_with_findings(...)`: **3-tier tally** (matched-by-cwe / matched-by-rule-token
+  / unmatched). **Precision/recall reuses `core/vulnerability/evaluation.py`** (review D, §D).
+- **Adversarial fixtures (review F)**: cases that **go red** when normalization/line-window/filtering is missing: (a) CWE-absent,
+  rule-token-only; (b) source/sink line drift (outside the line-window → verifies the window definition); (c) CodeQL↔Semgrep
+  same vulnerability, different rule.id (unmatched if normalization is missing); (d) dismissed_reason cases.
+
+**done:**
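+- `CodeScanAlertRecord` + matcher unit tests green. CWE matching/rule-token fallback/line-window are verified by fixtures,
+  and the dismissed_reason scoring path (VFR6) is pinned by tests. Zero network.
+- **Invariant (review D, `parity-harness-third-engine`/`precision-recall-primitive`)**: **zero lines** of new precision/recall or
+  gate computation code; reuse `core/vulnerability/evaluation.py` (VulnerabilityEvaluationResult.precision/
+  recall + threshold). `CodeScanAlertRecord`→`VulnerabilityEvaluationKey` converges **only through an adapter**.
+- **Invariant (review F)**: on the adversarial fixtures above, missing normalization/window/filter is red and the normal cases are green.
+- **Invariant (review H, denominator formula)**: recall denominator = open+fixed CodeQL alerts only; precision penalty =
+  the count of our findings raised at dismissed (fp/used-in-tests) locations, accumulated separately. This formula is pinned in the matcher/tests (§4.2).
+- **Invariant (review G, VD-07)**: line-window N is fixed in M1 to a fixture-derived value (closing the open question). The CWE-missing rate and
+  the by-rule-token rescue rate are exposed as matcher result metadata.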
+
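+### M2. Inline FP-suppression tier + synthetic corpus hardening (autonomous)
+
+Purpose: apply zero-cost inline gating to scan-vuln and grow the synthetic corpus to a size at which the recall SLO is meaningful.
+
+Tasks:
+- Extend inline gating (`gate.py` or scan-vuln post-processing): factor in the `code_flow_count` signal (trace = reachability evidence)
+  and suppress low-confidence rule.ids and findings under the INFO/LOW severity floor. **There is no validity-check analog** (V-Q3);
+  this is purely metadata-based.
+- **default-on / gated boundary (review K, `vuln-existing-scan-default-invariance`)**: only the parts that are deterministic, metadata-only,
+  and guaranteed by suppression-rate regression tests (the gate already treats INFO/LOW as non-blocking) are **default-on**; **new rule
+  suppression that changes behavior** (newly making some rule non-blocking, suppressing HIGH findings without a code_flow) is **gated/opt-in**.
+- Expand the synthetic corpus: extend the vulnerable/safe code pairs + expected-findings in `eval/synthetic-code-vuln/` across the core CWE
+  classes. **Apply rule-class normalization (V-Q2) to the synthetic expected findings as well** (§E).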
+
+**done:**
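+- Inline gating suppresses FPs on safe code in the synthetic corpus (precision↑) while retaining recall on vulnerable code, as shown by
+  `evaluate` (precision≥0.90/recall≥0.99 gate). Regression tests green.
+- **Invariant (review VD-07)**: the synthetic corpus covers the **five classes SQLi/XSS/path-traversal/command-injection/SSRF**
+  (pinned as a done criterion; the '≥N' placeholder is removed).
+- **Invariant (review K, default-on safety)**: default-on changes do not break the synthetic regression gate (recall≥0.99), i.e.
+  **canary TPs are preserved** (core TPs that are not out-of-rule are never suppressed). Existing scan-vuln **default output is unchanged**
+  (synthetic regression pins that previously surfaced findings are not suppressed without authorization). Decide explicitly whether a default-on change
+  hits a stop-condition (scope-expansion, existing-default-change).
+- **Invariant (review F)**: on adversarial vulnerable/safe pairs written **independently** of our rules (out-of-rule CWEs, recall<1 by design),
+  a regression miss is red.
+- **Invariant (review nit `vuln-llm-input-leak`)**: even if more inline signals are fed to the verifier, the redacted-metadata
+  contract holds (traces as count/shape only; no plaintext related_location paths).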
+
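+### M3. Synthetic regression gate + report-only parity gate wiring (autonomous, autonomous goal done)
+
+Purpose: regression prevention. Enforce the synthetic regression gate, and wire the parity gate as **report-only on synthetic fixtures**
+without a real snapshot (the endpoint of the autonomous goal).
+
+Tasks:
+- Synthetic regression gate (`evaluate --category code-vuln`, recall≥0.99/precision≥0.90): already exists; verify and harden
+  its CI wiring. **Matching semantics are the same rule-class + line-window EvaluationKey as parity** (§E).
+- **New gate `governance/vuln_parity_slo.py`**: reproducible measurement against a frozen code-scanning snapshot.
+  **If the threshold yml is absent/empty, report-only** (measure and report on synthetic fixtures only); if present, enforce
+  (after the H-track baseline). **If snapshot age > threshold, `stale-degraded`** (no silent pass; scan-health precedent).
+  In the autonomous layer, **synthetic redacted fixtures** prove that the gate path reproduces deterministically.
+
+**done:**
+- CI **enforces** the synthetic regression gate, and `vuln_parity_slo --check` deterministically reproduces report-only
+  measurement and reporting on synthetic fixtures (zero network). Snapshot age is exposed. No silent staleness.
+- **This is the point of autonomous goal done**: PR merge. State "parity SLO enforce pending H-track" in CURRENT.md.
+- Final architecture review passes.
+
+> ⚠️ **Autonomous goal done ends here.** Reaching the real parity SLO (measure-first targets) comes after the H-track below completes.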
+
+### H1. Acquire a real code-scanning snapshot (HUMAN-PR gate)
+
+> ⚠️ Requires GHAS live-fetch → `ghas-live-fetch-or-mutation-required` stop-condition. After autonomous
+> goal done (M3), the autopilot **stops here and requests a human PR**.
+
+Tasks:
+- Add code-scanning fetch to `baseline/ghas_api`: `fetch_codescan_alert_records(target, api,
+  tool_name="CodeQL")` → `GET /repos/{repo}/code-scanning/alerts?tool_name=CodeQL&state=...`.
+  - **Reuse (review nit ARCH-VULN-03)**: the `GhApiRunner.get_json` GET-only guard + the pagination helper +
+    the `_sanitize_error` redaction pattern.
+  - **New**: the `CodeScanAlertRecord` model, a code-scanning normalization function, and a **new `compare-codescan` CLI**
+    (review VD-05: `compare-ghas` hard-wires `secret-scanning/alerts` and has no `--category` registered, so it cannot be
+    reused; the new path is the default).
+  - Extract rule.tags→CWE, preserve dismissed_reason, do not fetch dismissed_comment or the raw message (public safety). Use the `ref`
+    parameter to align the comparison universe (default-branch HEAD).
+- Pin the fetch result as a **redacted frozen snapshot**. **Storage medium (review ARCH-VULN-03, snapshot-redaction)**:
+  real snapshots are **kept only under a gitignored private path and never committed** (as in the secret track, double-blocked by `.gitignore` + exclusion from allowed_writes);
+  only synthetic fixtures are committed. Loading the snapshot into the durable store would hit the storage-projection
+  stop-condition, so it stays a frozen file.
+
+**done:** via human PR, the CodeQL alerts of ≥1 GHAS code-scanning-enabled repo are acquired as a redacted frozen
+snapshot. The snapshot passes the public-safety check (**but a public_safety pass is only a secondary check; relative-path leakage
+is blocked by the gitignore first line of defense (review snapshot-redaction)**). H2 then measures against this snapshot.
+
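+### H2. Parity baseline measurement + SLO finalization + divergence report (HUMAN-PR gate)
+
+Purpose: measure-first (V-Q9). Measure the precision/recall gap of the current scan-vuln against the frozen snapshot and finalize the real
+parity SLO targets.
+
+Tasks:
+- Use the M1 matcher to compute per-repo precision/recall between the frozen snapshot and our scan-vuln results (per tier: by-cwe/by-rule-token
+  /unmatched). Aggregation: micro→macro, aligned with the secret track.
+- Report baseline numbers → finalize a realistic SLO: recall ≥ Y% of CodeQL; precision: the non-detection rate at dismissed-FP locations.
+- **One-time fixture-vs-real divergence report (review F)**: report any gap between the synthetic recall SLO and the real parity baseline
+  (a self-fulfilling signal). Expose the **CWE-missing rate, by-rule-token rescue rate, and language mix of the measured population** (review
+  `codeql-only-oracle-language-bias`) as first-class baseline metadata.
+
+**done:** baseline precision/recall report generated (against the frozen snapshot, deterministically reproducible). The real parity SLO
+targets are recorded as final in the document. Divergence and language-mix metadata exposed. Zero network (reuses the H1 snapshot).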
+
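+### H3. Switch parity to enforce (HUMAN-PR gate)
+
+Tasks:
+- Commit the SLO thresholds finalized in H2 to the threshold yml of `governance/vuln_parity_slo.py` → switch report-only→enforce.
+  A regression below the finalized SLO blocks the PR. State the snapshot re-acquisition SLA (every N days/on ruleset change) in governance.
+
+**done:** the dual CI gate (synthetic regression + frozen parity enforce) is green and blocks regressions. Snapshot age is exposed.
+**This is the point of measure-first v1 done**: vuln/SAST has reached a measurable GHAS parity SLO.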
+
+### H4. Wire durable disposition for vuln verdicts (HUMAN-PR gate, storage projection)
+
+> ⚠️ **Reclassified from the autonomous layer to the H-track by review blocker AP-02/ARCH-VULN-01.** Vuln findings are not loaded into the durable
+> nosql store (zero references to `VulnerabilityFinding` in store.py), so creating a durable disposition requires
+> a **new projection** of vuln finding state into the durable store →
+> `storage-projection-or-schema-migration-required` stop-condition. **Cannot proceed autonomously; human PR.**
+
+Tasks:
+- A new path that projects vuln finding state into the durable store (a new entity type or a FINDING_STATE
+  projection); see §4.3. Neither disposition option works until this is in place.
+- The vuln verify path (`run_verify_vulnerability_artifact`) records the verdict to the durable disposition
+  (actor='ollama'/source='verifier', STATE_EVENT audit). **Only terminal verdicts (TP/FP) are applied; NEEDS_REVIEW is
+  not recorded durably** (review nit ARCH-VULN-05; same behavior as the secret `disposition_status_for_verdict`).
+- **finding_id stability (review I, `finding-id-not-stable`)**: on the fallback path (when the fingerprint is absent, `compute_vulnerability_finding_id`
+  hashes `{rule_id,file_path,line_start,message}`), expose through tests that rule_id normalization, line drift, or a
+  message change loses the disposition → explicitly decide between defining a separate stable key and requiring stable
+  partialFingerprints in the Semgrep-compat output (§4.3).
+
+**done:** after vuln verify, the terminal verdict can be read back from the durable store and a STATE_EVENT audit record exists. Re-scanning the same
+finding preserves its disposition (including stable-key verification). (The vuln parallel of resolving the secret `ollama-verify-periodic-todo`.)
+
+## 3. Milestone dependencies / sequencing
+
+```
+Autonomous layer (autopilot single goal):
+  M1 (model, matcher, adversarial fixtures) ──┐
+  M2 (inline + synthetic) ────────────────────┼──> M3 (synthetic gate enforce + report-only parity wiring) = autonomous goal done
+                                                             │
+H-track (stop-condition PRs, outside the autonomous loop):   │ PR merge
+  H1 (real snapshot, human) ──> H2 (baseline+divergence) ──> H3 (parity enforce) = measure-first v1 done
+  H4 (durable disposition, storage projection, human) ── (independent, does not need H1)
+```
+
+- **Can go first autonomously (no human needed):** M1, M2, M3, fully covered by frozen fixtures/synthetic data. The autopilot proceeds in parallel/sequentially
+  on its own and merges the PR at M3.
+- **Human-PR gates:** H1 (real fetch), H2/H3 (baseline, enforce; depend on the H1 snapshot), **H4 (durable
+  disposition, storage projection; independent, does not need H1)**.
+- Recommended autopilot order: M1 → M2 → M3 (autonomous, PR merge) → [stop; note pending H-track in CURRENT.md].
+  H1~H4 are user-driven follow-ups.
+
+## 4. Interfaces / data contracts (design decisions + open items)
+
+### 4.1 code-scanning alert fetch (VFR2/H1)
+
+```
+GET /repos/{owner}/{repo}/code-scanning/alerts?tool_name=CodeQL&state={open|dismissed|fixed}&ref={default}
+→ CodeScanAlertRecord{
+    rule_id, security_severity_level∈{low,medium,high,critical},
+    cwe_ids(from rule.tags external/cwe/cwe-NNN),
+    state∈{open,dismissed,fixed}, dismissed_reason∈{false positive,won't fix,used in tests,null},
+    location{path,start_line,end_line}  # most_recent_instance.location
+}
+```
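+- Reuses the GhApiRunner GET-only contract. dismissed_comment and the raw message are not fetched (public safety, V-Q4 NFR).
+- **Redaction (review minor `vuln-snapshot-path-redaction`)**: location.path is a private relative path, so
+  public_safety cannot catch it (`identifier.private-path` matches absolute paths only) → keeping real snapshots under a gitignored
+  private path and never committing them is the **first line of defense**. If a commit is unavoidable, fingerprint the path (the line-window keeps plaintext
+  line numbers). The 'public_safety pass' in M4/H1 done is a necessary condition only, not a sufficient one.
+- **Fixed-alert staleness (review H)**: the `most_recent_instance.location` of a fixed alert may carry stale line numbers
+  → treat this as a verification item at H1 fetch time.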
+
+### 4.2 Matching key + denominator formula (VFR1/V-Q2): 3 tiers (incorporates review G, H, VD-03, E)
+
+```
+CodeScanComparisonKey = (repository, file_path, line_window, normalized_rule_class)
+  normalized_rule_class precedence: CWE intersection (by-cwe) > rule-token normalization (by-rule-token) > unmatched
+  line_window: |alert_line − finding_line| ≤ N  (a true window; no start_line//N quantization, review G)
+              N is fixed in M1 via fixtures (consistent with ±N in requirements)
+```
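+- **Many-to-many CWE handling (review G)**: when CWEs within the same window are many-to-many, use **1:1 greedy maximal matching** (or CWE-hierarchy agreement).
+  Even when different vulnerabilities (cwe-89 SQLi, cwe-79 XSS) coexist in the same (file, window), each is assigned 1:1 correctly.
+- **rule-token fallback predicate (review `rule-token-fallback`)**: after removing stop-tokens (audit/lang/py/python, etc.),
+  match **only on exact set equality of the core vulnerability-class tokens** (no partial overlap, which prevents path-traversal↔open-redirect
+  mismatches). Expanding the CWE bridge mapping table is an M1 task, to minimize reliance on by-rule-token.
+- **Semantics shared with the synthetic gate (review VD-03/E)**: the synthetic regression gate (`VulnerabilityEvaluationKey`) and the parity
+  matcher adopt **the same rule-class normalization + line-window EvaluationKey semantics** (the recommended option, for consistency).
+  The current `VulnerabilityEvaluationKey` is a (naive) exact match on `(file_path, line_start, rule_id)`, so in M2
+  rule-class normalization is applied **at load time** to the expected-findings schema (`ruleId`), and line_start is widened to a line-window.
+  That both gates share the same EvaluationKey semantics is stated as a **VFR8 alignment condition**.
+- **Denominator formula (review H, `dismissed-reason-snapshot`)**: `recall denominator = open+fixed CodeQL alerts` (the matching
+  population) only. `precision penalty = count of our findings raised at dismissed (false positive/used in tests) locations`,
+  **accumulated separately**. `won't fix` is tallied separately as TP-non-blocking (included in recall, excluded from the precision penalty). The per-repo density bias of the dismissed
+  population is a §7 risk.
+- **Metadata exposure (review G, codeql-only-oracle-language-bias)**: expose the CWE-missing rate of our findings, the by-rule-token rescue
+  ratio, and the language mix of the parity-measured population as first-class baseline (H2) metadata. Weak matches rescued by by-rule-token are
+  tallied separately as a confidence tier; a high ratio raises a baseline-confidence warning.
+- **coverage ≠ precision/recall (review D, precision-recall-primitive)**: `GhasComparisonResult.ghas_coverage`
+  (coverage in the secret-track sense) and `VulnerabilityEvaluationResult.precision` (TP/(TP+FP)) are **different metrics**.
+  State explicitly that the vuln parity matcher produces the latter (truth-labeled precision/recall).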
+
+### 4.3 Disposition channel (VFR5/H4): storage projection required (review blocker AP-02/ARCH-VULN-01)
+
+> **Code premise confirmed**: `set_finding_disposition` (store.py:998-1000) raises `ValueError("finding state does not exist")`
+> when `read_finding_state(finding_id)` is None. A FINDING_STATE row is created only when a secret `Finding` is appended to the durable
+> store. `VulnerabilityFinding` exists only in the throwaway `VulnerabilityJsonlStore`,
+> with zero loads into the nosql store. **Both options work only with a new projection of vuln finding state into the durable store,
+> and that is the `storage-projection-or-schema-migration-required` stop-condition → H4 human-gated.**
+
+- **Option 1:** reuse the secret `set_finding_disposition`. **Precondition**: vuln finding state must be projected into the durable store
+  (a new entity type or a FINDING_STATE projection) → storage projection.
+- **Option 2:** a vuln-specific `set_vuln_finding_disposition` + a vuln-specific store partition. **This too is a new STATE_EVENT
+  projection** → storage projection. (Not a footnote: both options hit the stop-condition.)
+- **§4.3 decision rule**: either option is a new durable projection, hence H4 (human-gated). The vuln verifier in the v1 autonomous layer
+  **keeps the existing throwaway JSONL behavior** (not durable). If "first attempt = Option 1" from §4.3 were pursued
+  autonomously, the autopilot would repeat the same ValueError until same-blocker-three-times, so the design pins Option 1 = storage
+  projection and blocks the attempt in the autonomous layer altogether.
+- **finding_id stability (review I)**: the durable disposition key finding_id is, on the fallback path, sensitive to normalization, drift, and message
+  changes (confirmed in model.py). H4 explicitly decides between defining a stable key and requiring stable partialFingerprints. (Since durable
+  disposition is deferred to H4, this item is also an H4 prerequisite.)
+
+### 4.4 CI parity gate (VFR8/M3): automatic report-only/enforce branching (review VD-04/AP-08)
+
+- `governance/vuln_parity_slo.py --check`: threshold yml **absent/empty → report-only** (measure and report only),
+  **present → enforce** (once thresholds are committed after H3). Snapshot age > threshold → `stale-degraded` (not `pass`;
+  no silent staleness). In autonomous-layer M3, synthetic fixtures prove that the report-only path reproduces deterministically.
+
+## 5. Alignment with the secret track / deliberate divergences (reconfirmed)
+
+| Aspect | Secret | vuln (this track) | Basis for divergence |
+| --- | --- | --- | --- |
+| oracle endpoint | secret-scanning/alerts | code-scanning/alerts (tool_name=CodeQL) | separate API, multi-tool (V-Q5) |
+| match key | secret_type 4-tuple, 1:1 | rule-class (CWE) + line-window, 3 tiers | rule.id differs per tool (V-Q2) |
+| precision/recall engine | reuses core/evaluation/metrics.py | **reuses core/vulnerability/evaluation.py** (zero new computation) | a vuln-specific engine already exists (review D) |
+| synthetic evaluate vs parity matcher | (n/a) | **shared computation core** (blocks a fourth engine) | consistent synthetic/parity semantics (review D/E) |
+| FP-oracle | resolution (coarse classification) | dismissed_reason (first-class SAST-FP) | dismissed_reason is richer (V-Q4) |
+| validity check | deferred (evidence-gated) | **no analog** (reachability = detector's responsibility) | structural difference of SAST (V-Q3) |
+| disposition wiring | **already durable** + periodic (autonomous M3) | **new storage projection → H-track (H4)** | vuln finding state not loaded (review AP-02) |
+| autonomous scope | M0~M5 (includes disposition) | **M1~M3 (excludes disposition, narrower)** | durable requires storage projection |
+| CI gate file | governance/parity_slo.py | **governance/vuln_parity_slo.py (separate)** | avoids file conflicts between the two tracks (review governance-wildcard) |
+| primary dataset | real GHAS snapshot | synthetic (primary) + real snapshot (calibration) | SAST synthetic ground truth is complete (V-Q7) |
+| periodic scan wiring | done | deferred behind a cost gate | Semgrep is heavy; #2 500+ repos |
+
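+## 6. YAGNI / not adopted (speculative future features excluded)
+
+- A post-hoc reachability/taint recomputation engine (detector's job, V-Q3).
+- A live SAST validity analog (no counterpart exists, V-Q3).
+- Multi-tool oracle integration (pinned to CodeQL, V-Q5).
+- Automatic periodic scan-all vuln wiring (isolated behind a cost gate, V-Q6).
+- A quality SLO for import-sarif/codeql.yml themselves (conduit and self-scan, V-Q1).
+- Push protection / inline PR blocking (out of scope for the parent track).
+- A new third or fourth precision/recall engine (reuse the existing evaluation.py, review D).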
+
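+## 7. Risks / mitigations
+
+- **Weak rule-class matching → mis-measured parity:** the 3-grade tally (by-cwe/by-rule-token/unmatched) prevents silent
+  misclassification. A high unmatched ratio or by-rule-token rescue rate triggers a baseline reliability warning (review G).
+- **CWE-gap asymmetry:** our Semgrep-compat findings often have empty cwe_ids, which demotes them to by-rule-token.
+  Expose the CWE-gap rate as first-class baseline metadata (review G).
+- **Per-repo density bias in the dismissed FP-oracle (review H):** dismiss history differs per repo, so the FP-oracle pool size is
+  uneven and the per-repo precision-penalty denominator is unstable. Stated in the baseline.
+- **CodeQL language coverage ≠ Semgrep-compat:** repos/languages without an oracle are excluded from the per-repo SLO (C-monitor only).
+  Baseline metadata and goal done carry the scope qualifier "measured population = CodeQL-supported languages" (review language-bias).
+- **Synthetic self-reference (our rules catch only what we planted):** adversarial fixtures (M1/M2 done) plus real GHAS snapshot
+  calibration (H2 divergence report) provide external validation (review F).
+- **VulnerabilityFinding is a separate model → friction in disposition wiring:** durable disposition is split out to H4 and the storage
+  projection stop-condition is stated explicitly (review AP-02).
+- **finding_id sensitive to normalization (review I):** in its fallback form the durable disposition key is sensitive to
+  rule_id/line/message → H4 explicitly decides between a stable key and stable partialFingerprints.
+- **snapshot staleness:** passive exposure (scan-health precedent) + `stale-degraded` (silent staleness forbidden, NFR).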
+
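+## 8. Definition of done
+
+### Autonomous goal done (M3, PR merge)
+1. Parity matcher (rule-class + line-window, 3 grades, CWE many-to-many handling) + adversarial fixtures green. Zero new lines of
+   precision/recall computation code (evaluation.py reused).
+2. Cheap inline tier (deterministic default-on + gated new boundaries); existing scan-vuln default output unchanged.
+3. Synthetic regression gate (recall≥0.99/precision≥0.90) **enforced** + `vuln_parity_slo` report-only gate wired so that
+   it reproduces deterministically on synthetic fixtures.
+4. Zero network on every measurement and suppression path; consistent with public-safety redaction.
+5. On PR merge, CURRENT.md states "parity SLO enforce 미달성, H-track 대기" (parity SLO enforce not yet achieved, awaiting H-track).
+
+### measure-first v1 done (after H1~H4 complete)
+6. Acquire a real code-scanning (CodeQL) frozen snapshot (H1, human-PR).
+7. Measure the parity baseline + fix the real SLO + fixture-vs-real divergence and language-share metadata (H2, measure-first).
+8. Switch parity to enforce: dual CI gate (synthetic regression + frozen parity enforce) green, regressions blocked (H3).
+9. Wire durable disposition for vuln verdicts (H4, storage projection) so that disposition survives re-detection.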
diff --git a/docs/workbench/specs/ghas-quality-vuln-subtrack/requirements.md b/docs/workbench/specs/ghas-quality-vuln-subtrack/requirements.md
new file mode 100644
index 0000000..6a7fbab
--- /dev/null
+++ b/docs/workbench/specs/ghas-quality-vuln-subtrack/requirements.md
@@ -0,0 +1,287 @@
+# GHAS-grade detection quality track: VULN/SAST subtrack requirements
+
+> Phase 1 (grill-to-spec) **complete, awaiting approval**. SoT: this file (`requirements.md`).
+> Parent track: `.claude/specs/20260620-ghas-quality-track/` (secret subtrack, locked).
+> Written 2026-06-21. Authoring mode: self-driven grill (self-posed Q&A, research first).
+
+## Approval target
+
+- Source of truth: `requirements.md`
+- Preview companion: `requirements.html` (generated, for review; does not replace the source)
+
+## One-line goal
+
+Bring security-scanner's **vuln/SAST detection quality** (precision/recall) up to a measurable GHAS code-scanning
+parity SLO. Reuse the **shared substrate** of the secret subtrack (parity measurement frame, snapshot = ground truth,
+disposition hook, measure-first, governance gate) while deliberately diverging on the **alert semantics specific to
+CodeQL/SAST** (rule-based, security_severity, multi-tool rule.id, dismissed_reason as an FP signal).
+
+## Decisions reused as-is from the secret subtrack (transfer principles)
+
+| Transferred decision | Source | Application to vuln |
+| --- | --- | --- |
+| GHAS parity SLO (alerts as oracle) | Secret Q3 | code-scanning alerts are the oracle (instead of secret-scanning alerts) |
+| snapshot = ground truth (one gated fetch → frozen → repeated in CI) | Secret Q4 | Same, but the code-scanning snapshot is acquired separately |
+| per-repo 1:1 (no pooling) | Secret Q5 | Same |
+| non-GHAS B-floor + C-monitor | Secret Q6 | Same (Semgrep-compat also applies to GHAS-off repos) |
+| measure-first SLO done | Secret Q10 | Same |
+| real GHAS fetch behind the human-PR gate | Secret FR2 | Reuses the same stop-condition |
+| tiered automatic quality machine | Secret Q9 | Partial reuse (see V-Q3 and V-Q6; the inline tier content diverges) |
+
+## Deliberately divergent decisions (vuln/SAST specific): summary
+
+| # | Decision | Difference from secret + rationale |
+| --- | --- | --- |
+| V-Q1 | **Scope = quality of scan-vuln (Semgrep-compat)** | import-sarif is a conduit and codeql.yml is a self-scan (out of scope). The oracle is the GHAS **code-scanning** alert |
+| V-Q2 | **Match key = rule.id needs normalization** | secret_type is an issuer-wide standard, so it matches 1:1. SAST rule.id **differs per tool** → bridge CodeQL↔Semgrep rules via CWE |
+| V-Q3 | **FP suppression = rule/severity gating + LLM verifier**, validity-check analog: **none** | Nothing corresponds to secret validity (live verification against the issuer API); in SAST that role belongs to dataflow reachability. v1 is no-network |
+| V-Q4 | **dismissed_reason as a first-class FP truth signal** | code-scanning dismissed_reason ("false positive"/"used in tests"/"won't fix") maps to SAST FPs more directly than secret resolution |
+| V-Q5 | **Oracle tool pinned = CodeQL** (among multiple tools) | code-scanning is multi-tool. CodeQL parity is the standard meaning of "GHAS-grade". Semgrep-on-GHAS and the like pollute the comparison universe |
+| V-Q6 | **Durable disposition wiring is new work** | secret verify is already wired to set_finding_disposition. **vuln verify is throwaway JSONL**; the missing durable store/ledger wiring is the core gap |
+| V-Q7 | **Dataset = two layers, synthetic + real GHAS** (more synthetic weight than secret) | SAST has a mature synthetic vulnerable-code corpus (`eval/synthetic-code-vuln` already exists). Synthetic is decisive for recall measurement |
+
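+## Self-Q&A flow (provenance)
+
+### V-Q1. Primary scope: whose quality, scan-vuln vs import-sarif vs codeql.yml?
+
+**Question:** the vuln subsystem has three surfaces: (a) `scan-vuln` (local Semgrep-compat SAST run),
+(b) `import-sarif` (conduit that normalizes external SARIF), and (c) `.github/workflows/codeql.yml` (scans this repo
+itself with CodeQL). Whose precision/recall are we raising?
+
+**Answer: the detection quality of scan-vuln (Semgrep-compat) is the primary target.** Rationale:
+- `import-sarif` is only a **conversion conduit**, not a detector (it accepts and normalizes any SARIF). It is not the
+  subject of quality but an **input path** for quality measurement → keep it as an FR, but not as a quality-improvement target.
+- `codeql.yml` is a **self-scan of our own repo** (supply-chain hygiene). It is unrelated to our *product's* detection quality →
+  out of scope. (Ironically, the SARIF that codeql.yml produces can supply **real SAST samples** beyond the synthetic ones,
+  so it is mentioned only as a dataset supplement; see V-Q7.)
+- The detector that runs on user repos is `scan-vuln`. Our output that compares 1:1 with GHAS code-scanning is likewise
+  the scan-vuln result (VULN_FINDING) → **the subject of the quality SLO = scan-vuln**.
+
+Implication: the oracle is the GHAS **code-scanning** alert (not secret-scanning). A new fetch path is needed (V-Q5, FR2).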
+
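+### V-Q2. The vuln version of GHAS parity: are alerts the oracle, and how is a match defined?
+
+**Question:** the secret track matches GHAS alert↔finding 1:1 on the 4-tuple `(repository, file_path, line_start, secret_type)`
+(`GhasAlertComparisonKey`). secret_type is a GitHub standard taxonomy (e.g. `github_pat`), so an issuer-wide
+1:1 match holds. SAST does not work that way: the CodeQL rule.id (`py/sql-injection`) and the Semgrep rule.id
+(`python.lang.security.audit.sql-injection`) are **the same vulnerability with different strings**. Naive matching on
+file+line+rule misclassifies one SQLi as both local-only and ghas-only, which makes the parity measurement meaningless.
+
+**Answer: use code-scanning alerts as the oracle, with a rule.id normalization layer in the match key.** Decisions:
+- **Comparison key = `(repository, file_path, line_window, normalized_rule_class)`.**
+  - `normalized_rule_class`: bridge rule.id → **CWE** (when possible). CodeQL alerts carry the CWE in `rule.tags` as
+    `external/cwe/cwe-89`, and our SARIF importer already extracts `cwe_ids`
+    (`sarif.py:_extract_cwe_ids`). So **CWE-intersection matching is the most robust common axis**.
+  - Fallback when the CWE is absent: normalize rule.id string tokens (lowercase, unified separators, tool prefix stripped), then partial match.
+    This is a weak signal, so "matched-by-cwe / matched-by-rule-token / unmatched" are **tallied separately** (a quality
+    grade settled at design time).
+  - `line_window`: not an exact 1:1 line but a **±N line window** (N fixed at design time). Rationale: for the same vulnerability
+    CodeQL may report the sink line while Semgrep reports the source line, so exact line equality is over-strict.
+- **Implication (secret Q3 transfer kept):** a finding we detect that GHAS (CodeQL) misses is an FP by definition ("GHAS-grade" is
+  the goal; "higher recall than GHAS" is a non-goal). However, weak rule-class matching can push cases into "unmeasurable", so the
+  3-grade tally of V-Q2 guards against silent FP misclassification.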
+
+### V-Q3. FP suppression mechanism: is there a SAST analog of the secret validity check?
+
+**Question:** the secret track's main FP-suppression candidates are (a) LLM verifier→disposition, (b) path/placeholder/
+context-class heuristics, (c) partner-pattern boost, and (d) live validity check (deferred). What corresponds in SAST?
+The validity check (asking the issuer whether a token is valid) has no direct SAST counterpart: "is this SQLi actually
+reachable" is not a network query but a **dataflow reachability** problem.
+
+**Answer: SAST FP suppression = (1) rule/severity/precision gating + (2) LLM vuln verifier→disposition +
+(3) use of reachability/trace signals. The validity-check analog is explicitly recorded as "none".** Decisions:
+- **Cheap inline tier (free, every scan):**
+  - `precision`/`security_severity`/`severity` gating: `gate.py` already has severity_min/precision_min
+    ranking. Apply the SARIF `precision` metadata (VERY_HIGH..LOW) and `code_flow_count` (whether a dataflow trace
+    exists) inline as FP-suppression signals. **A finding with a trace has reachability evidence and is more trustworthy**.
+  - Rule suppression (allowlist/severity floor): low-confidence rule.ids and INFO/LOW are non-blocking by default (the gate already does this).
+- **Async LLM tier:** the vuln verifier (`llm/vulnerability/verifier.py`) gives a verdict on ambiguous findings →
+  **reflected as a durable disposition** (V-Q6; this is the new work). Unlike the secret verifier, the vuln verifier currently
+  writes only to throwaway JSONL.
+- **Validity-check analog:** explicitly "none". SAST reachability (taint/data-flow) is the responsibility of the **detector
+  internals** (Semgrep dataflow / CodeQL taint), not of a post-hoc verifier. v1 is no-network
+  (secret Q7 transfer). Recomputing reachability ourselves after the fact is YAGNI/scope creep → not adopted.
+
+### V-Q4. How do we use GHAS alert state/dismissal as truth?
+
+**Question:** a code-scanning alert has state ∈ {open, dismissed, fixed} and dismissed_reason ∈
+{"false positive", "won't fix", "used in tests"} (confirmed by research). This is the vuln version of the secret track's
+open item "GHAS alert state handling (open/resolved/dismissed)". What counts as TP truth and what as FP
+truth?
+
+**Answer: open/fixed = TP-truth; dismissed ("false positive"/"used in tests") = explicit FP-truth.**
+Decisions (using a stronger signal than the secret track):
+- **TP oracle:** state ∈ {open, fixed}. What GHAS acknowledged and tracked as a real vulnerability.
+- **FP oracle (deliberately diverges from secret):** dismissed_reason ∈ {"false positive", "used in tests"} is a
+  **ground-truth FP labeled by GitHub**. If we flag the same location, we too raised an FP →
+  **scored directly** as a precision penalty. dismissed_reason "won't fix" is a TP but non-blocking (accepted risk), so it is
+  included in recall scoring and excluded from the precision penalty (tallied separately as an ambiguous class).
+  - Rationale: secret-scanning resolution has coarse categories (revoked/false_positive/...), whereas
+    code-scanning dismissed_reason carries **first-class SAST-FP meaning**, so the oracle signal is richer. Not using it
+    throws away the most valuable label GHAS has.
+- **Implication:** the snapshot must **preserve alert state + dismissed_reason in redacted form** (FR2 extension). However,
+  dismissed_comment (free text, risk of leaking paths/code) is **not collected** (public safety).
+
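+### V-Q5. Do we pin the oracle tool in multi-tool code-scanning?
+
+**Question:** code-scanning accepts alerts not only from CodeQL but from any uploaded SARIF tool (Semgrep, external SAST).
+There is a `tool_name` filter. Which tools' alerts go into the parity oracle universe? Our scan-vuln is itself
+Semgrep-compat, so if GHAS already holds Semgrep alerts, "us vs GHAS-Semgrep" is close to a tautology, while
+"us vs CodeQL" is a meaningful parity.
+
+**Answer: pin the oracle to CodeQL alerts (`tool_name=CodeQL` filter).** Rationale:
+- The market-standard meaning of "GHAS-grade" is **CodeQL** (GitHub's first-party taint engine). The parity target has to be
+  CodeQL for "do we catch as much as GHAS" to mean anything.
+- Putting every tool into the oracle makes the comparison universe depend on each repo's incidental GHAS setup, so per-repo truth
+  becomes unstable (against the spirit of secret Q5, per-repo 1:1).
+- **Implication (diverges from secret):** secret-scanning is a single engine, so tool pinning was never an issue. For vuln the oracle
+  tool must be pinned explicitly for the measurement to reproduce. This is nailed down in the FR/NFR.
+- Secondary: CodeQL supports a limited set of languages (Python, JS, and so on). When our Semgrep-compat scans a language CodeQL does not support,
+  there is no oracle → that repo/language is **outside the per-repo SLO** (C-monitor only, secret Q6 transfer).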
+
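+### V-Q6. When does the FP-suppression quality machine run + durable disposition wiring
+
+**Question:** secret Q9 is "tiered automatic", and secret verify is already wired to `set_finding_disposition` (durable
+store + STATE_EVENT ledger), so periodic scans benefit too (resolves [[ollama-verify-periodic-todo]]).
+What about vuln? Code inspection shows: **vuln verify (`run_verify_vulnerability_artifact`) writes its verdict only to
+throwaway JSONL and never to the durable store/ledger.** `triage_state` is just a field inside the JSONL. In addition,
+vuln itself is not wired into `scan_all` (periodic scan), which covers secrets only.
+
+**Answer: tiered automatic (secret Q9 transfer) + durable vuln disposition wiring as the new core work.** Decisions:
+- The cheap inline tier (severity/precision/trace gating, rule suppression) applies to every scan-vuln immediately (free).
+- The LLM vuln verifier is automatic but batched and limited to ambiguous cases. Its result is **reflected as a durable disposition**; new wiring:
+  - Record the verdict in the durable disposition store keyed by the vuln finding's `finding_id` (already a deterministic id that
+    prefers partialFingerprints), through the same channel as the secret `set_finding_disposition` or a parallel vuln-only channel
+    (decided at design time, while keeping STATE_EVENT auditing and actor='ollama'/source='verifier' consistent).
+  - **Caveat (separate model):** a vuln finding is not a `core.finding.Finding` but a `VulnerabilityFinding` (a separate
+    SARIF-native model). The secret disposition hook (`Verdict`/`Disposition` mapping) may not be usable as-is
+    → the design picks one of (a) unifying the vuln triage_state↔Verdict vocabulary or (b) a parallel vuln-only disposition track.
+    The vocabulary already matches on both sides (TRUE_POSITIVE/FALSE_POSITIVE/NEEDS_REVIEW) → unification is the likely choice.
+- Periodic-path benefit: wiring vuln scan+verify into `scan_all` may conflict with the **#2 cost constraint** (500+ repos ×
+  Semgrep is heavy) → v1 goes as far as **on-demand scan-vuln + verify with automatic disposition**; periodic scan-all wiring gets
+  its own gate after cost measurement (isolated in a design milestone).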
+
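+### V-Q7. Measurement dataset: synthetic vs real GHAS code-scanning repos?
+
+**Question:** for secrets, real GHAS-enabled repo snapshots were the main oracle and synthetic data was auxiliary. And for vuln?
+Between the synthetic vulnerable-code corpus (`eval/synthetic-code-vuln` already exists, with an expected-findings schema) and
+real code-scanning repo snapshots, which is primary?
+
+**Answer: two layers: (1) the synthetic corpus is primary for the recall/regression gate, (2) the real GHAS code-scanning snapshot is
+parity calibration.** Rationale (more synthetic weight than secret):
+- **Why synthetic is stronger (SAST specific):** vulnerable/safe code pairs can be planted on purpose, so the **ground truth is
+  perfect** (we know which line is the real SQLi). Real credentials cannot be put into synthetic secret data (push
+  protection), but SAST vulnerable patterns can be synthesized safely → a decisive tool for measuring recall. Already,
+  `evaluate` (precision_min=0.90/recall_min=0.99 gate) runs against the synthetic corpus.
+- **Role of the real GHAS snapshot:** synthetic only answers "do our rules catch what we planted" (self-reference risk). **Real CodeQL
+  alert parity** is the external validation of "do we catch as much as GHAS on real code" → calibration/validation (same role as the
+  GHAS repos in secret Q5). Real fetches stay behind the human-PR gate.
+- **Dataset alignment:** the synthetic corpus expected-findings schema is `(filePath, lineStart, ruleId)`; V-Q2
+  rule-class normalization must also apply to synthetic data for real/synthetic scoring to be consistent (design time).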
+
+### V-Q8. Relation to existing assets: build on main
+
+**Question (secret Q8 transfer):** how far along are the vuln assets?
+
+**Answer: build on top of main.** Existing assets (confirmed in code):
+- `core/vulnerability/`: `model.py` (VulnerabilityFinding, with triage_state/verifier_verdict hooks),
+  `sarif.py` (SARIF importer; extracts CWE/OWASP/precision/security_severity/code_flow), `evaluation.py`
+  (precision/recall + gate), `gate.py` (severity/precision gating), `redaction.py` (public safety, merged in PR #48,
+  [[vuln-redaction-design]]).
+- `scanners/semgrep_compatible/` (runner), `runtime/vulnerability_scan.py` (scan-vuln/import-sarif),
+  `runtime/vulnerability_verify_artifact.py` (verify, **but throwaway**), `llm/vulnerability/`
+  (verifier+prompt, redacted-metadata-only).
+- CLI: `verify --category code-vuln` (wired), `report/gate/evaluate --category code-vuln` (wired),
+  `import-sarif`/`scan-vuln` (wired). compare-ghas is **secret-only** (no vuln support).
+- Corpus: `eval/synthetic-code-vuln/` (schema present, 1 sample).
+- **Not yet available (new work):** code-scanning alert fetch, vuln parity comparison (rule-class matching),
+  durable vuln disposition wiring, vuln snapshot harness.
+
+### V-Q9. SLO done-definition
+
+**Question (secret Q10 transfer):** does measure-first apply the same way?
+
+**Answer: measure-first.** Measure the baseline (current scan-vuln precision/recall gap against CodeQL) → fix a realistic
+target → close the gap. vuln, however, has a **dual SLO**:
+- **Synthetic SLO (already exists, strengthened):** recall ≥ 0.99 (catch nearly every planted vulnerability), precision ≥ 0.90. Regression gate.
+- **Real GHAS parity SLO (new, measure-first):** per-repo precision/recall agreement target against CodeQL alerts
+  (fixed after the baseline). Recall is "Y% of CodeQL"; precision is "do not flag dismissed-FP locations".
+- v1 done = (a) code-scanning snapshot acquisition path + (b) parity baseline measurement + (c) target setting +
+  (d) gap closed with the inline+LLM tiers and the synthetic regression gate green.
+
+## Functional requirements (vuln/SAST subtrack)
+
+- **VFR1 code-scanning parity measurement harness.** Per GHAS-enabled repo, compare the **code-scanning** alert
+  snapshot (oracle=CodeQL) 1:1 with our scan-vuln results (VULN_FINDING) using V-Q2 rule-class matching, compute
+  per-repo precision/recall, then aggregate. Tally match grades (by-cwe / by-rule-token / unmatched) separately.
+- **VFR2 code-scanning snapshot acquisition.** Add a code-scanning alert fetch to `baseline/ghas_api`
+  (`/repos/.../code-scanning/alerts`, GET-only, `tool_name=CodeQL`). Keep only redacted alert fields:
+  number, rule.id, rule.security_severity_level, rule.tags (→ CWE only), state, dismissed_reason,
+  most_recent_instance.location (path/start_line/end_line). dismissed_comment and raw message are not collected.
+  Real fetches obey the `ghas-live-fetch-or-mutation-required` human-PR gate.
+- **VFR3 baseline measurement (measure-first).** Measure the precision/recall gap of current scan-vuln against the
+  frozen CodeQL snapshot → fix the real parity SLO targets.
+- **VFR4 tiered quality machine.**
+  - Cheap inline tier: severity/precision/`code_flow_count` (trace = reachability evidence) gating + low-confidence
+    rule suppression → immediate FP suppression on every scan-vuln. (No validity-check analog; see V-Q3.)
+  - Async LLM tier: the vuln verifier gives a verdict on ambiguous findings → reflected as durable disposition (VFR5).
+- **VFR5 durable vuln disposition wiring (new core gap).** Route the vuln verifier verdict not to throwaway JSONL
+  but to the durable disposition store + audit ledger (STATE_EVENT, recording actor/source).
+  Keyed by the (deterministic) `finding_id` so that disposition persists on re-detection. Either unify the vuln triage_state↔Verdict
+  vocabulary or use a parallel vuln-only channel (design decision).
+- **VFR6 dismissed_reason FP scoring.** Use the snapshot's dismissed_reason ("false positive"/"used in tests") directly as the
+  FP-oracle in scoring (flagging that location costs a precision penalty). "won't fix" is tallied separately as the
+  TP-non-blocking class.
+- **VFR7 non-GHAS transfer + drift monitor.** Apply the distilled quality machine (gating + verifier disposition) to scan-vuln on
+  all repos. Non-GHAS and CodeQL-unsupported-language repos get a sampled vuln verifier drift monitor (not an SLO).
+- **VFR8 parity SLO CI gate.** Turn reproducible measurement against the frozen code-scanning snapshot into a CI gate (no
+  human-PR fetch needed at measurement time). Block when the target fixed after the baseline regresses. **The synthetic regression gate (`evaluate`,
+  recall≥0.99/precision≥0.90) is kept separately**; both must be green to pass.
+
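+## Non-functional requirements
+
+| Item | Requirement |
+| --- | --- |
+| Offline-box compatibility | No network/secret egress on measurement and suppression paths (the snapshot fetch is a single gated exception). With no validity check, the egress surface is smaller than in the secret track |
+| Reproducibility | Deterministic CI measurement from the frozen code-scanning snapshot + synthetic corpus |
+| Cost | LLM tier limited to batched, ambiguous cases. Semgrep-compat scan-vuln is itself heavy, so periodic scan-all wiring gets a separate gate after cost measurement (#2 500+ repo constraint) |
+| Staleness visibility | Snapshot age/timestamp exposed in output (scan-health precedent); silent staleness forbidden |
+| Public safety | Snapshot and findings redacted (consistent with [[vuln-redaction-design]]). code-scanning raw message/dismissed_comment risk leaking paths and code → not collected, or passed through `sanitize_vulnerability_text` |
+| governance | Real GHAS code-scanning fetch stays behind the human-PR gate. Fetch is GET-only (reuses the GhApiRunner contract) |
+| Oracle reproducibility | Parity oracle tool pinned to CodeQL (`tool_name`) to prevent multi-tool universe pollution (V-Q5) |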
+
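+## User scenarios
+
+- **VS1 baseline.** An operator measures the baseline on a GHAS code-scanning-enabled repo → sees "how current scan-vuln
+  precision/recall compares with CodeQL" → sets the parity target measure-first.
+- **VS2 regression gate.** After a Semgrep rule/normalization change, CI re-measures both (a) synthetic corpus recall/precision and (b) frozen
+  code-scanning snapshot parity → a regression on either side blocks the PR.
+- **VS3 FP-suppression transfer.** When periodic/on-demand scan-vuln runs on a non-GHAS repo, inline gating + the LLM verifier
+  suppress FPs automatically, the verdict flows into durable disposition so re-detection does not re-ask, and the sampled drift monitor
+  reports on transfer health.
+- **VS4 dismissed alignment.** When we flag the same location as an alert GHAS dismissed as "used in tests", parity
+  scoring catches it as a precision penalty and surfaces it as a rule-suppression candidate.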
+
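+## Out of scope / deferred
+
+- **Quality of import-sarif itself**: a conversion conduit, not a subject of detection quality (V-Q1). Kept as an FR but not a quality SLO target.
+- **Quality of the codeql.yml self-scan**: supply-chain hygiene of our own repo, unrelated to product detection quality (V-Q1). Its SARIF is
+  mentioned only as a dataset supplement (V-Q7).
+- **Post-hoc reachability/taint recomputation**: the detector's (Semgrep/CodeQL) job. Recomputing it ourselves is YAGNI (V-Q3).
+- **Validity-check analog**: no counterpart in SAST; explicitly not adopted (V-Q3).
+- **Wiring vuln into periodic scan-all**: isolated behind a cost gate. v1 stops at on-demand scan-vuln+verify (V-Q6).
+- **Multi-tool oracle**: pinned to CodeQL; other tools uploaded to GHAS are excluded from the comparison universe (V-Q5).
+- **Push protection / PR blocking**: out of scope, consistent with the parent track.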
+
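+## Open items (Phase 2 design open questions)
+
+- Rule-class normalization precision: scope of the CWE bridge mapping table (which CWEs first), token normalization rules for the CWE-absent fallback.
+- Line-match window N (exact line vs ±N) + handling of source/sink line mismatch.
+- Comparison universe: HEAD-only vs full-history (code-scanning centers on the default-branch HEAD ref by default, which
+  may differ from the secret track's full-history; the `ref` parameter needs aligning).
+- vuln disposition channel: reuse the secret `set_finding_disposition` (vocabulary unification) vs a parallel vuln-only store.
+  The fact that VulnerabilityFinding is a separate model is the variable.
+- Aggregation: per-repo micro vs macro averaging (align with secret).
+- CodeQL language coverage vs our Semgrep-compat languages: criteria for excluding oracle-less repos/languages from the SLO.
+- Snapshot refresh trigger/cadence (passive staleness exposure is settled; the refresh policy is a design matter).
+- Synthetic corpus expansion size (currently 1 sample): the minimum number of vulnerability classes that makes the recall SLO meaningful.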
diff --git a/docs/workbench/specs/ghas-quality-vuln-subtrack/review.md b/docs/workbench/specs/ghas-quality-vuln-subtrack/review.md
new file mode 100644
index 0000000..2189cce
--- /dev/null
+++ b/docs/workbench/specs/ghas-quality-vuln-subtrack/review.md
@@ -0,0 +1,116 @@
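+# GHAS-grade VULN/SAST quality design.md: multi-agent review + incorporation record
+
+> Target: `design.md` (v1) → `design.md` (v2) after incorporation. Review: 5 dimensions in parallel (opus) → adversarial verification (sonnet) → synthesis.
+> Workflow `wy7vx73el`, agent 43, subagent. **The synthesize session stayed alive, so the synthesis was produced normally.**
+> Confirmed findings: **31** (per-dimension review → only those that passed adversarial verification). overall (synthesis): **needs-rework → incorporated in v2.**
+
+## Synthesis verdict, quoted (synthesis.overall = needs-rework)
+
+> "Before the vuln design.md is handed to an autopilot single-goal run, a v2 revision is mandatory (needs-rework).
+> There are two key points. (1) The 'Autopilot Execution Shape' section of the secret design is missing entirely, so as things stand
+> a goal-setup attempt is blocked by autopilot_gate at the first commit (SoT on the gitignored .claude/specs path,
+> goal_id/active_goal mismatch, risk of broad governance/** self-modification; all demonstrated from code/governance files).
+> (2) M3 durable disposition wiring: vuln findings are never loaded into the durable store (zero VulnerabilityFinding
+> references in store.py; set_finding_disposition raises ValueError when FINDING_STATE is absent), so despite the 'autonomous'
+> label it runs straight into the storage-projection stop-condition. … The proven structure of the secret track (PR #58),
+> namely the two-layer split of autonomous M0~M5 (synthetic-only, merged without a slot) / human-gated H1~H3 (real GHAS), the metrics-engine
+> reuse invariant, adversarial fixtures, and parity_slo automatic report-only→enforce branching, transferred 1:1 to vuln
+> is the common fix for every blocker."
+
+What the synthesis pinpoints: **nearly all vuln blockers/majors are vuln parallels of items the secret track (PR #58) already caught and
+resolved in review**. The v2 skeleton is therefore a 1:1 transfer of the secret design v2 structure. There is exactly one vuln-specific
+aggravating factor: **for M3 durable disposition, unlike secrets, vuln finding state is absent from the durable store altogether**
+(secrets are already wired), so the autonomous scope becomes **narrower** than for secrets (durable disposition leaves the autonomous layer
+and is deferred to the H-track).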
+
+## Severity tally
+
+| Severity | Count | Notes |
+| --- | --- | --- |
+| blocker | 6 | autopilot-fit 3 · codebase-arch 1 · security-publicsafety 2 |
+| major | 13 | requirements-fidelity 2 · autopilot-fit 3 · measurement 5 · security 3 |
+| minor | 8 | spec reinforcement |
+| nit | 4 | notation/readability |
+
+Severities adjusted by adversarial verification are also recorded: `precision-recall-primitive-does-not-exist` was claimed as a blocker →
+**downgraded to major** (vuln already has `core/vulnerability/evaluation.py` implementing precision/recall, so a third engine
+would not be built from scratch). `ARCH-VULN-03`/`ARCH-VULN-05`/`vuln-snapshot-path-redaction`/
+`vuln-llm-input-leak` were downgraded to minor/nit because the code is already safe or the design was partly aware. A review
+with solid code evidence.
+
+## blocker (6): incorporated in v2
+
+| id | Dimension | Problem | v2 resolution |
+| --- | --- | --- | --- |
+| `AP-03`/`VD-01` | autopilot | The whole 'Autopilot Execution Shape' section is missing (zero mentions of goal_id/execution_mode/allowed_writes/acceptance_checks/stop_conditions/SoT promotion) → goal-setup is blocked immediately | New §A (1:1 transfer of the secret §Autopilot Execution Shape): goal_id=`ghas-quality-vuln-parity`, long-single-goal/stop-conditions-only/PR, allowed_writes whitelist, acceptance_checks (phase-2a base+diff), canonical stop_conditions + the vuln-relevant ones, goal-setup updates 3 files together |
+| `vuln-sot-path-gitignored-gate-blind`/`VD-01` | security | The SoT sits in `.claude/specs` (`.gitignore:72`), so autopilot_gate blocks it as outside-allowed_writes and public_safety cannot scan it | §A: promote the SoT into git at `docs/workbench/specs/ghas-quality-vuln-subtrack/` (committed copy only); only the grill originals stay in .claude. Add that docs path to allowed_writes and `public_safety --path` to acceptance_checks. Stated as an M0 deliverable |
+| `vuln-governance-wildcard-self-modify` | security | Copying the broad allowed_writes `governance/**` (current autopilot_goal.yml:27) would let autopilot autonomously modify stop_conditions, autopilot_gate.py, and public_safety.py | §A: broad `governance/**` forbidden; the vuln-only gate `governance/vuln_parity_slo.py` is the **single** whitelisted entry. Fixed decision: no autonomous modification of the 3 files. The separation from the secret `parity_slo.py` (separate file) is stated in the §5 alignment table |
+| `AP-01` | autopilot | M4 live-fetch sits in the middle of the autonomous M-sequence, making M5/M6/M7 done depend on a human snapshot → abandons the secret two-layer split | §B: two-layer split of autonomous layer (M1~M3)/H-track (H1~H4). M4 live-fetch moves to the H-track. §0/§8 rewritten so that autonomous goal done = synthetic regression gate enforce + report-only parity wiring (proven with synthetic fixtures). Convention: on PR merge CURRENT.md states "parity SLO enforce 미달성, H-track 대기" (parity SLO enforce not yet achieved, awaiting H-track) |
+| `AP-02`/`ARCH-VULN-01` | autopilot/arch | M3 disposition 'Option 1 = reuse set_finding_disposition' violates a code precondition (store.py:998-1000 raises ValueError when FINDING_STATE is absent; zero VulnerabilityFinding records in the nosql store) → runs straight into the storage-projection stop-condition | §C: M3 durable disposition **removed from the autonomous layer** and reclassified to the H-track. The v1 autonomous vuln verifier keeps its existing throwaway JSONL behavior (not durable). §4.3 pins that both Option 1 and Option 2 hit storage-projection (blocks same-blocker repetition) |
+
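+## major (13): incorporated in v2
+
+| id | Dimension | Problem | v2 resolution |
+| --- | --- | --- | --- |
+| `VD-02`/`precision-recall-primitive` | requirements/measure | No lock-in preventing the parity matcher from adding a new precision/recall path (`core/vulnerability/evaluation.py` already exists as a third engine) | §D: M1 done invariant "zero new lines of precision/recall or gate computation code, reuse `core/vulnerability/evaluation.py`, converge only through the CodeScanAlertRecord→VulnerabilityEvaluationKey adapter". The §5 divergence table states that synthetic evaluate and the parity matcher share the same computation core (blocks a fourth engine). Separates the meaning of coverage from precision/recall |
+| `VD-03` | requirements | The synthetic regression gate (VulnerabilityEvaluationKey: exact match on file+line_start+rule_id, confirmed in code) contradicts V-Q2 rule-class normalization | §E: both gates adopt rule-class normalization + line-window EvaluationKey semantics (recommended option). The expected-findings schema and the point where normalization applies are fixed in §4.2. VFR8 alignment condition stated |
+| `AP-04`/`vuln-active-goal-slot-eviction` | autopilot/security | The active_goal slot is held by personal-prod-deploy (current.yml:40), yet slot contention and the default-off merge path were undecided | §J: following the secret pattern, vuln autonomous code merges without an active_goal slot by taking main (theirs) for the 3 governance files. Actually switching slot ownership is a user decision (stop/escalate). The goal-setup procedure that updates the 3 files together is in §A |
+| `AP-05`/`vuln-existing-scan-default-invariance` | autopilot/security | Undecided: whether M2 inline gating is default-on, whether existing scan-vuln output stays unchanged, and how this relates to stop-conditions | §K: only the parts that are deterministic, metadata-only, and guaranteed by a suppression-rate regression are default-on; new rule suppression that changes behavior (code_flow gating, new suppression of low-confidence rules) is gated. M2 done invariant: default-on does not break synthetic recall≥0.99 (canary TPs preserved). Invariant: existing scan-vuln default output unchanged |
+| `AP-06` | autopilot | The vuln design had no multi-agent review, yet §0 and §3 describe autonomous sequencing as settled | Resolved by producing this review.md. v2 incorporates every blocker/major. The §0 assertion is revised to presuppose "review-incorporated v2" |
+| `synthetic-self-fulfilling`/`vuln-synthetic-fixture-self-fulfilling` | measure/security | Synthetic weight is higher, yet the self-fulfilling defense (adversarial fixtures) is weaker than in secret (one line in §7) | §F: M1 done names adversarial fixtures that turn red when normalization/line-window/filtering is missing (CWE-absent rule-token-only, source/sink line drift, CodeQL↔Semgrep same vulnerability with different rule.id, dismissed_reason cases). M2/M7 done add: a missed regression on independently written adversarial pairs turns red. The synthetic↔real snapshot divergence report goes into H-track baseline done |
+| `cwe-intersection-asymmetry-recall-inflation` | measure | CWE-intersection matching: (a) many-to-many collisions, (b) CWE-gap asymmetry, (c) `start_line//N` quantization contradicts the ±N window intent | §G: §4.2 defines 1:1 greedy maximum matching for CWE many-to-many within the same window (or CWE hierarchy match), defines a true window `\|alert_line−finding_line\|≤N` and fixes N, and exposes the CWE-gap rate and by-rule-token rescue rate as first-class baseline metadata with a reliability warning |
+| `dismissed-reason-snapshot-survivorship-bias` | measure | The precision/recall denominators were not formalized per state (vuln parallel of the secret `alert-state-not-filtered`) | §H: formulas fixed in §4.2/M1 done: recall denominator = open+fixed CodeQL alerts only; precision penalty = dismissed (fp/used-in-tests) accumulated separately. Per-repo dismissed density bias is a §7 risk. Location staleness of fixed alerts is verified in §4.1 |
+| `finding-id-not-stable-across-rule-normalization` | measure | In its fallback form the disposition persistence key finding_id is a hash of `{rule_id,file_path,line_start,message}` (confirmed in code) → lost when normalization, drift, or the message changes | §I: §4.3 explicitly decides between defining a stable key and enforcing stable partialFingerprints on Semgrep-compat output. **Since durable disposition is deferred to the H-track by §C, this item is stated as an H-track prerequisite** |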
+
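+## minor (8): summary of v2 incorporation
+
+- `VD-04`/`AP-08` (`report-only-enforce-unreachable`): the vuln parity gate branches automatically: 'threshold absent → report-only,
+  present → enforce (after the H-track baseline)'. Autonomous goal done is narrowed to 'synthetic regression enforce + parity report-only
+  wiring' (§B). Snapshot age above the limit → stale-degraded (silent pass forbidden), transferred from secret (§4.4).
+- `VD-05`: the M4 alternative 'compare-ghas --category code-vuln' does not match the facts (`cmd_compare_ghas` hard-wires
+  `secret-scanning/alerts` and does not register `--category`; confirmed) → that option is deleted and a new
+  `compare-codescan` is fixed as the default (§H1 work). Consistent with V-Q8.
+- `VD-06`: the M3/§4.3 durable disposition wiring did not state the storage projection stop-condition → §A
+  stop_conditions include `storage-projection-or-schema-migration-required`, and §C rules that Option 2 (vuln-only
+  partition) counts as a storage projection.
+- `codeql-only-oracle-language-coverage-bias`: excluding CodeQL-unsupported languages (PHP/Bash/IaC) from the SLO is an undisclosed bias in
+  baseline representativeness → the baseline report (H-track) exposes 'share of languages under parity measurement / C-monitor-only share' as
+  first-class metadata (added to VFR3). Goal done carries the scope qualifier 'measured population = CodeQL-supported languages' (§G, §8).
+- `rule-token-fallback-spec-underdefined`: the by-rule-token partial-match predicate was undefined → §4.2: after stop-token
+  removal, match only on exact set equality of the core vulnerability-class tokens (no partial overlap). Expanding the CWE bridge mapping table is
+  made an M1 task to minimize reliance on by-rule-token.
+- `vuln-snapshot-path-redaction-unspecified`: the redaction policy for snapshot location.path was unspecified, and
+  public_safety catches only absolute paths (`identifier.private-path`, confirmed) → §4.1/H1 done state 'real snapshots are kept on a
+  gitignored private path and never committed (the secret track's double block), only synthetic fixtures are committed, passing public_safety is
+  an auxiliary check, and gitignore is the first line of defense against relative-path leaks'.
+- `vuln-existing-scan-default-invariance` (invariant): existing scan default invariance was not fixed in design done →
+  M2 done fixes 'no previously exposed finding is suppressed without authorization' via a synthetic regression (§K).
+- `VD-07`: M2 '≥N vulnerability classes' and line-window N were placeholders → M2 done lists the minimum class set explicitly
+  (fixed at 5: SQLi/XSS/path-traversal/command-injection/SSRF); line-window N is stamped 'fixed in M1' (§G).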
+
+## nit: summary of v2 incorporation
+
+- `ARCH-VULN-03`: state the reuse/new surface split for the M4 fetch (reused = GhApiRunner GET-only guard, pagination,
+  redaction helper; new = CodeScanAlertRecord, normalization, compare-codescan). Reflected in §H1 and §5.
+- `ARCH-VULN-05`: matching triage_state↔Verdict vocabulary ≠ durable recording; NEEDS_REVIEW is not recorded durably.
+  §C (H-track durable) states that only terminal verdicts are recorded and NEEDS_REVIEW is not.
+- `vuln-llm-input-leak-surface-verified-ok`: current LLM input redaction is solid (confirmed in code) → M2/M6 done carry the
+  invariant 'new verifier inputs obey the redacted-metadata contract (trace as count/shape only, no plaintext related_location
+  path)' (§K).
+- `precision-recall-primitive` remainder: the semantic split between GhasComparisonResult (coverage) and VulnerabilityEvaluationResult
+  (precision/recall) is stated in §4.2/§5 (merged with §D).
+
+## Verdict
+
+design.md v2 **incorporates all 6 blockers and 13 majors**. What remains are Open Questions to resolve during implementation (CWE bridge
+mapping coverage, fixing the line-window N value, refining the rule-token predicate, expanding synthetic corpus classes). **goal-setup
+may proceed**, provided goal-setup (1) promotes the SoT to `docs/workbench/specs/ghas-quality-vuln-subtrack/`,
+(2) writes allowed_writes/acceptance_checks/stop_conditions from the phase-2a template plus the §A diff, and
+(3) updates autopilot_goal.yml goal_id, current.yml active_goal, and CURRENT.md together in one commit.
+
+**Key change in brief**: reclassifying M3 durable disposition made **the autonomous scope narrower than for secrets.** For secrets,
+disposition was already durably wired, so M3 (LLM-tier disposition) went into the autonomous layer; for vuln, finding
+state is entirely absent from the durable store (zero references in store.py), so creating durable disposition requires a new storage
+projection, which is a stop-condition. Therefore vuln autonomous goal done = **cheap inline tier + synthetic regression gate + report-only
+parity wiring**, while durable disposition (persisting vuln verdicts), real GHAS fetch, the baseline, and real parity enforce are
+**all split off** into the H-track.
diff --git a/eval/codescan-parity-corpus/README.md b/eval/codescan-parity-corpus/README.md
new file mode 100644
index 0000000..a19aff5
--- /dev/null
+++ b/eval/codescan-parity-corpus/README.md
@@ -0,0 +1,56 @@
+# codescan-parity-corpus
+
+Synthetic, fully fake adversarial corpus for the GHAS **code-scanning** parity
+matcher (`core/vulnerability/codescan_parity.py`). This is the vuln-domain analog
+of `eval/ghas-parity-corpus/`.
+
+## Provenance fail-closed
+
+`synthetic-snapshot.json` carries the top-level provenance marker
+`"source": "synthetic"`. `load_codescan_snapshot` refuses to load any snapshot
+whose `source` is not exactly `synthetic`, so a real (or unmarked) snapshot can
+never feed the autonomous harness. ZERO network: the matcher and loader are pure
+logic over redacted records.
+
+## All values are fake
+
+No real repository names, file paths, code snippets, secret material, or rule
+taxonomies. Paths are synthetic relatives under `synthetic_app/`. The CodeQL-style
+and Semgrep-style `rule.id` tokens are representative shapes, not copied content.
+
+## Adversarial cases (design §2 M1)
+
+The snapshot is engineered so that switching OFF one matcher responsibility turns
+a specific metric red:
+
+- **(a) CWE-absent rule-token-only** — alert `js/path-injection` (no CWE) vs
+  finding `javascript.lang.security.audit.path-traversal` (no CWE). Matches only
+  via the by-rule-token fallback (`path-injection` folds to `path` + `traversal`).
+- **(b) source/sink line drift** — alert at line 24, finding at line 26 (`+2 == N`),
+  just inside the fixed line window `N = 2`. Matches only with the window.
+- **(c) CodeQL ↔ Semgrep same vuln, different rule.id** — alert `py/sql-injection`
+  (CWE-89) vs finding `python.lang.security.audit.sql-injection` (CWE-89). Matches
+  by-cwe. Without the CWE bridge AND without rule-token it splits into FP + FN.
+- **(d) dismissed_reason cases** — a `dismissed` / `false positive` alert our
+  finding hits (precision penalty: `dismissed_fp_hit`, and a false positive), plus
+  a `won't fix` alert (TP-non-blocking, excluded from the recall denominator).
+
+## Denominator semantics (design §4.2)
+
+- **Recall denominator** = alerts in `state ∈ {open, fixed}` only.
+- **Precision penalty** = count of `dismissed` alerts with reason
+  `false positive` or `used in tests` whose location our finding surfaced
+  (`dismissed_fp_hit`); each is also a false positive.
+- **`won't fix`** = TP-non-blocking: excluded from the recall denominator, not a
+  precision penalty.
+
+The line window `N` is fixed at **2** in M1 (closes open question VD-07) and is
+pinned by the tests in `tests/test_codescan_parity.py`.
+
+## Precision/recall reuse
+
+The matcher synthesizes canonical keys and routes them through
+`core/vulnerability/evaluation.py::evaluate_vulnerability_findings` — there is
+**zero** new precision/recall formula. `result.detection` is a
+`VulnerabilityEvaluationResult` whose `.precision` / `.recall` come straight from
+the reused metrics layer.
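The denominator semantics described in this README can be sketched as follows. This is a minimal illustration only: the `Alert` class and both helper names are hypothetical stand-ins, not the fields or functions of `codescan_parity.py`, which is not shown in this diff.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical stand-in for the redacted alert record (field names are assumed).
@dataclass(frozen=True)
class Alert:
    number: int
    state: str  # "open" | "fixed" | "dismissed"
    dismissed_reason: str | None = None

# Dismissal reasons that act as a false-positive oracle, per the README.
FP_ORACLE_REASONS = frozenset({"false positive", "used in tests"})


def in_recall_denominator(alert: Alert) -> bool:
    """Only open/fixed alerts count as positive truth for recall."""
    return alert.state in {"open", "fixed"}


def is_dismissed_fp_hit(alert: Alert, finding_hit: bool) -> bool:
    """A finding landing on an FP-dismissed alert is a precision penalty."""
    return (
        finding_hit
        and alert.state == "dismissed"
        and alert.dismissed_reason in FP_ORACLE_REASONS
    )


# Mirrors the five alerts of the synthetic snapshot; findings hit alerts 1-4.
alerts = [
    Alert(1, "open"),
    Alert(2, "open"),
    Alert(3, "fixed"),
    Alert(4, "dismissed", "false positive"),
    Alert(5, "dismissed", "won't fix"),
]
hits = {1, 2, 3, 4}

recall_denominator = [a.number for a in alerts if in_recall_denominator(a)]
penalties = [a.number for a in alerts if is_dismissed_fp_hit(a, a.number in hits)]
```

Under these assumptions the `won't fix` alert (5) appears in neither list, which matches the TP-non-blocking rule above.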
diff --git a/eval/codescan-parity-corpus/synthetic-snapshot.json b/eval/codescan-parity-corpus/synthetic-snapshot.json
new file mode 100644
index 0000000..27f08ab
--- /dev/null
+++ b/eval/codescan-parity-corpus/synthetic-snapshot.json
@@ -0,0 +1,96 @@
+{
+  "schemaVersion": 1,
+  "source": "synthetic",
+  "description": "Adversarial synthetic GHAS code-scanning parity snapshot. ALL VALUES ARE FAKE: no real repo names, file paths, code snippets, or rule taxonomies. Synthetic relative paths only (synthetic_app/*). It pairs CodeQL-style alert rule.ids against Semgrep-style finding rule.ids so that a missing CWE bridge, a missing rule-token fallback, a too-narrow line window, or a missing state filter each turn a specific metric red. Adversarial cases: (a) CWE-absent rule-token-only match, (b) source/sink line drift inside the window, (c) CodeQL<->Semgrep same vuln different rule.id matched by-cwe, (d) dismissed_reason cases (false positive = precision penalty, won't fix = TP-non-blocking excluded from recall).",
+  "repoFullName": "synthetic-org/synthetic-codescan-repo",
+  "fetchedAt": "2026-06-21T12:00:00+00:00",
+  "alerts": [
+    {
+      "alertNumber": 1,
+      "ruleId": "py/sql-injection",
+      "securitySeverityLevel": "high",
+      "cweIds": ["CWE-89"],
+      "state": "open",
+      "filePath": "synthetic_app/handlers.py",
+      "lineStart": 10,
+      "lineEnd": 10,
+      "note": "(c) CodeQL rule.id py/sql-injection (CWE-89). Our finding uses a different Semgrep-style rule.id but the same CWE -> matches by-cwe. Without the CWE bridge AND without rule-token, it splits into FP+FN."
+    },
+    {
+      "alertNumber": 2,
+      "ruleId": "py/xss",
+      "securitySeverityLevel": "medium",
+      "cweIds": ["CWE-79"],
+      "state": "open",
+      "filePath": "synthetic_app/render.py",
+      "lineStart": 24,
+      "lineEnd": 24,
+      "note": "(b) source/sink line drift: our finding sits at line 26 (+2 == N), just inside the line window. Matches only with the window."
+    },
+    {
+      "alertNumber": 3,
+      "ruleId": "js/path-injection",
+      "securitySeverityLevel": "high",
+      "cweIds": [],
+      "state": "fixed",
+      "filePath": "synthetic_app/files.js",
+      "lineStart": 40,
+      "lineEnd": 40,
+      "note": "(a) CWE-absent rule-token-only: neither side carries a CWE. Matches only via the rule-token fallback (path-injection folds to path+traversal). state=fixed -> positive truth (recall denominator)."
+    },
+    {
+      "alertNumber": 4,
+      "ruleId": "py/xss",
+      "securitySeverityLevel": "medium",
+      "cweIds": ["CWE-79"],
+      "state": "dismissed",
+      "dismissedReason": "false positive",
+      "filePath": "synthetic_app/legacy.py",
+      "lineStart": 55,
+      "lineEnd": 55,
+      "note": "(d) dismissed false positive (FP-oracle). Our finding lands here, so it is a precision penalty (dismissed_fp_hit) and a false positive. Excluded from the recall denominator -> disabling the state filter makes this an undetected FN and drops recall."
+    },
+    {
+      "alertNumber": 5,
+      "ruleId": "py/command-injection",
+      "securitySeverityLevel": "high",
+      "cweIds": ["CWE-78"],
+      "state": "dismissed",
+      "dismissedReason": "won't fix",
+      "filePath": "synthetic_app/ops.py",
+      "lineStart": 70,
+      "lineEnd": 70,
+      "note": "(d) won't fix (TP-non-blocking). We do NOT detect it. Excluded from the recall denominator and not a precision penalty."
+    }
+  ],
+  "findings": [
+    {
+      "ruleId": "python.lang.security.audit.sql-injection",
+      "sourceTool": "semgrep",
+      "cweIds": ["CWE-89"],
+      "filePath": "synthetic_app/handlers.py",
+      "lineStart": 10
+    },
+    {
+      "ruleId": "py/xss",
+      "sourceTool": "codeql",
+      "cweIds": ["CWE-79"],
+      "filePath": "synthetic_app/render.py",
+      "lineStart": 26
+    },
+    {
+      "ruleId": "javascript.lang.security.audit.path-traversal",
+      "sourceTool": "semgrep",
+      "cweIds": [],
+      "filePath": "synthetic_app/files.js",
+      "lineStart": 40
+    },
+    {
+      "ruleId": "py/xss",
+      "sourceTool": "codeql",
+      "cweIds": ["CWE-79"],
+      "filePath": "synthetic_app/legacy.py",
+      "lineStart": 55
+    }
+  ]
+}
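The fail-closed provenance check that guards this fixture could look roughly like the sketch below. `load_synthetic_snapshot` and `ProvenanceError` are hypothetical names chosen for illustration; the real `load_codescan_snapshot` is not shown in this part of the diff and may differ in both signature and error type.

```python
import json
import tempfile
from pathlib import Path


class ProvenanceError(ValueError):
    """Raised when a snapshot is not explicitly marked synthetic (assumed name)."""


def load_synthetic_snapshot(path: Path) -> dict:
    """Load a snapshot, refusing anything whose source is not exactly 'synthetic'."""
    payload = json.loads(path.read_text(encoding="utf-8"))
    # Fail closed: a missing marker is rejected the same way as a wrong one.
    if not isinstance(payload, dict) or payload.get("source") != "synthetic":
        raise ProvenanceError(f"refusing non-synthetic snapshot: {path.name}")
    return payload


# Usage: an unmarked snapshot is rejected, a marked one loads.
with tempfile.TemporaryDirectory() as tmp:
    marked = Path(tmp) / "marked-snapshot.json"
    marked.write_text(json.dumps({"source": "synthetic", "alerts": []}), encoding="utf-8")
    unmarked = Path(tmp) / "unmarked-snapshot.json"
    unmarked.write_text(json.dumps({"alerts": []}), encoding="utf-8")

    loaded = load_synthetic_snapshot(marked)
    try:
        load_synthetic_snapshot(unmarked)
        rejected = False
    except ProvenanceError:
        rejected = True
```

Checking for equality with the marker, rather than for the absence of a "real" marker, is what makes the check fail closed.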
diff --git a/eval/synthetic-code-vuln/corpus-snapshot.json b/eval/synthetic-code-vuln/corpus-snapshot.json
new file mode 100644
index 0000000..5886cd2
--- /dev/null
+++ b/eval/synthetic-code-vuln/corpus-snapshot.json
@@ -0,0 +1,123 @@
+{
+  "schemaVersion": 1,
+  "source": "synthetic",
+  "name": "synthetic-code-vuln-5class-corpus",
+  "description": "Synthetic, fully fake 5-class code-vuln regression corpus (design VD-07). ALL VALUES ARE FAKE: no real repo names, file paths, code snippets, or rule taxonomies. Synthetic relative paths only (synthetic_app/*). Covers SQLi (CWE-89), XSS (CWE-79), path-traversal (CWE-22), command-injection (CWE-78), SSRF (CWE-918). Each class has a vulnerable case (expectedFindings) and a safe case (safeCases, must NOT be flagged -> exercises precision). actualFindings deliberately mix CodeQL-style and Semgrep-style rule.ids so only the normalization-aware path (RuleClassNormalizer + line-window, shared with the M1 parity matcher) matches them. Safe-case findings are intentionally absent from actualFindings, so a precision-correct scanner that does not flag them keeps precision high.",
+  "lineWindow": 2,
+  "expectedFindings": [
+    {
+      "vulnClass": "sql-injection",
+      "filePath": "synthetic_app/handlers.py",
+      "lineStart": 42,
+      "ruleId": "python.lang.security.audit.sql-injection",
+      "cweIds": ["CWE-89"],
+      "note": "Expected uses a Semgrep-style rule.id; the matching actual finding uses a CodeQL-style py/sql-injection. Matches by-cwe via the shared normalizer (cross-dialect)."
+    },
+    {
+      "vulnClass": "xss",
+      "filePath": "synthetic_app/render.py",
+      "lineStart": 18,
+      "ruleId": "python.lang.security.audit.xss",
+      "cweIds": ["CWE-79"],
+      "note": "Cross-dialect: actual is CodeQL py/reflected-xss carrying CWE-79; matches by-cwe."
+    },
+    {
+      "vulnClass": "path-traversal",
+      "filePath": "synthetic_app/files.py",
+      "lineStart": 30,
+      "ruleId": "python.lang.security.audit.path-traversal",
+      "cweIds": [],
+      "note": "No CWE on either side: matches only via the rule-token fallback (both rule.ids reduce to the core tokens path+traversal once language/tool stop-tokens are stripped). Actual finding sits at line 32 (+2 == N), inside the line window."
+    },
+    {
+      "vulnClass": "command-injection",
+      "filePath": "synthetic_app/ops.py",
+      "lineStart": 55,
+      "ruleId": "python.lang.security.audit.command-injection",
+      "cweIds": ["CWE-78"],
+      "note": "Cross-dialect: actual is CodeQL py/command-line-injection with CWE-78; matches by-cwe."
+    },
+    {
+      "vulnClass": "ssrf",
+      "filePath": "synthetic_app/fetch.py",
+      "lineStart": 12,
+      "ruleId": "python.lang.security.audit.ssrf",
+      "cweIds": ["CWE-918"],
+      "note": "Cross-dialect: actual is CodeQL py/request-forgery with CWE-918; matches by-cwe."
+    }
+  ],
+  "actualFindings": [
+    {
+      "vulnClass": "sql-injection",
+      "filePath": "synthetic_app/handlers.py",
+      "lineStart": 42,
+      "ruleId": "py/sql-injection",
+      "sourceTool": "codeql",
+      "cweIds": ["CWE-89"]
+    },
+    {
+      "vulnClass": "xss",
+      "filePath": "synthetic_app/render.py",
+      "lineStart": 18,
+      "ruleId": "py/reflected-xss",
+      "sourceTool": "codeql",
+      "cweIds": ["CWE-79"]
+    },
+    {
+      "vulnClass": "path-traversal",
+      "filePath": "synthetic_app/files.py",
+      "lineStart": 32,
+      "ruleId": "javascript.lang.security.audit.path-traversal",
+      "sourceTool": "semgrep",
+      "cweIds": []
+    },
+    {
+      "vulnClass": "command-injection",
+      "filePath": "synthetic_app/ops.py",
+      "lineStart": 55,
+      "ruleId": "py/command-line-injection",
+      "sourceTool": "codeql",
+      "cweIds": ["CWE-78"]
+    },
+    {
+      "vulnClass": "ssrf",
+      "filePath": "synthetic_app/fetch.py",
+      "lineStart": 12,
+      "ruleId": "py/request-forgery",
+      "sourceTool": "codeql",
+      "cweIds": ["CWE-918"]
+    }
+  ],
+  "safeCases": [
+    {
+      "vulnClass": "sql-injection",
+      "filePath": "synthetic_app/safe_handlers.py",
+      "lineStart": 40,
+      "note": "Parameterized query: not vulnerable. A precision-correct scanner does NOT flag this -> intentionally absent from actualFindings."
+    },
+    {
+      "vulnClass": "xss",
+      "filePath": "synthetic_app/safe_render.py",
+      "lineStart": 20,
+      "note": "Auto-escaped template output: not vulnerable."
+    },
+    {
+      "vulnClass": "path-traversal",
+      "filePath": "synthetic_app/safe_files.py",
+      "lineStart": 28,
+      "note": "Path is validated against an allowlist: not vulnerable."
+    },
+    {
+      "vulnClass": "command-injection",
+      "filePath": "synthetic_app/safe_ops.py",
+      "lineStart": 50,
+      "note": "Fixed argv list, no shell: not vulnerable."
+    },
+    {
+      "vulnClass": "ssrf",
+      "filePath": "synthetic_app/safe_fetch.py",
+      "lineStart": 10,
+      "note": "URL host validated against an allowlist: not vulnerable."
+    }
+  ]
+}
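The two normalization steps this corpus exercises, the rule-token fallback and the line window, can be sketched as below. This is an assumption-laden illustration: the stop-token subset, the single synonym rule, and all function names are hypothetical, and the real `RuleClassNormalizer` may tokenize and fold differently.

```python
import re

# Small assumed subset of stop-tokens (language, tool, and taxonomy noise).
STOP_TOKENS = frozenset(
    {"py", "python", "js", "javascript", "lang", "security", "audit"}
)

# Assumed synonym rule, based only on the fixture note that
# "path-injection folds to path+traversal".
SYNONYMS = {
    frozenset({"path", "injection"}): frozenset({"path", "traversal"}),
}


def core_tokens(rule_id: str) -> frozenset[str]:
    """Split a rule.id into tokens, drop stop-tokens, then apply synonym folding."""
    raw = (t for t in re.split(r"[^a-z0-9]+", rule_id.lower()) if t)
    tokens = frozenset(raw) - STOP_TOKENS
    return SYNONYMS.get(tokens, tokens)


def same_class_by_rule_token(left_rule_id: str, right_rule_id: str) -> bool:
    """Exact-set match of surviving core tokens; empty sets never match."""
    left, right = core_tokens(left_rule_id), core_tokens(right_rule_id)
    return bool(left) and left == right


def within_line_window(expected_line: int, actual_line: int, n: int = 2) -> bool:
    """True when |expected - actual| <= n (the corpus pins n = 2)."""
    return abs(expected_line - actual_line) <= n
```

With these assumptions, `js/path-injection` and `javascript.lang.security.audit.path-traversal` reduce to the same core-token set, while `py/sql-injection` does not match either; lines 30 and 32 fall inside the window and lines 30 and 33 do not.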
diff --git a/governance/autopilot_goal.yml b/governance/autopilot_goal.yml
index 81f5ce2..4951843 100644
--- a/governance/autopilot_goal.yml
+++ b/governance/autopilot_goal.yml
@@ -1,5 +1,5 @@
 schema_version: 1
-goal_id: personal-prod-deploy
+goal_id: ghas-quality-vuln-parity
 execution_mode:
   style: long-single-goal
   human_gate: stop-conditions-only
@@ -15,16 +15,14 @@ policy_decisions:
   fork_prs: blocked-or-skipped-before-secrets
   public_artifacts: synthetic-or-redacted-only
 allowed_writes:
-  - docs/workbench/specs/phase-2a-sarif-native-sast/**
-  - docs/workbench/agentic-workflows/2026-06-20-phase-2a-sarif-import-first-goal.md
+  - docs/workbench/specs/ghas-quality-vuln-subtrack/**
+  - docs/workbench/agentic-workflows/2026-06-21-ghas-quality-vuln-parity-goal.md
   - docs/views/research-and-technical-decisions.md
   - src/security_scanner/**
   - tests/**
-  - deploy/systemd/user/**
   - examples/**
   - eval/**
-  - docs/workbench/**
-  - governance/**
+  - governance/vuln_parity_slo.py
   - ledger/**
   - CURRENT.md
 acceptance_checks:
@@ -38,7 +36,8 @@ acceptance_checks:
   - uv run python -m governance.rebuild_ledger_index --check
   - uv run python -m governance.render_github_ruleset --output governance/main_ruleset.json --check
   - uv run python -m governance.public_safety --diff origin/main...HEAD
-  - uv run python -m governance.public_safety --path docs/workbench/specs/phase-2a-sarif-native-sast --path docs/views/research-and-technical-decisions.md
+  - uv run python -m governance.public_safety --path docs/workbench/specs/ghas-quality-vuln-subtrack
+  - uv run python -m governance.vuln_parity_slo --check
   - uv run python -m governance.autopilot_gate --base origin/main
 stop_conditions:
   - public-safety-hit
diff --git a/governance/current.yml b/governance/current.yml
index 1ea16e8..d8177d5 100644
--- a/governance/current.yml
+++ b/governance/current.yml
@@ -37,7 +37,7 @@ gates:
   proof_ref: ''
   proof_hash: ''
 autopilot:
-  active_goal: personal-prod-deploy
+  active_goal: ghas-quality-vuln-parity
   merge_mode: guarded-auto-merge
   last_auto_merge: ledger:20260617T003405Z-autopilot-3236f4
 open_decisions: []
diff --git a/governance/vuln_parity_slo.py b/governance/vuln_parity_slo.py
new file mode 100644
index 0000000..1808bff
--- /dev/null
+++ b/governance/vuln_parity_slo.py
@@ -0,0 +1,377 @@
+"""GHAS code-scanning parity SLO gate (M3) — report-only until a threshold exists.
+
+This gate measures our code-vulnerability detector's per-repo GHAS *parity*
+against frozen **synthetic** code-scanning snapshot fixtures and reports macro
+precision/recall. It is the autonomous-layer CI vehicle for the
+``ghas-quality-vuln-parity`` goal — the 1:1 vuln-domain transfer of the proven
+secret-track gate ``governance/parity_slo.py``.
+
+Two-mode by design (design.md §4.4 / §2 M3, requirements measure-first):
+
+* **report-only** — the default and the ONLY mode reachable autonomously: when no
+  threshold file exists (or it is empty), the gate prints the measured numbers and
+  ALWAYS exits 0. It never blocks. The real, calibrated thresholds are committed
+  only after the human-gated H1~H3 track measures a real CodeQL baseline, so until
+  then there is nothing legitimate to enforce.
+* **enforce** — reachable only once a human commits a threshold file: macro
+  precision/recall below the committed minimums fail the gate (exit 1). This is
+  the measure-first auto-branch (threshold present => enforce).
+
+Staleness is surfaced, never silently passed (design ``staleness-passive-only``,
+scan-health precedent): a snapshot older than the max age is reported as
+``stale-degraded``. In report-only that is a visible warning (exit 0); in enforce
+it fails (exit 1) so a stale snapshot cannot silently satisfy the gate. A snapshot
+with no parseable ``fetched_at`` is treated as stale (unknown age must not pass).
+
+Inputs are SYNTHETIC fixtures only.
+``core.vulnerability.codescan.load_codescan_snapshot`` fails closed unless
+``source: synthetic`` provenance is set, so a real GHAS export cannot drive it.
+
+Computation reuse: per-repo precision/recall come from the M1 parity matcher
+(``core.vulnerability.codescan_parity.compare_codescan_alerts_with_findings``),
+whose ``.detection`` is the metrics-layer ``core.vulnerability.evaluation``
+result. This module adds NO new precision/recall formula: it only loads
+snapshots, macro-aggregates (``aggregate_codescan_parity``, which averages rather
+than re-deriving TP/(TP+FP)), reads an optional threshold, and judges
+report-only vs enforce vs stale.
+"""
+
+from __future__ import annotations
+
+import argparse
+import datetime as dt
+import json
+import sys
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Any
+
+import yaml
+
+from security_scanner.core.vulnerability.codescan import (
+    RuleClassNormalizer,
+    load_codescan_snapshot,
+)
+from security_scanner.core.vulnerability.codescan_parity import (
+    MacroCodeScanParityResult,
+    aggregate_codescan_parity,
+    compare_codescan_alerts_with_findings,
+)
+
+# The committed M1 synthetic code-scanning snapshot fixture dir (mirrors the
+# secret gate's DEFAULT_SNAPSHOT_DIR = eval/ghas-parity-corpus).
+DEFAULT_SNAPSHOT_DIR = Path("eval/codescan-parity-corpus")
+# A SEPARATE threshold file from the secret gate's, so the two tracks never
+# collide. It does NOT exist in the autonomous layer => report-only.
+DEFAULT_THRESHOLD_PATH = Path("governance/vuln_parity_slo_thresholds.yml")
+
+# A snapshot older than this is reported as stale-degraded. Synthetic fixtures
+# have no real freshness obligation, so the default is generous; the real cadence
+# SLA is set by the human-gated H-track.
+DEFAULT_MAX_SNAPSHOT_AGE_DAYS = 90
+
+
+@dataclass(frozen=True)
+class VulnParitySloThresholds:
+    """Calibrated minimums. Absent until the human-gated H-track commits them."""
+
+    precision_min: float
+    recall_min: float
+
+
+@dataclass(frozen=True)
+class VulnParitySloResult:
+    """Outcome of one vuln parity-SLO evaluation pass."""
+
+    mode: str  # "report-only" | "enforce"
+    macro: MacroCodeScanParityResult
+    snapshot_count: int
+    stale: bool
+    stale_snapshots: tuple[str, ...]
+    thresholds: VulnParitySloThresholds | None
+    failures: tuple[str, ...]
+
+    @property
+    def total_dismissed_fp_hit(self) -> int:
+        return self.macro.total_dismissed_fp_hit
+
+    @property
+    def passed(self) -> bool:
+        """Whether the gate should exit 0.
+
+        report-only never blocks (exit 0 even when stale or below target — there
+        is no committed target to enforce yet). enforce blocks on any failure,
+        including a stale snapshot (staleness must not silently pass).
+        """
+        if self.mode == "report-only":
+            return True
+        return not self.failures
+
+
+def load_thresholds(path: Path) -> VulnParitySloThresholds | None:
+    """Load calibrated thresholds, or None when absent/empty (report-only)."""
+    if not path.exists():
+        return None
+    raw = path.read_text(encoding="utf-8").strip()
+    if not raw:
+        return None
+    data = yaml.safe_load(raw)
+    if not isinstance(data, dict) or not data:
+        return None
+    try:
+        precision_min = float(data["precision_min"])
+        recall_min = float(data["recall_min"])
+    except (KeyError, TypeError, ValueError) as exc:
+        raise ValueError(
+            "vuln_parity_slo thresholds must define numeric precision_min and "
+            "recall_min"
+        ) from exc
+    return VulnParitySloThresholds(
+        precision_min=precision_min, recall_min=recall_min
+    )
+
+
+def discover_snapshots(snapshot_dir: Path) -> list[Path]:
+    """Return committed synthetic snapshot fixture files (sorted, deterministic)."""
+    if not snapshot_dir.exists():
+        return []
+    return sorted(snapshot_dir.glob("*snapshot*.json"))
+
+
+def _snapshot_is_stale(
+    fetched_at: str | None, *, now: dt.datetime, max_age_days: int
+) -> bool:
+    """True when the snapshot's fetched_at is older than the max age.
+
+    A snapshot with no parseable fetched_at is treated as stale (unknown age must
+    not silently pass — design staleness-passive-only).
+    """
+    if not fetched_at:
+        return True
+    parsed = _parse_timestamp(fetched_at)
+    if parsed is None:
+        return True
+    age = now - parsed
+    return age > dt.timedelta(days=max_age_days)
+
+
+def _parse_timestamp(value: str) -> dt.datetime | None:
+    text = value.strip()
+    if text.endswith("Z"):
+        text = text[:-1] + "+00:00"
+    try:
+        parsed = dt.datetime.fromisoformat(text)
+    except ValueError:
+        return None
+    if parsed.tzinfo is None:
+        parsed = parsed.replace(tzinfo=dt.timezone.utc)
+    return parsed
+
+
+def evaluate_vuln_parity_slo(
+    *,
+    snapshot_dir: Path = DEFAULT_SNAPSHOT_DIR,
+    threshold_path: Path = DEFAULT_THRESHOLD_PATH,
+    now: dt.datetime | None = None,
+    max_age_days: int = DEFAULT_MAX_SNAPSHOT_AGE_DAYS,
+) -> VulnParitySloResult:
+    """Measure macro parity over synthetic snapshots and judge the SLO mode."""
+    now = now or dt.datetime.now(dt.timezone.utc)
+    thresholds = load_thresholds(threshold_path)
+    mode = "enforce" if thresholds is not None else "report-only"
+
+    normalizer = RuleClassNormalizer()
+    snapshot_paths = discover_snapshots(snapshot_dir)
+
+    repo_results = []
+    stale_snapshots: list[str] = []
+    for path in snapshot_paths:
+        # load_codescan_snapshot fails closed on non-synthetic provenance.
+        snapshot = load_codescan_snapshot(path)
+        if _snapshot_is_stale(
+            snapshot.fetched_at, now=now, max_age_days=max_age_days
+        ):
+            stale_snapshots.append(path.name)
+        repo_results.append(
+            compare_codescan_alerts_with_findings(
+                repository=snapshot.repo_full_name,
+                alerts=snapshot.alerts,
+                findings=snapshot.findings,
+                normalizer=normalizer,
+            )
+        )
+
+    macro = aggregate_codescan_parity(repo_results)
+    stale = bool(stale_snapshots)
+
+    failures: list[str] = []
+    if thresholds is not None:
+        if macro.macro_precision < thresholds.precision_min:
+            failures.append(
+                f"macro precision {macro.macro_precision:.4f} < minimum "
+                f"{thresholds.precision_min:.4f}"
+            )
+        if macro.macro_recall < thresholds.recall_min:
+            failures.append(
+                f"macro recall {macro.macro_recall:.4f} < minimum "
+                f"{thresholds.recall_min:.4f}"
+            )
+        if stale:
+            # In enforce mode a stale snapshot is a hard failure: it must not
+            # silently satisfy the gate.
+            failures.append(
+                "stale-degraded: snapshot(s) older than "
+                f"{max_age_days}d: {', '.join(stale_snapshots)}"
+            )
+
+    return VulnParitySloResult(
+        mode=mode,
+        macro=macro,
+        snapshot_count=len(snapshot_paths),
+        stale=stale,
+        stale_snapshots=tuple(stale_snapshots),
+        thresholds=thresholds,
+        failures=tuple(failures),
+    )
+
+
+def render_report(result: VulnParitySloResult) -> str:
+    """Render a public-safe, aggregate-only vuln parity-SLO report."""
+    macro = result.macro
+    lines = [
+        "GHAS Vuln Code-Scanning Parity SLO",
+        "==================================",
+        f"Mode: {result.mode}",
+        f"Snapshots measured: {result.snapshot_count}",
+        f"Repos: {macro.repo_count}",
+        f"Macro precision: {macro.macro_precision:.4f}",
+        f"Macro recall: {macro.macro_recall:.4f}",
+        f"Matched by-cwe: {macro.total_matched_by_cwe}",
+        f"Matched by-rule-token: {macro.total_matched_by_rule_token}",
+        f"Unmatched: {macro.total_unmatched}",
+        f"Dismissed-FP hit: {macro.total_dismissed_fp_hit}",
+        f"CWE-deficit rate: {macro.macro_cwe_deficit_rate:.4f}",
+        f"Rule-token rescue rate: {macro.macro_rule_token_rescue_rate:.4f}",
+    ]
+    if result.thresholds is not None:
+        lines.append(
+            f"Thresholds: precision_min {result.thresholds.precision_min:.4f}, "
+            f"recall_min {result.thresholds.recall_min:.4f}"
+        )
+    else:
+        lines.append(
+            "Thresholds: none committed (report-only; enforce pending H-track)"
+        )
+    # Surface snapshot age / staleness (NFR, scan-health precedent): always state
+    # the staleness verdict, not only when stale.
+    if result.stale:
+        lines.append(f"Stale-degraded: {', '.join(result.stale_snapshots)}")
+    else:
+        lines.append("Snapshot freshness: OK (within max age)")
+    if result.mode == "report-only":
+        lines.append("Result: REPORT-ONLY (never blocks; measure-first)")
+    elif result.failures:
+        lines.append("Result: FAIL")
+        for failure in result.failures:
+            lines.append(f"  - {failure}")
+    else:
+        lines.append("Result: PASS")
+    return "\n".join(lines) + "\n"
+
+
+def main(argv: list[str] | None = None) -> int:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("--root", type=Path, default=Path.cwd())
+    parser.add_argument(
+        "--snapshot-dir",
+        type=Path,
+        default=DEFAULT_SNAPSHOT_DIR,
+        help="directory of committed synthetic code-scanning snapshot fixtures",
+    )
+    parser.add_argument(
+        "--threshold-path",
+        type=Path,
+        default=DEFAULT_THRESHOLD_PATH,
+        help="optional calibrated threshold yml (absent => report-only)",
+    )
+    parser.add_argument(
+        "--max-age-days",
+        type=int,
+        default=DEFAULT_MAX_SNAPSHOT_AGE_DAYS,
+        help="snapshot age beyond which it is stale-degraded",
+    )
+    parser.add_argument(
+        "--check",
+        action="store_true",
+        help="evaluate and report; exit non-zero only in enforce mode failure",
+    )
+    parser.add_argument(
+        "--json", action="store_true", help="emit a machine-readable JSON summary"
+    )
+    args = parser.parse_args(argv)
+
+    root = args.root.resolve()
+    snapshot_dir = (
+        args.snapshot_dir
+        if args.snapshot_dir.is_absolute()
+        else root / args.snapshot_dir
+    )
+    threshold_path = (
+        args.threshold_path
+        if args.threshold_path.is_absolute()
+        else root / args.threshold_path
+    )
+
+    try:
+        result = evaluate_vuln_parity_slo(
+            snapshot_dir=snapshot_dir,
+            threshold_path=threshold_path,
+            max_age_days=args.max_age_days,
+        )
+    except Exception as exc:  # noqa: BLE001 - present any setup/provenance error.
+        print(f"vuln_parity_slo gate setup failed: {exc}", file=sys.stderr)
+        return 1
+
+    if args.json:
+        print(json.dumps(_result_to_dict(result), indent=2, sort_keys=True))
+    else:
+        print(render_report(result))
+
+    if result.passed:
+        return 0
+    for failure in result.failures:
+        print(f"vuln_parity_slo: {failure}", file=sys.stderr)
+    return 1
+
+
+def _result_to_dict(result: VulnParitySloResult) -> dict[str, Any]:
+    macro = result.macro
+    return {
+        "mode": result.mode,
+        "snapshotCount": result.snapshot_count,
+        "repoCount": macro.repo_count,
+        "macroPrecision": macro.macro_precision,
+        "macroRecall": macro.macro_recall,
+        "matchedByCwe": macro.total_matched_by_cwe,
+        "matchedByRuleToken": macro.total_matched_by_rule_token,
+        "unmatched": macro.total_unmatched,
+        "dismissedFpHit": macro.total_dismissed_fp_hit,
+        "cweDeficitRate": macro.macro_cwe_deficit_rate,
+        "ruleTokenRescueRate": macro.macro_rule_token_rescue_rate,
+        "stale": result.stale,
+        "staleSnapshots": list(result.stale_snapshots),
+        "thresholds": (
+            None
+            if result.thresholds is None
+            else {
+                "precisionMin": result.thresholds.precision_min,
+                "recallMin": result.thresholds.recall_min,
+            }
+        ),
+        "failures": list(result.failures),
+        "passed": result.passed,
+    }
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
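The two-mode judgement of `governance/vuln_parity_slo.py` can be condensed into the sketch below. It is a simplified illustration, not the module's code: `Thresholds`, `macro_average`, and `gate_exit_code` are hypothetical names, and treating the macro aggregate as a plain arithmetic mean of per-repo values is an assumption, since `aggregate_codescan_parity` is not shown in this diff.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical stand-in for the committed minimums."""

    precision_min: float
    recall_min: float


def macro_average(values: list[float]) -> float:
    """Arithmetic mean of per-repo values (assumed; 0.0 for no repos)."""
    return sum(values) / len(values) if values else 0.0


def gate_exit_code(
    per_repo: list[tuple[float, float]],
    thresholds: Thresholds | None,
    stale: bool,
) -> int:
    """Return 0 or 1 following the report-only / enforce split."""
    if thresholds is None:
        # report-only: never blocks, even when stale or below any target.
        return 0
    precision = macro_average([p for p, _ in per_repo])
    recall = macro_average([r for _, r in per_repo])
    failed = (
        precision < thresholds.precision_min
        or recall < thresholds.recall_min
        or stale  # enforce: a stale snapshot must not silently pass
    )
    return 1 if failed else 0
```

The same measured numbers therefore yield exit code 0 while no threshold file exists and can yield 1 once a human commits one, which is the measure-first behaviour the module docstring describes.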
diff --git a/src/security_scanner/core/vulnerability/__init__.py b/src/security_scanner/core/vulnerability/__init__.py
index d71e2aa..6220b1b 100644
--- a/src/security_scanner/core/vulnerability/__init__.py
+++ b/src/security_scanner/core/vulnerability/__init__.py
@@ -1,12 +1,15 @@
 """Code vulnerability finding model and SARIF-first helpers."""
 
 from security_scanner.core.vulnerability.evaluation import (
+    NormalizedExpectedFinding,
     VulnerabilityEvaluationKey,
     VulnerabilityEvaluationResult,
     VulnerabilityEvaluationThresholds,
     VulnerabilityExpectedFinding,
     evaluate_vulnerability_findings,
+    evaluate_vulnerability_findings_normalized,
     evaluate_vulnerability_gate,
+    load_vulnerability_corpus_normalized,
     load_vulnerability_expected_findings,
     render_vulnerability_evaluation_report,
 )
@@ -34,6 +37,7 @@
     "VULN_CATEGORY",
     "VULN_ENTITY_TYPE",
     "VULN_SCHEMA_VERSION",
+    "NormalizedExpectedFinding",
     "SarifImportError",
     "VulnerabilityEvaluationKey",
     "VulnerabilityEvaluationResult",
@@ -45,10 +49,12 @@
     "VulnerabilityLocation",
     "compute_vulnerability_finding_id",
     "evaluate_vulnerability_findings",
+    "evaluate_vulnerability_findings_normalized",
     "evaluate_vulnerability_gate",
     "evaluate_vulnerability_gate_policy",
     "import_sarif_file",
     "import_sarif_payload",
+    "load_vulnerability_corpus_normalized",
     "load_vulnerability_expected_findings",
     "render_vulnerability_evaluation_report",
     "render_vulnerability_report",
diff --git a/src/security_scanner/core/vulnerability/codescan.py b/src/security_scanner/core/vulnerability/codescan.py
new file mode 100644
index 0000000..bb997fc
--- /dev/null
+++ b/src/security_scanner/core/vulnerability/codescan.py
@@ -0,0 +1,342 @@
+"""GHAS code-scanning alert domain model + rule-class normalizer (M1).
+
+This is the vuln-domain analog of the proven secret-track artifacts
+(``baseline/ghas_api/normalize.py`` + the ``GhasAlertRecord`` value object). It
+turns a GHAS *code-scanning* alert into a redacted value object
+(:class:`CodeScanAlertRecord`) and collapses a tool's ``rule_id`` / CWE tags onto
+a single canonical vuln class so a CodeQL/Semgrep token-mismatch no longer splits
+one vulnerability across ``local_only`` (precision penalty) and ``ghas_only``
+(recall penalty).
+
+The matcher in :mod:`security_scanner.core.vulnerability.codescan_parity`
+performs the fuzzy (line-window) join on top of this; the precision/recall
+*formula* and gate *threshold* judgement stay in
+``core.vulnerability.evaluation`` (no new metric code here).
+
+:class:`CodeScanAlertRecord` is a PURE leaf value object kept in
+``core/vulnerability/``: it has NO coupling to the durable nosql store. Wiring
+the alert snapshot into a durable projection is the H4 storage-projection trap
+and is deliberately out of scope here.
+
+Normalization priority (design §4.2):
+    CWE intersection (by-cwe) > rule-token normalization (by-rule-token) > unmatched.
+"""
+
+from __future__ import annotations
+
+import json
+import re
+from collections.abc import Iterable, Mapping
+from dataclasses import dataclass, field
+from pathlib import Path
+
+from security_scanner.core.vulnerability.model import (
+    VulnerabilityFinding,
+    VulnerabilityLocation,
+    compute_vulnerability_finding_id,
+)
+
+# CWE token shape, mirrors sarif._CWE_RE so canonical ids line up (``CWE-NNN``).
+_CWE_RE = re.compile(r"(?i)cwe[-_/ ]?(\d{1,5})")
+
+
+# ---------------------------------------------------------------------------
+# Redacted alert value object (secret GhasAlertRecord parallel)
+# ---------------------------------------------------------------------------
+
+@dataclass(frozen=True)
+class CodeScanAlertRecord:
+    """Redacted GHAS code-scanning alert (no raw message/snippet/private path).
+
+    A pure value object: it carries only the redacted fields the parity matcher
+    needs and intentionally has NO storage-store coupling.
+    """
+
+    repository: str
+    alert_number: int
+    rule_id: str
+    security_severity_level: str | None
+    cwe_ids: tuple[str, ...]
+    state: str  # open | dismissed | fixed
+    dismissed_reason: str | None  # false positive | won't fix | used in tests | None
+    location_path: str | None
+    location_start_line: int | None
+    location_end_line: int | None
+    fetched_at: str | None = None
+    source_tool: str = "ghas-code-scanning"
+
+
+# ---------------------------------------------------------------------------
+# CWE bridge: CWE id -> canonical vuln class
+# ---------------------------------------------------------------------------
+
+# Initial coverage of the core vuln classes (design §2/§4.2). Extend by adding a
+# row; an EMPTY bridge maps nothing so the adversarial fixtures go red when the
+# CWE path is disabled.
+DEFAULT_CWE_BRIDGE: dict[str, str] = {
+    "CWE-89": "sql-injection",
+    "CWE-79": "xss",
+    "CWE-22": "path-traversal",
+    "CWE-23": "path-traversal",
+    "CWE-36": "path-traversal",
+    "CWE-78": "command-injection",
+    "CWE-77": "command-injection",
+    "CWE-918": "ssrf",
+}
+
+
+# Rule-token classes: canonical class -> the exact core token SET that identifies
+# it. Matching requires an EXACT set match of the surviving core tokens, so
+# ``path-traversal`` never matches ``open-redirect`` (no partial overlap).
+DEFAULT_RULE_TOKEN_CLASSES: dict[str, frozenset[str]] = {
+    "sql-injection": frozenset({"sql", "injection"}),
+    "xss": frozenset({"xss"}),
+    "path-traversal": frozenset({"path", "traversal"}),
+    "command-injection": frozenset({"command", "injection"}),
+    "ssrf": frozenset({"ssrf"}),
+    "open-redirect": frozenset({"open", "redirect"}),
+}
+
+
+# Stop-tokens stripped before exact-set comparison: language, tool, taxonomy and
+# generic-audit noise that does not identify the vuln class.
+DEFAULT_STOP_TOKENS: frozenset[str] = frozenset(
+    {
+        "audit",
+        "lang",
+        "language",
+        "security",
+        "py",
+        "python",
+        "js",
+        "javascript",
+        "ts",
+        "typescript",
+        "java",
+        "go",
+        "golang",
+        "rb",
+        "ruby",
+        "php",
+        "cs",
+        "csharp",
+        "ql",
+        "codeql",
+        "semgrep",
+        "external",
+        "cwe",
+        "rule",
+        "rules",
+        "best",
+        "practice",
+        "practices",
+        "problem",
+        "problems",
+        "warning",
+        "error",
+        "vuln",
+        "vulnerability",
+        "generic",
+    }
+)
+
+
+# Set-level synonym rewrites applied AFTER stop-token removal, so tool-specific
+# vuln-class idioms fold onto one canonical core-token set before the exact-set
+# comparison. Each rule rewrites an EXACTLY equal token set (not a subset) to
+# its canonical set. Example: CodeQL's ``path-injection`` ({path, injection})
+# means the same class as ``path-traversal`` ({path, traversal}).
+_SET_SYNONYMS: tuple[tuple[frozenset[str], frozenset[str]], ...] = (
+    (frozenset({"path", "injection"}), frozenset({"path", "traversal"})),
+)
+
+
+def _split_tokens(rule_id: str) -> list[str]:
+    """Tokenize a rule id on common separators, lower-cased."""
+    raw = re.split(r"[^a-zA-Z0-9]+", rule_id.strip().lower())
+    return [token for token in raw if token]
+
+
+def _core_tokens(
+    rule_id: str,
+    stop_tokens: frozenset[str],
+) -> frozenset[str]:
+    """Return the surviving core vuln-class tokens after stop-token removal."""
+    tokens = frozenset(t for t in _split_tokens(rule_id) if t not in stop_tokens)
+    # Fold known source/sink synonym sets (path-injection -> path-traversal).
+    for matched, canonical in _SET_SYNONYMS:
+        if tokens == matched:
+            return canonical
+    return tokens
+
+
+def extract_cwe_ids(values: Iterable[str]) -> tuple[str, ...]:
+    """Normalize arbitrary CWE-bearing tokens into ``CWE-NNN`` ids (sorted, unique)."""
+    found: set[str] = set()
+    for value in values:
+        match = _CWE_RE.search(str(value))
+        if match:
+            found.add(f"CWE-{int(match.group(1))}")
+    return tuple(sorted(found))
+
+
+@dataclass(frozen=True)
+class RuleClassNormalizer:
+    """Maps a ``rule_id`` and/or ``cwe_ids`` onto a canonical vuln class.
+
+    An EMPTY ``cwe_bridge`` together with ``enable_rule_token=False`` normalizes
+    nothing — every lookup misses. That is what makes the same-vuln-different-
+    rule.id fixture go red when normalization is disabled.
+    """
+
+    cwe_bridge: Mapping[str, str] = field(
+        default_factory=lambda: dict(DEFAULT_CWE_BRIDGE)
+    )
+    rule_token_classes: Mapping[str, frozenset[str]] = field(
+        default_factory=lambda: dict(DEFAULT_RULE_TOKEN_CLASSES)
+    )
+    stop_tokens: frozenset[str] = DEFAULT_STOP_TOKENS
+    enable_rule_token: bool = True
+
+    def cwe_class(self, cwe_ids: Iterable[str]) -> str | None:
+        """First bridgeable CWE -> its canonical class, else ``None``."""
+        for cwe in extract_cwe_ids(cwe_ids):
+            mapped = self.cwe_bridge.get(cwe)
+            if mapped is not None:
+                return mapped
+        return None
+
+    def has_bridgeable_cwe(self, cwe_ids: Iterable[str]) -> bool:
+        return self.cwe_class(cwe_ids) is not None
+
+    def rule_token_class(self, rule_id: str) -> frozenset[str] | None:
+        """Exact core-token set for ``rule_id``, or ``None``.
+
+        Returns the surviving core-token *set* (so callers can compare two sides
+        for EXACT equality — no partial overlap). Returns ``None`` when rule-token
+        normalization is disabled or no core tokens survive.
+        """
+        if not self.enable_rule_token:
+            return None
+        core = _core_tokens(rule_id, self.stop_tokens)
+        if not core:
+            return None
+        return core
+
+    def rule_token_canonical(self, rule_id: str) -> str | None:
+        """Canonical class name for a rule-token set, if it matches a known class."""
+        core = self.rule_token_class(rule_id)
+        if core is None:
+            return None
+        for canonical, token_set in self.rule_token_classes.items():
+            if token_set == core:
+                return canonical
+        return None
+
+
+# ---------------------------------------------------------------------------
+# Snapshot fixture loading (provenance fail-closed)
+# ---------------------------------------------------------------------------
+
+@dataclass(frozen=True)
+class CodeScanSnapshot:
+    """Loaded synthetic code-scanning snapshot fixture (provenance-guarded)."""
+
+    repo_full_name: str
+    source: str
+    alerts: list[CodeScanAlertRecord]
+    findings: list[VulnerabilityFinding]
+    fetched_at: str | None = None
+
+
+def load_codescan_snapshot(path: str | Path) -> CodeScanSnapshot:
+    """Load a synthetic code-scanning snapshot fixture.
+
+    Fails closed unless ``source`` is exactly ``synthetic`` — a real (or
+    unmarked) snapshot must never feed the autonomous harness.
+    """
+    data = json.loads(Path(path).read_text(encoding="utf-8"))
+    source = str(data.get("source", "")).strip().lower()
+    if source != "synthetic":
+        raise ValueError(
+            "code-scanning snapshot must carry provenance marker source: synthetic "
+            f"(got {data.get('source')!r}); refusing to load"
+        )
+
+    repo_full_name = str(data["repoFullName"])
+    fetched_at = data.get("fetchedAt")
+
+    alerts = [
+        _alert_from_dict(repo_full_name, item, fetched_at)
+        for item in data.get("alerts", [])
+    ]
+    findings = [
+        _finding_from_dict(item) for item in data.get("findings", [])
+    ]
+    return CodeScanSnapshot(
+        repo_full_name=repo_full_name,
+        source=source,
+        alerts=alerts,
+        findings=findings,
+        fetched_at=fetched_at,
+    )
+
+
+def _alert_from_dict(
+    repo_full_name: str, item: dict, fetched_at: str | None
+) -> CodeScanAlertRecord:
+    start = item.get("lineStart")
+    end = item.get("lineEnd")
+    return CodeScanAlertRecord(
+        repository=repo_full_name,
+        alert_number=int(item["alertNumber"]),
+        rule_id=str(item["ruleId"]),
+        security_severity_level=item.get("securitySeverityLevel"),
+        cwe_ids=extract_cwe_ids(item.get("cweIds", [])),
+        state=str(item.get("state", "open")),
+        dismissed_reason=item.get("dismissedReason"),
+        location_path=item.get("filePath"),
+        location_start_line=int(start) if start is not None else None,
+        location_end_line=int(end) if end is not None else None,
+        fetched_at=fetched_at,
+    )
+
+
+def _finding_from_dict(item: dict) -> VulnerabilityFinding:
+    file_path = str(item["filePath"])
+    line_start = int(item["lineStart"])
+    rule_id = str(item["ruleId"])
+    source_tool = str(item.get("sourceTool", "semgrep"))
+    message = str(item.get("message", "synthetic finding"))
+    finding_id = compute_vulnerability_finding_id(
+        source_tool=source_tool,
+        rule_id=rule_id,
+        partial_fingerprints=None,
+        file_path=file_path,
+        line_start=line_start,
+        message=message,
+    )
+    return VulnerabilityFinding(
+        finding_id=finding_id,
+        rule_id=rule_id,
+        message=message,
+        primary_location=VulnerabilityLocation(
+            file_path=file_path,
+            line_start=line_start,
+            line_end=item.get("lineEnd"),
+        ),
+        source_tool=source_tool,
+        cwe_ids=extract_cwe_ids(item.get("cweIds", [])),
+    )
+
+
+__all__ = [
+    "CodeScanAlertRecord",
+    "CodeScanSnapshot",
+    "DEFAULT_CWE_BRIDGE",
+    "DEFAULT_RULE_TOKEN_CLASSES",
+    "DEFAULT_STOP_TOKENS",
+    "RuleClassNormalizer",
+    "extract_cwe_ids",
+    "load_codescan_snapshot",
+]
diff --git a/src/security_scanner/core/vulnerability/codescan_parity.py b/src/security_scanner/core/vulnerability/codescan_parity.py
new file mode 100644
index 0000000..36e2068
--- /dev/null
+++ b/src/security_scanner/core/vulnerability/codescan_parity.py
@@ -0,0 +1,507 @@
+"""GHAS code-scanning alert -> VulnerabilityEvaluationKey parity adapter (M1).
+
+This is the 1:1 transfer of the proven secret-track parity matcher
+(``baseline/ghas_api/parity.py``) to the code-vulnerability domain. It turns
+GHAS code-scanning alerts (:class:`CodeScanAlertRecord`) and our own SARIF
+findings (:class:`VulnerabilityFinding`) into the
+``VulnerabilityExpectedFinding`` / ``VulnerabilityEvaluationKey`` shape that
+``core.vulnerability.evaluation`` already understands, so the precision/recall
+*formula* and gate *threshold* judgement are reused VERBATIM — no new metric code.
+
+The adapter owns exactly the responsibilities the metrics layer cannot:
+
+(a) **rule_id/CWE -> canonical vuln class** via
+    :class:`~security_scanner.core.vulnerability.codescan.RuleClassNormalizer`,
+    with priority CWE-intersection (by-cwe) > rule-token (by-rule-token) >
+    unmatched, so a CodeQL/Semgrep token-mismatch no longer splits one vuln in
+    two.
+(b) **state-aware truth filter** (design §4.2 denominator formula) — recall
+    denominator = alerts in ``state ∈ {open, fixed}`` only;
+    ``dismissed_reason ∈ {false positive, used in tests}`` is an explicit
+    FP-oracle (precision penalty when our finding lands there, NOT in the recall
+    denominator); ``won't fix`` is TP-non-blocking (excluded from the recall
+    denominator, no precision penalty).
+(c) **line-window matching** — a finding matches an alert when their line
+    intervals overlap or are within ``±N`` lines (TRUE window, no
+    ``start_line//N`` quantization). Because this is a fuzzy join it cannot be
+    expressed as exact-key equality, so the adapter resolves the TP/FP/FN
+    pairing itself (1:1 greedy via ``_AlertSlot.consumed``) and then hands
+    canonical keys to ``evaluate_vulnerability_findings`` for the headline
+    numbers.
+
+This module is a pure function over its inputs: it performs no network calls and
+has no durable-store coupling.
+"""
+
+from __future__ import annotations
+
+from collections.abc import Iterable, Sequence
+from dataclasses import dataclass
+
+from security_scanner.core.vulnerability.codescan import (
+    CodeScanAlertRecord,
+    RuleClassNormalizer,
+)
+from security_scanner.core.vulnerability.evaluation import (
+    VulnerabilityEvaluationResult,
+    VulnerabilityExpectedFinding,
+    evaluate_vulnerability_findings,
+)
+from security_scanner.core.vulnerability.model import (
+    VulnerabilityFinding,
+    VulnerabilityLocation,
+)
+
+# Positive-truth states for the recall denominator (design §4.2 line 330-333).
+CODESCAN_POSITIVE_TRUTH_STATES: tuple[str, ...] = ("open", "fixed")
+# Dismissed reasons that are an explicit FP-oracle (precision penalty).
+CODESCAN_FP_ORACLE_REASONS: tuple[str, ...] = ("false positive", "used in tests")
+# Dismissed reason that is TP-non-blocking (excluded from recall, no penalty).
+CODESCAN_NON_BLOCKING_REASON: str = "won't fix"
+
+
+@dataclass(frozen=True)
+class ParityConfig:
+    """Tunable parity-matching policy.
+
+    ``line_window`` is the ``±N`` window (interval overlap always matches
+    regardless of N). It is FIXED to a concrete value in M1 (closes open
+    question VD-07) and pinned by the fixtures. ``positive_truth_states``
+    parameterizes the state-aware truth filter so a test can disable it (and
+    prove the resulting recall regression).
+    """
+
+    line_window: int = 2
+    positive_truth_states: tuple[str, ...] = CODESCAN_POSITIVE_TRUTH_STATES
+
+
+@dataclass(frozen=True)
+class CodeScanParityResult:
+    """Per-repo code-scanning parity outcome.
+
+    ``detection`` carries the reused metrics-layer result (so ``.precision`` /
+    ``.recall`` come straight from ``core.vulnerability.evaluation``). The tier
+    counts and FP-oracle counters are parity-specific buckets.
+    """
+
+    repository: str
+    detection: VulnerabilityEvaluationResult
+    matched_by_cwe: int
+    matched_by_rule_token: int
+    unmatched: int
+    dismissed_fp_hit: int
+    cwe_deficit_rate: float
+    rule_token_rescue_rate: float
+
+    @property
+    def precision(self) -> float:
+        return self.detection.precision
+
+    @property
+    def recall(self) -> float:
+        return self.detection.recall
+
+
+@dataclass(frozen=True)
+class MacroCodeScanParityResult:
+    """Macro (per-repo averaged) code-scanning parity summary.
+
+    The vuln-domain analog of the secret-track ``MacroParityResult``. This is
+    aggregation ONLY: ``macro_precision`` / ``macro_recall`` are the unweighted
+    average of the per-repo numbers that already came from the metrics layer
+    (``CodeScanParityResult.detection.precision`` / ``.recall``). It introduces
+    NO new precision/recall formula — the tier counters and FP-oracle totals are
+    summed buckets carried forward from the per-repo matcher.
+    """
+
+    repo_count: int
+    macro_precision: float
+    macro_recall: float
+    total_matched_by_cwe: int
+    total_matched_by_rule_token: int
+    total_unmatched: int
+    total_dismissed_fp_hit: int
+    macro_cwe_deficit_rate: float
+    macro_rule_token_rescue_rate: float
+
+
+def aggregate_codescan_parity(
+    results: Iterable[CodeScanParityResult],
+) -> MacroCodeScanParityResult:
+    """Macro-average per-repo code-scanning precision/recall (SLO consumes macro).
+
+    Mirrors ``baseline.ghas_api.parity.aggregate_repo_parity``: the per-repo
+    precision/recall come straight from the metrics layer, so this is pure
+    averaging plus summed tier/FP-oracle buckets — not a TP/(TP+FP) re-derivation.
+    """
+    results = list(results)
+    if not results:
+        return MacroCodeScanParityResult(
+            repo_count=0,
+            macro_precision=1.0,
+            macro_recall=1.0,
+            total_matched_by_cwe=0,
+            total_matched_by_rule_token=0,
+            total_unmatched=0,
+            total_dismissed_fp_hit=0,
+            macro_cwe_deficit_rate=0.0,
+            macro_rule_token_rescue_rate=0.0,
+        )
+    n = len(results)
+    return MacroCodeScanParityResult(
+        repo_count=n,
+        macro_precision=sum(r.detection.precision for r in results) / n,
+        macro_recall=sum(r.detection.recall for r in results) / n,
+        total_matched_by_cwe=sum(r.matched_by_cwe for r in results),
+        total_matched_by_rule_token=sum(r.matched_by_rule_token for r in results),
+        total_unmatched=sum(r.unmatched for r in results),
+        total_dismissed_fp_hit=sum(r.dismissed_fp_hit for r in results),
+        macro_cwe_deficit_rate=sum(r.cwe_deficit_rate for r in results) / n,
+        macro_rule_token_rescue_rate=sum(
+            r.rule_token_rescue_rate for r in results
+        )
+        / n,
+    )
+
+
+# ---------------------------------------------------------------------------
+# Truth classification (state-aware denominator)
+# ---------------------------------------------------------------------------
+
+def _norm(value: str | None) -> str:
+    return (value or "").strip().lower()
+
+
+def _is_positive_truth(alert: CodeScanAlertRecord, config: ParityConfig) -> bool:
+    return _norm(alert.state) in config.positive_truth_states
+
+
+def _is_fp_oracle(alert: CodeScanAlertRecord) -> bool:
+    return _norm(alert.dismissed_reason) in CODESCAN_FP_ORACLE_REASONS
+
+
+# ---------------------------------------------------------------------------
+# Core fuzzy join
+# ---------------------------------------------------------------------------
+
+@dataclass
+class _AlertSlot:
+    record: CodeScanAlertRecord
+    cwe_class: str | None
+    rule_token: frozenset[str] | None
+    consumed: bool = False
+
+
+def _alert_lines(alert: CodeScanAlertRecord) -> tuple[int, int]:
+    start = alert.location_start_line
+    if start is None:
+        return (0, 0)
+    end = alert.location_end_line if alert.location_end_line is not None else start
+    lo, hi = (start, end)
+    if hi < lo:
+        lo, hi = hi, lo
+    return (lo, hi)
+
+
+def _lines_match(
+    finding_line: int,
+    alert_interval: tuple[int, int],
+    window: int,
+) -> bool:
+    lo, hi = alert_interval
+    # Interval overlap (finding line inside the alert span).
+    if lo <= finding_line <= hi:
+        return True
+    # ±N window around the nearest interval endpoint.
+    nearest = lo if finding_line < lo else hi
+    return abs(finding_line - nearest) <= window
+
+
+def _slot_lines_match(
+    slot: _AlertSlot, finding: VulnerabilityFinding, window: int
+) -> bool:
+    finding_line = finding.primary_location.line_start or 0
+    return _lines_match(finding_line, _alert_lines(slot.record), window)
+
+
+def _rule_class_match(
+    slot: _AlertSlot,
+    finding_cwe_class: str | None,
+    finding_token: frozenset[str] | None,
+) -> str | None:
+    """Return the match tier (``by-cwe`` / ``by-rule-token``) or ``None``.
+
+    Priority (design §4.2): by-cwe first (both sides share a canonical CWE
+    class); else by-rule-token, only when NEITHER side has a bridgeable CWE
+    class (EXACT core-token set equality, no partial overlap); else no match.
+    """
+    if (
+        slot.cwe_class is not None
+        and finding_cwe_class is not None
+        and slot.cwe_class == finding_cwe_class
+    ):
+        return "by-cwe"
+    if (
+        finding_cwe_class is None
+        and slot.cwe_class is None
+        and slot.rule_token is not None
+        and finding_token is not None
+        and slot.rule_token == finding_token
+    ):
+        return "by-rule-token"
+    return None
+
+
+def compare_codescan_alerts_with_findings(
+    *,
+    repository: str,
+    alerts: Sequence[CodeScanAlertRecord],
+    findings: Sequence[VulnerabilityFinding],
+    normalizer: RuleClassNormalizer,
+    config: ParityConfig | None = None,
+) -> CodeScanParityResult:
+    """Compute per-repo code-scanning parity for one GHAS-enabled repo.
+
+    Returns the metrics-layer ``VulnerabilityEvaluationResult`` (so
+    precision/recall come straight from ``core.vulnerability.evaluation``) plus
+    the parity-specific tier counts and FP-oracle counters.
+    """
+    config = config or ParityConfig()
+
+    # 1. State-aware truth filter: positive-truth alerts (recall denominator) are
+    #    open + fixed only. Dismissed-FP-oracle alerts are tracked separately and
+    #    won't-fix alerts are simply excluded.
+    truth_alerts = [
+        a
+        for a in alerts
+        if a.location_path is not None
+        and a.location_start_line is not None
+        and _is_positive_truth(a, config)
+    ]
+    fp_oracle_alerts = [
+        a
+        for a in alerts
+        if a.location_path is not None
+        and a.location_start_line is not None
+        and not _is_positive_truth(a, config)
+        and _is_fp_oracle(a)
+    ]
+
+    truth_slots = [
+        _AlertSlot(
+            record=a,
+            cwe_class=normalizer.cwe_class(a.cwe_ids),
+            rule_token=normalizer.rule_token_class(a.rule_id),
+        )
+        for a in truth_alerts
+    ]
+    fp_oracle_slots = [
+        _AlertSlot(
+            record=a,
+            cwe_class=normalizer.cwe_class(a.cwe_ids),
+            rule_token=normalizer.rule_token_class(a.rule_id),
+        )
+        for a in fp_oracle_alerts
+    ]
+
+    # CWE-deficit meta-metric over the positive-truth alert population.
+    cwe_deficit_rate = _deficit_rate(
+        normalizer.has_bridgeable_cwe(a.cwe_ids) for a in truth_alerts
+    )
+
+    expected: list[VulnerabilityExpectedFinding] = []
+    actual: list[VulnerabilityFinding] = []
+    matched_by_cwe = 0
+    matched_by_rule_token = 0
+    unmatched = 0
+    dismissed_fp_hit = 0
+    match_index = 0
+
+    # 2. Fuzzy join: each finding tries to claim one unconsumed positive-truth
+    #    alert in the same file with a matching rule-class and a tolerated line.
+    for finding in findings:
+        finding_cwe_class = normalizer.cwe_class(finding.cwe_ids)
+        finding_token = normalizer.rule_token_class(finding.rule_id)
+
+        slot, tier = _find_matching_slot(
+            finding, finding_cwe_class, finding_token, truth_slots, config
+        )
+        if slot is not None:
+            slot.consumed = True
+            match_index += 1
+            if tier == "by-cwe":
+                matched_by_cwe += 1
+            else:
+                matched_by_rule_token += 1
+            shared_key = _matched_key(repository, match_index, tier)
+            expected.append(shared_key)
+            actual.append(_canonical_finding(finding, shared_key))
+            continue
+
+        # Not a positive-truth match. Is this finding sitting on a dismissed
+        # false-positive / used-in-tests location? That is an explicit FP-oracle
+        # hit: a precision penalty AND a false positive.
+        oracle = _find_fp_oracle_slot(
+            finding, finding_cwe_class, finding_token, fp_oracle_slots, config
+        )
+        if oracle is not None:
+            oracle.consumed = True
+            dismissed_fp_hit += 1
+            actual.append(_local_only_finding(repository, finding, match_index))
+            match_index += 1
+            unmatched += 1
+            continue
+
+        # Pure local-only finding -> false positive.
+        actual.append(_local_only_finding(repository, finding, match_index))
+        match_index += 1
+        unmatched += 1
+
+    # 3. Unconsumed positive-truth alerts -> false negatives (ghas-only truth).
+    for slot in truth_slots:
+        if not slot.consumed:
+            expected.append(_ghas_only_key(slot.record))
+
+    # by-rule-token rescue rate = fraction of matches that needed the rule-token
+    # fallback (i.e. could not be resolved by the CWE bridge).
+    total_matched = matched_by_cwe + matched_by_rule_token
+    rule_token_rescue_rate = _rate(matched_by_rule_token, total_matched)
+
+    detection = evaluate_vulnerability_findings(expected, actual)
+
+    return CodeScanParityResult(
+        repository=repository,
+        detection=detection,
+        matched_by_cwe=matched_by_cwe,
+        matched_by_rule_token=matched_by_rule_token,
+        unmatched=unmatched,
+        dismissed_fp_hit=dismissed_fp_hit,
+        cwe_deficit_rate=cwe_deficit_rate,
+        rule_token_rescue_rate=rule_token_rescue_rate,
+    )
+
+
+def _find_matching_slot(
+    finding: VulnerabilityFinding,
+    finding_cwe_class: str | None,
+    finding_token: frozenset[str] | None,
+    slots: list[_AlertSlot],
+    config: ParityConfig,
+) -> tuple[_AlertSlot | None, str | None]:
+    """First unconsumed positive-truth slot in the same file/window/rule-class."""
+    for slot in slots:
+        if slot.consumed:
+            continue
+        if slot.record.location_path != finding.primary_location.file_path:
+            continue
+        if not _slot_lines_match(slot, finding, config.line_window):
+            continue
+        tier = _rule_class_match(slot, finding_cwe_class, finding_token)
+        if tier is not None:
+            return slot, tier
+    return None, None
+
+
+def _find_fp_oracle_slot(
+    finding: VulnerabilityFinding,
+    finding_cwe_class: str | None,
+    finding_token: frozenset[str] | None,
+    slots: list[_AlertSlot],
+    config: ParityConfig,
+) -> _AlertSlot | None:
+    """First unconsumed dismissed-FP-oracle slot the finding lands on."""
+    for slot in slots:
+        if slot.consumed:
+            continue
+        if slot.record.location_path != finding.primary_location.file_path:
+            continue
+        if not _slot_lines_match(slot, finding, config.line_window):
+            continue
+        if _rule_class_match(slot, finding_cwe_class, finding_token) is not None:
+            return slot
+    return None
+
+
+# ---------------------------------------------------------------------------
+# Meta-metric helpers
+# ---------------------------------------------------------------------------
+
+def _deficit_rate(flags: Iterable[bool]) -> float:
+    """Fraction lacking a bridgeable CWE over an iterable of has-CWE booleans."""
+    items = list(flags)
+    if not items:
+        return 0.0
+    deficient = sum(1 for has_cwe in items if not has_cwe)
+    return deficient / len(items)
+
+
+def _rate(part: int, total: int) -> float:
+    return part / total if total else 0.0
+
+
+# ---------------------------------------------------------------------------
+# Canonical-key synthesis (kept stable so evaluation.py keys line up 1:1)
+# ---------------------------------------------------------------------------
+
+def _matched_key(
+    repository: str, index: int, tier: str | None
+) -> VulnerabilityExpectedFinding:
+    return VulnerabilityExpectedFinding(
+        file_path=f"__matched__/{index}",
+        line_start=index,
+        rule_id=f"__matched__:{tier}",
+    )
+
+
+def _canonical_finding(
+    finding: VulnerabilityFinding, shared_key: VulnerabilityExpectedFinding
+) -> VulnerabilityFinding:
+    """A VulnerabilityFinding whose EvaluationKey equals ``shared_key`` (TP)."""
+    return VulnerabilityFinding(
+        finding_id=finding.finding_id,
+        rule_id=shared_key.rule_id,
+        message=finding.message,
+        primary_location=VulnerabilityLocation(
+            file_path=shared_key.file_path,
+            line_start=shared_key.line_start,
+        ),
+        source_tool=finding.source_tool,
+        cwe_ids=finding.cwe_ids,
+    )
+
+
+def _ghas_only_key(alert: CodeScanAlertRecord) -> VulnerabilityExpectedFinding:
+    return VulnerabilityExpectedFinding(
+        file_path=f"__ghas_only__/{alert.location_path}",
+        line_start=alert.location_start_line or 0,
+        rule_id=f"ghas:{alert.rule_id}",
+    )
+
+
+def _local_only_finding(
+    repository: str, finding: VulnerabilityFinding, index: int
+) -> VulnerabilityFinding:
+    """A finding with a guaranteed-unique key so it lands as a false positive."""
+    return VulnerabilityFinding(
+        finding_id=finding.finding_id,
+        rule_id=f"__local_only__/{index}/{finding.rule_id}",
+        message=finding.message,
+        primary_location=VulnerabilityLocation(
+            file_path=f"__local_only__/{index}/{finding.primary_location.file_path}",
+            line_start=finding.primary_location.line_start or 0,
+        ),
+        source_tool=finding.source_tool,
+        cwe_ids=finding.cwe_ids,
+    )
+
+
+__all__ = [
+    "CODESCAN_POSITIVE_TRUTH_STATES",
+    "CODESCAN_FP_ORACLE_REASONS",
+    "CODESCAN_NON_BLOCKING_REASON",
+    "ParityConfig",
+    "CodeScanParityResult",
+    "MacroCodeScanParityResult",
+    "aggregate_codescan_parity",
+    "compare_codescan_alerts_with_findings",
+]
diff --git a/src/security_scanner/core/vulnerability/evaluation.py b/src/security_scanner/core/vulnerability/evaluation.py
index e171982..c5f6b2d 100644
--- a/src/security_scanner/core/vulnerability/evaluation.py
+++ b/src/security_scanner/core/vulnerability/evaluation.py
@@ -1,13 +1,35 @@
-"""Synthetic corpus evaluation for code vulnerability findings."""
+"""Synthetic corpus evaluation for code vulnerability findings.
+
+Two matching semantics coexist here, by DESIGN (design §E lines 326-329):
+
+- The ORIGINAL exact-key path — :class:`VulnerabilityEvaluationKey`
+  ``(file_path, line_start, rule_id)`` full equality — used by
+  :func:`evaluate_vulnerability_findings`. This is the legacy naive matcher and
+  its behavior is FROZEN: every existing caller / test keeps the same results.
+- The M2 normalization-aware path,
+  :func:`evaluate_vulnerability_findings_normalized`, reuses the M1
+  :class:`~security_scanner.core.vulnerability.codescan.RuleClassNormalizer`
+  (CWE-bridge / rule-token canonicalization) and a line-window so a CodeQL-style
+  and a Semgrep-style ``ruleId`` for the SAME vuln class match. It is the SAME
+  rule-class + line-window semantics the M1 parity matcher
+  (``codescan_parity.py``) uses, satisfying the VFR8 consistency condition.
+
+The boundary is deliberate: the normalized path is a NEW function. It does NOT
+mutate :func:`evaluate_vulnerability_findings` or the exact-key. Like the M1
+matcher it pre-normalizes both sides into synthetic canonical keys and then hands
+them to :func:`evaluate_vulnerability_findings` for the headline precision/recall
+— so there is ZERO new precision/recall formula.
+"""
 
 from __future__ import annotations
 
 import json
 from collections import Counter
-from collections.abc import Iterable
+from collections.abc import Iterable, Sequence
 from dataclasses import dataclass
 from pathlib import Path
 
+from security_scanner.core.vulnerability.codescan import RuleClassNormalizer
 from security_scanner.core.vulnerability.model import VulnerabilityFinding
 
 
@@ -204,3 +226,269 @@ def _append_key_section(
     lines.append(title + ":")
     for key in keys:
         lines.append(f"  - {key.display()}")
+
+
+# ---------------------------------------------------------------------------
+# M2 normalization-aware path (design §E) — NEW, additive. Does NOT change the
+# exact-key behavior of evaluate_vulnerability_findings above.
+# ---------------------------------------------------------------------------
+
+# Concrete line-window N, pinned to the M1 parity matcher's value (VFR8
+# consistency: the synthetic regression gate and the parity matcher share the
+# same line-window). See codescan_parity.ParityConfig.line_window.
+NORMALIZED_LINE_WINDOW: int = 2
+
+
+@dataclass(frozen=True)
+class NormalizedExpectedFinding:
+    """Expected finding carrying enough metadata to derive its canonical class.
+
+    Unlike :class:`VulnerabilityExpectedFinding` (which pins an exact ``rule_id``)
+    this carries the raw ``rule_id`` / ``cwe_ids`` so the SAME
+    :class:`RuleClassNormalizer` used by the M1 parity matcher derives the
+    canonical class on the expected side too.
+    """
+
+    file_path: str
+    line_start: int
+    rule_id: str
+    cwe_ids: tuple[str, ...] = ()
+
+    @classmethod
+    def from_dict(cls, data: dict) -> NormalizedExpectedFinding:
+        return cls(
+            file_path=str(data["filePath"]),
+            line_start=int(data["lineStart"]),
+            rule_id=str(data["ruleId"]),
+            cwe_ids=tuple(str(item) for item in data.get("cweIds", [])),
+        )
+
+
+def load_vulnerability_corpus_normalized(
+    path: str | Path,
+) -> tuple[list[NormalizedExpectedFinding], list[VulnerabilityFinding]]:
+    """Load the M2 5-class synthetic corpus snapshot (provenance fail-closed).
+
+    Returns ``(expected, actual)`` ready for
+    :func:`evaluate_vulnerability_findings_normalized`. Refuses to load a snapshot
+    whose ``source`` is not exactly ``synthetic`` so a real (or unmarked) corpus
+    can never feed the autonomous regression gate.
+    """
+    data = json.loads(Path(path).read_text(encoding="utf-8"))
+    source = str(data.get("source", "")).strip().lower()
+    if source != "synthetic":
+        raise ValueError(
+            "vulnerability corpus snapshot must carry provenance marker "
+            f"source: synthetic (got {data.get('source')!r}); refusing to load"
+        )
+    expected = [
+        NormalizedExpectedFinding.from_dict(item)
+        for item in data.get("expectedFindings", [])
+    ]
+    actual = [
+        _normalized_finding_from_dict(item)
+        for item in data.get("actualFindings", [])
+    ]
+    return expected, actual
+
+
+def _normalized_finding_from_dict(item: dict) -> VulnerabilityFinding:
+    from security_scanner.core.vulnerability.model import (
+        VulnerabilityLocation,
+        compute_vulnerability_finding_id,
+    )
+
+    file_path = str(item["filePath"])
+    line_start = int(item["lineStart"])
+    rule_id = str(item["ruleId"])
+    source_tool = str(item.get("sourceTool", "semgrep"))
+    finding_id = compute_vulnerability_finding_id(
+        source_tool=source_tool,
+        rule_id=rule_id,
+        partial_fingerprints=None,
+        file_path=file_path,
+        line_start=line_start,
+        message="synthetic finding",
+    )
+    return VulnerabilityFinding(
+        finding_id=finding_id,
+        rule_id=rule_id,
+        message="synthetic finding",
+        primary_location=VulnerabilityLocation(
+            file_path=file_path,
+            line_start=line_start,
+            line_end=item.get("lineEnd"),
+        ),
+        source_tool=source_tool,
+        cwe_ids=tuple(str(c) for c in item.get("cweIds", [])),
+    )
+
+
+def _canonical_class(
+    normalizer: RuleClassNormalizer,
+    *,
+    cwe_ids: Iterable[str],
+    rule_id: str,
+) -> str | None:
+    """Canonical vuln class via the shared M1 normalizer (CWE bridge > rule-token)."""
+    cwe_class = normalizer.cwe_class(cwe_ids)
+    if cwe_class is not None:
+        return cwe_class
+    return normalizer.rule_token_canonical(rule_id)
+
+
+@dataclass
+class _ExpectedSlot:
+    expected: NormalizedExpectedFinding | VulnerabilityExpectedFinding
+    vuln_class: str | None
+    consumed: bool = False
+
+
+def _line_window_match(a: int, b: int, window: int) -> bool:
+    return abs(a - b) <= window
+
+
+def evaluate_vulnerability_findings_normalized(
+    expected_findings: Sequence[NormalizedExpectedFinding | VulnerabilityExpectedFinding],
+    actual_findings: Sequence[VulnerabilityFinding],
+    *,
+    normalizer: RuleClassNormalizer | None = None,
+    line_window: int = NORMALIZED_LINE_WINDOW,
+) -> VulnerabilityEvaluationResult:
+    """Normalization-aware evaluation (design §E) reusing M1's normalizer.
+
+    Accepts either :class:`NormalizedExpectedFinding` (carries ``cwe_ids`` so the
+    CWE bridge can fire) or a legacy :class:`VulnerabilityExpectedFinding` (no
+    ``cwe_ids`` -> class derived from ``rule_id`` tokens only).
+
+    Like the M1 parity matcher this performs a fuzzy join on
+    ``(file_path, canonical vuln class, line-window)`` — NOT exact-key equality —
+    using the shared :class:`RuleClassNormalizer`. It then synthesizes canonical
+    keys for each TP / FP / FN and hands them to
+    :func:`evaluate_vulnerability_findings` so the precision/recall FORMULA is
+    reused verbatim (zero new metric code). The existing exact-key path is
+    untouched.
+
+    A finding matches an expected entry when they share a file, a non-``None``
+    canonical class, and their lines are within ``±line_window``. Matching is 1:1
+    greedy (each expected slot consumed once).
+    """
+    normalizer = normalizer or RuleClassNormalizer()
+    slots = [
+        _ExpectedSlot(
+            expected=item,
+            vuln_class=_canonical_class(
+                normalizer,
+                cwe_ids=getattr(item, "cwe_ids", ()),
+                rule_id=item.rule_id,
+            ),
+        )
+        for item in expected_findings
+    ]
+
+    matched: list[VulnerabilityExpectedFinding] = []
+    actual_keys: list[VulnerabilityFinding] = []
+    match_index = 0
+    fp_index = 0
+
+    for finding in actual_findings:
+        finding_class = _canonical_class(
+            normalizer,
+            cwe_ids=finding.cwe_ids,
+            rule_id=finding.rule_id,
+        )
+        slot = _find_expected_slot(finding, finding_class, slots, line_window)
+        if slot is not None:
+            slot.consumed = True
+            match_index += 1
+            shared = _normalized_matched_key(match_index)
+            matched.append(shared)
+            actual_keys.append(_normalized_canonical_finding(finding, shared))
+        else:
+            fp_index += 1
+            actual_keys.append(_normalized_local_only_finding(finding, fp_index))
+
+    expected_keys = list(matched)
+    for slot in slots:
+        if not slot.consumed:
+            expected_keys.append(_normalized_ghas_only_key(slot.expected))
+
+    return evaluate_vulnerability_findings(expected_keys, actual_keys)
+
+
+def _find_expected_slot(
+    finding: VulnerabilityFinding,
+    finding_class: str | None,
+    slots: list[_ExpectedSlot],
+    line_window: int,
+) -> _ExpectedSlot | None:
+    if finding_class is None:
+        return None
+    finding_line = finding.primary_location.line_start or 0
+    for slot in slots:
+        if slot.consumed:
+            continue
+        if slot.vuln_class is None or slot.vuln_class != finding_class:
+            continue
+        if slot.expected.file_path != finding.primary_location.file_path:
+            continue
+        if not _line_window_match(
+            finding_line, slot.expected.line_start, line_window
+        ):
+            continue
+        return slot
+    return None
+
+
+def _normalized_matched_key(index: int) -> VulnerabilityExpectedFinding:
+    return VulnerabilityExpectedFinding(
+        file_path=f"__matched__/{index}",
+        line_start=index,
+        rule_id="__matched__",
+    )
+
+
+def _normalized_canonical_finding(
+    finding: VulnerabilityFinding, shared: VulnerabilityExpectedFinding
+) -> VulnerabilityFinding:
+    from security_scanner.core.vulnerability.model import VulnerabilityLocation
+
+    return VulnerabilityFinding(
+        finding_id=finding.finding_id,
+        rule_id=shared.rule_id,
+        message=finding.message,
+        primary_location=VulnerabilityLocation(
+            file_path=shared.file_path,
+            line_start=shared.line_start,
+        ),
+        source_tool=finding.source_tool,
+        cwe_ids=finding.cwe_ids,
+    )
+
+
+def _normalized_local_only_finding(
+    finding: VulnerabilityFinding, index: int
+) -> VulnerabilityFinding:
+    from security_scanner.core.vulnerability.model import VulnerabilityLocation
+
+    return VulnerabilityFinding(
+        finding_id=finding.finding_id,
+        rule_id=f"__local_only__/{index}",
+        message=finding.message,
+        primary_location=VulnerabilityLocation(
+            file_path=f"__local_only__/{index}",
+            line_start=index,
+        ),
+        source_tool=finding.source_tool,
+        cwe_ids=finding.cwe_ids,
+    )
+
+
+def _normalized_ghas_only_key(
+    expected: NormalizedExpectedFinding | VulnerabilityExpectedFinding,
+) -> VulnerabilityExpectedFinding:
+    return VulnerabilityExpectedFinding(
+        file_path=f"__expected_only__/{expected.file_path}",
+        line_start=expected.line_start,
+        rule_id=f"expected:{expected.rule_id}",
+    )
diff --git a/src/security_scanner/core/vulnerability/gate.py b/src/security_scanner/core/vulnerability/gate.py
index 58d7edd..fe93a77 100644
--- a/src/security_scanner/core/vulnerability/gate.py
+++ b/src/security_scanner/core/vulnerability/gate.py
@@ -1,9 +1,37 @@
-"""Gate policy for code vulnerability findings."""
+"""Gate policy for code vulnerability findings.
+
+M2 inline cheap FP-suppression tier (design §2 / §K) lives here as ADDITIVE,
+default-OFF opt-in signals. The crux of design §K is the default-on vs gated
+boundary:
+
+- **default-on** = ONLY deterministic, metadata-only changes that provably cannot
+  flip an already-blocking finding to non-blocking for the EXISTING default
+  thresholds. The existing gate already non-blocks INFO/LOW severity and
+  UNKNOWN/LOW precision; that is the entire default-on surface and M2 adds NOTHING
+  to it. Any new suppression that could change which findings block is gated.
+- **gated/opt-in** = the two new ``VulnerabilityGateThresholds`` flags below, both
+  DEFAULT OFF. With both off, :func:`evaluate_vulnerability_gate_policy` behaves
+  EXACTLY as before (same blocking set, same reason string) — the
+  default-invariance canary in ``tests/test_vulnerability_gate_tier.py`` pins this.
+
+The two opt-in signals (V-Q3: metadata-only, no validity-check analogue, no LLM,
+no network):
+
+- ``require_trace`` — a finding with ``code_flow_count == 0`` has NO data-flow
+  reachability evidence, so it is treated as non-blocking when the flag is ON. A
+  finding WITH a trace keeps blocking.
+- ``suppress_rules`` — a frozenset of canonical vuln *classes* (e.g.
+  ``"sql-injection"``) treated as non-blocking. Rule-class normalization REUSES
+  the M1 :class:`~security_scanner.core.vulnerability.codescan.RuleClassNormalizer`
+  (no duplicated normalizer here), so a CodeQL-style and a Semgrep-style rule.id
+  for the same class are suppressed together.
+"""
 
 from __future__ import annotations
 
-from dataclasses import dataclass
+from dataclasses import dataclass, field
 
+from security_scanner.core.vulnerability.codescan import RuleClassNormalizer
 from security_scanner.core.vulnerability.model import VulnerabilityFinding
 
 _SEVERITY_RANK = {
@@ -27,6 +55,10 @@ class VulnerabilityGateThresholds:
     max_blocking: int = 0
     severity_min: str = "HIGH"
     precision_min: str = "HIGH"
+    # --- M2 inline cheap tier: OPT-IN signals, DEFAULT OFF -------------------
+    # When both are at their defaults the gate behaves exactly as before.
+    require_trace: bool = False
+    suppress_rules: frozenset[str] = field(default_factory=frozenset)
 
 
 @dataclass(frozen=True)
@@ -41,10 +73,17 @@ def evaluate_vulnerability_gate_policy(
     findings: list[VulnerabilityFinding],
     thresholds: VulnerabilityGateThresholds | None = None,
 ) -> VulnerabilityGateResult:
-    """Evaluate code-vuln findings using severity + precision thresholds."""
+    """Evaluate code-vuln findings using severity + precision thresholds.
+
+    The M2 inline tier (``require_trace`` / ``suppress_rules``) only ever REMOVES
+    findings from the blocking set, and only when its opt-in flag is set. With
+    both flags at their defaults the blocking set and reason string are identical
+    to the pre-M2 behavior.
+    """
     policy = thresholds or VulnerabilityGateThresholds()
     severity_min = _normalize_severity(policy.severity_min)
     precision_min = _normalize_precision(policy.precision_min)
+    normalizer = RuleClassNormalizer() if policy.suppress_rules else None
     blocking = [
         finding
         for finding in findings
@@ -52,6 +91,8 @@ def evaluate_vulnerability_gate_policy(
             finding,
             severity_min=severity_min,
             precision_min=precision_min,
+            policy=policy,
+            normalizer=normalizer,
         )
     ]
     blocking_count = len(blocking)
@@ -76,14 +117,45 @@ def _is_blocking(
     *,
     severity_min: str,
     precision_min: str,
+    policy: VulnerabilityGateThresholds,
+    normalizer: RuleClassNormalizer | None,
 ) -> bool:
     if finding.triage_state == "FALSE_POSITIVE":
         return False
-    return (
+    base_blocking = (
         _SEVERITY_RANK.get(finding.severity, 0) >= _SEVERITY_RANK.get(severity_min, 3)
         and _PRECISION_RANK.get(finding.precision, 0)
         >= _PRECISION_RANK.get(precision_min, 3)
     )
+    if not base_blocking:
+        return False
+    # --- M2 inline cheap tier (opt-in suppression of an OTHERWISE-blocking
+    #     finding). Each branch is gated by its flag, so with defaults nothing
+    #     below changes the result.
+    if policy.require_trace and finding.code_flow_count == 0:
+        return False
+    if normalizer is not None and _rule_class_suppressed(
+        finding, policy.suppress_rules, normalizer
+    ):
+        return False
+    return True
+
+
+def _rule_class_suppressed(
+    finding: VulnerabilityFinding,
+    suppress_rules: frozenset[str],
+    normalizer: RuleClassNormalizer,
+) -> bool:
+    """True when the finding's canonical vuln class is in ``suppress_rules``.
+
+    Uses the shared M1 normalizer: the CWE-bridge class is checked first, then the
+    rule-token canonical class; a hit on either suppresses. No duplicated logic.
+    """
+    cwe_class = normalizer.cwe_class(finding.cwe_ids)
+    if cwe_class is not None and cwe_class in suppress_rules:
+        return True
+    token_class = normalizer.rule_token_canonical(finding.rule_id)
+    return token_class is not None and token_class in suppress_rules
 
 
 def _normalize_severity(value: str) -> str:
diff --git a/tests/test_codescan_parity.py b/tests/test_codescan_parity.py
new file mode 100644
index 0000000..131230c
--- /dev/null
+++ b/tests/test_codescan_parity.py
@@ -0,0 +1,750 @@
+"""Adversarial parity tests for the GHAS code-scanning -> EvaluationKey adapter (M1).
+
+This is the 1:1 transfer of the proven secret-track parity matcher
+(``tests/test_ghas_parity.py``) to the code-vulnerability domain. The matcher
+synthesizes canonical keys and hands them to
+``core.vulnerability.evaluation.evaluate_vulnerability_findings`` for the headline
+precision/recall — ZERO new precision/recall formula lives here.
+
+Each test toggles OFF exactly one matcher responsibility and asserts a specific
+metric goes red:
+
+- normalization OFF (empty CWE bridge + rule-token matching disabled)
+      -> same-vuln-different-rule.id splits into FP + FN (recall/precision drop).
+- line-window OFF (tolerance 0)
+      -> the +/-N drift pair stops matching; the just-out-of-window negative
+        control MUST NOT match even with tolerance.
+- state filter
+      -> a dismissed / "false positive" alert our finding hits raises
+        ``dismissed_fp_hit`` AND becomes an FP; a "won't fix" alert is excluded
+        from the recall denominator (recall not penalized for missing it).
+
+The by-cwe vs by-rule-token tier counts and the CWE many-to-many 1:1 binding are
+asserted explicitly. The committed fixture loads via ``load_codescan_snapshot``;
+a non-``synthetic`` provenance marker fails closed.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import pytest
+
+from security_scanner.core.vulnerability.codescan import (
+    DEFAULT_CWE_BRIDGE,
+    CodeScanAlertRecord,
+    RuleClassNormalizer,
+    load_codescan_snapshot,
+)
+from security_scanner.core.vulnerability.codescan_parity import (
+    ParityConfig,
+    compare_codescan_alerts_with_findings,
+)
+from security_scanner.core.vulnerability.evaluation import (
+    VulnerabilityEvaluationResult,
+)
+from security_scanner.core.vulnerability.model import (
+    VulnerabilityFinding,
+    VulnerabilityLocation,
+    compute_vulnerability_finding_id,
+)
+
+REPO = "synthetic-org/synthetic-codescan-repo"
+# Concrete line-window N fixed in M1 (closes the open question VD-07).
+LINE_WINDOW = 2
+FIXTURE = (
+    Path(__file__).resolve().parents[1]
+    / "eval"
+    / "codescan-parity-corpus"
+    / "synthetic-snapshot.json"
+)
+
+
+def _normalizer(
+    *,
+    cwe_bridge=DEFAULT_CWE_BRIDGE,
+    enable_rule_token: bool = True,
+) -> RuleClassNormalizer:
+    return RuleClassNormalizer(
+        cwe_bridge=cwe_bridge,
+        enable_rule_token=enable_rule_token,
+    )
+
+
+def _alert(
+    *,
+    number: int,
+    rule_id: str,
+    path: str,
+    start_line: int,
+    end_line: int | None = None,
+    cwe_ids: tuple[str, ...] = (),
+    state: str = "open",
+    dismissed_reason: str | None = None,
+    severity: str | None = "high",
+) -> CodeScanAlertRecord:
+    return CodeScanAlertRecord(
+        repository=REPO,
+        alert_number=number,
+        rule_id=rule_id,
+        security_severity_level=severity,
+        cwe_ids=cwe_ids,
+        state=state,
+        dismissed_reason=dismissed_reason,
+        location_path=path,
+        location_start_line=start_line,
+        location_end_line=end_line if end_line is not None else start_line,
+    )
+
+
+def _finding(
+    *,
+    rule_id: str,
+    path: str,
+    line_start: int,
+    cwe_ids: tuple[str, ...] = (),
+) -> VulnerabilityFinding:
+    finding_id = compute_vulnerability_finding_id(
+        source_tool="semgrep",
+        rule_id=rule_id,
+        partial_fingerprints=None,
+        file_path=path,
+        line_start=line_start,
+        message="synthetic finding",
+    )
+    return VulnerabilityFinding(
+        finding_id=finding_id,
+        rule_id=rule_id,
+        message="synthetic finding",
+        primary_location=VulnerabilityLocation(
+            file_path=path, line_start=line_start
+        ),
+        source_tool="semgrep",
+        cwe_ids=cwe_ids,
+    )
+
+
+# ---------------------------------------------------------------------------
+# (c) CodeQL <-> Semgrep same vuln, different rule.id: matches by-cwe
+# ---------------------------------------------------------------------------
+
+def test_same_vuln_different_rule_id_matches_by_cwe():
+    """CodeQL ``py/sql-injection`` (CWE-89) vs Semgrep different rule.id, same CWE."""
+    alerts = [
+        _alert(
+            number=1,
+            rule_id="py/sql-injection",
+            path="synthetic_app/handlers.py",
+            start_line=10,
+            cwe_ids=("CWE-89",),
+        )
+    ]
+    findings = [
+        _finding(
+            rule_id="python.lang.security.audit.sql-injection",
+            path="synthetic_app/handlers.py",
+            line_start=10,
+            cwe_ids=("CWE-89",),
+        )
+    ]
+
+    result = compare_codescan_alerts_with_findings(
+        repository=REPO,
+        alerts=alerts,
+        findings=findings,
+        normalizer=_normalizer(),
+        config=ParityConfig(line_window=LINE_WINDOW),
+    )
+
+    assert isinstance(result.detection, VulnerabilityEvaluationResult)
+    assert result.detection.true_positive_count == 1
+    assert result.detection.false_positive_count == 0
+    assert result.detection.false_negative_count == 0
+    assert result.detection.precision == 1.0
+    assert result.detection.recall == 1.0
+    assert result.matched_by_cwe == 1
+    assert result.matched_by_rule_token == 0
+    assert result.unmatched == 0
+
+
+def test_same_vuln_without_normalization_splits_red():
+    """RED-PROOF: empty bridge + token map -> the same vuln splits into FP + FN."""
+    alerts = [
+        _alert(
+            number=1,
+            rule_id="py/sql-injection",
+            path="synthetic_app/handlers.py",
+            start_line=10,
+            cwe_ids=("CWE-89",),
+        )
+    ]
+    findings = [
+        _finding(
+            rule_id="python.lang.security.audit.sql-injection",
+            path="synthetic_app/handlers.py",
+            line_start=10,
+            cwe_ids=("CWE-89",),
+        )
+    ]
+
+    result = compare_codescan_alerts_with_findings(
+        repository=REPO,
+        alerts=alerts,
+        findings=findings,
+        # normalization fully disabled: no CWE bridge, no rule-token rescue.
+        normalizer=_normalizer(cwe_bridge={}, enable_rule_token=False),
+        config=ParityConfig(line_window=LINE_WINDOW),
+    )
+
+    assert result.detection.true_positive_count == 0
+    assert result.detection.false_negative_count == 1  # ghas-only alert
+    assert result.detection.false_positive_count == 1  # local-only finding
+    assert result.detection.recall < 1.0
+    assert result.detection.precision < 1.0
+    assert result.matched_by_cwe == 0
+
+
+# ---------------------------------------------------------------------------
+# (a) CWE-absent rule-token-only: matches ONLY via by-rule-token
+# ---------------------------------------------------------------------------
+
+def test_cwe_absent_matches_by_rule_token():
+    """Neither side carries a CWE -> match only through exact core-token set."""
+    alerts = [
+        _alert(
+            number=1,
+            rule_id="js/path-injection",
+            path="synthetic_app/files.js",
+            start_line=22,
+            cwe_ids=(),
+        )
+    ]
+    findings = [
+        _finding(
+            rule_id="javascript.lang.security.audit.path-traversal",
+            path="synthetic_app/files.js",
+            line_start=22,
+            cwe_ids=(),
+        )
+    ]
+
+    result = compare_codescan_alerts_with_findings(
+        repository=REPO,
+        alerts=alerts,
+        findings=findings,
+        normalizer=_normalizer(),
+        config=ParityConfig(line_window=LINE_WINDOW),
+    )
+
+    assert result.detection.true_positive_count == 1
+    assert result.matched_by_cwe == 0
+    assert result.matched_by_rule_token == 1
+    assert result.unmatched == 0
+
+
+def test_rule_token_requires_exact_core_set_no_partial_overlap():
+    """path-traversal MUST NOT match open-redirect (no partial-overlap matching)."""
+    normalizer = _normalizer()
+    path_set = normalizer.rule_token_class(
+        "javascript.lang.security.audit.path-traversal"
+    )
+    redirect_set = normalizer.rule_token_class("js/open-redirect")
+    assert path_set is not None
+    assert redirect_set is not None
+    assert path_set != redirect_set
+
+    alerts = [
+        _alert(
+            number=1,
+            rule_id="js/open-redirect",
+            path="synthetic_app/files.js",
+            start_line=22,
+            cwe_ids=(),
+        )
+    ]
+    findings = [
+        _finding(
+            rule_id="javascript.lang.security.audit.path-traversal",
+            path="synthetic_app/files.js",
+            line_start=22,
+            cwe_ids=(),
+        )
+    ]
+
+    result = compare_codescan_alerts_with_findings(
+        repository=REPO,
+        alerts=alerts,
+        findings=findings,
+        normalizer=normalizer,
+        config=ParityConfig(line_window=LINE_WINDOW),
+    )
+
+    # Different vuln classes at the same location must NOT cross-match.
+    assert result.detection.true_positive_count == 0
+    assert result.detection.false_positive_count == 1
+    assert result.detection.false_negative_count == 1
+    assert result.unmatched >= 1
+
+
+# ---------------------------------------------------------------------------
+# (b) source/sink line drift + negative control
+# ---------------------------------------------------------------------------
+
+def test_line_drift_within_window_matches():
+    """Finding at alert_line + N (just inside the window) matches."""
+    alerts = [
+        _alert(
+            number=1,
+            rule_id="py/sql-injection",
+            path="synthetic_app/handlers.py",
+            start_line=30,
+            cwe_ids=("CWE-89",),
+        )
+    ]
+    findings = [
+        _finding(
+            rule_id="py/sql-injection",
+            path="synthetic_app/handlers.py",
+            line_start=32,  # +2 == N, just inside
+            cwe_ids=("CWE-89",),
+        )
+    ]
+
+    result = compare_codescan_alerts_with_findings(
+        repository=REPO,
+        alerts=alerts,
+        findings=findings,
+        normalizer=_normalizer(),
+        config=ParityConfig(line_window=LINE_WINDOW),
+    )
+
+    assert result.detection.true_positive_count == 1
+    assert result.detection.recall == 1.0
+
+
+def test_line_drift_without_window_goes_red():
+    """RED-PROOF: window 0 -> a +1 drift no longer matches."""
+    alerts = [
+        _alert(
+            number=1,
+            rule_id="py/sql-injection",
+            path="synthetic_app/handlers.py",
+            start_line=30,
+            cwe_ids=("CWE-89",),
+        )
+    ]
+    findings = [
+        _finding(
+            rule_id="py/sql-injection",
+            path="synthetic_app/handlers.py",
+            line_start=31,  # +1
+            cwe_ids=("CWE-89",),
+        )
+    ]
+
+    result = compare_codescan_alerts_with_findings(
+        repository=REPO,
+        alerts=alerts,
+        findings=findings,
+        normalizer=_normalizer(),
+        config=ParityConfig(line_window=0),  # exact-line only
+    )
+
+    assert result.detection.true_positive_count == 0
+    assert result.detection.false_negative_count == 1
+    assert result.detection.false_positive_count == 1
+
+
+def test_window_boundary_negative_control():
+    """One drift just inside N, one just outside; outside MUST NOT match.
+
+    With N=2: drift +2 (40 -> 42) MUST match; drift +3 (60 -> 63) MUST NOT.
+    A too-greedy window that matched both would fail the must-NOT assertion.
+    """
+    alerts = [
+        _alert(
+            number=1,
+            rule_id="py/sql-injection",
+            path="synthetic_app/inside.py",
+            start_line=40,
+            cwe_ids=("CWE-89",),
+        ),
+        _alert(
+            number=2,
+            rule_id="py/sql-injection",
+            path="synthetic_app/outside.py",
+            start_line=60,
+            cwe_ids=("CWE-89",),
+        ),
+    ]
+    findings = [
+        _finding(
+            rule_id="py/sql-injection",
+            path="synthetic_app/inside.py",
+            line_start=42,  # +2 in
+            cwe_ids=("CWE-89",),
+        ),
+        _finding(
+            rule_id="py/sql-injection",
+            path="synthetic_app/outside.py",
+            line_start=63,  # +3 out
+            cwe_ids=("CWE-89",),
+        ),
+    ]
+
+    result = compare_codescan_alerts_with_findings(
+        repository=REPO,
+        alerts=alerts,
+        findings=findings,
+        normalizer=_normalizer(),
+        config=ParityConfig(line_window=LINE_WINDOW),
+    )
+
+    assert result.detection.true_positive_count == 1
+    assert result.detection.false_negative_count == 1  # outside alert
+    assert result.detection.false_positive_count == 1  # outside finding
+
+
+def test_interval_overlap_matches_multiline_alert():
+    """alert start..end interval overlap counts as a match even with window 0."""
+    alerts = [
+        _alert(
+            number=1,
+            rule_id="py/sql-injection",
+            path="synthetic_app/multiline.py",
+            start_line=10,
+            end_line=14,
+            cwe_ids=("CWE-89",),
+        )
+    ]
+    findings = [
+        _finding(
+            rule_id="py/sql-injection",
+            path="synthetic_app/multiline.py",
+            line_start=13,  # inside the 10..14 interval, 3 lines past start
+            cwe_ids=("CWE-89",),
+        )
+    ]
+
+    result = compare_codescan_alerts_with_findings(
+        repository=REPO,
+        alerts=alerts,
+        findings=findings,
+        normalizer=_normalizer(),
+        config=ParityConfig(line_window=0),
+    )
+
+    assert result.detection.true_positive_count == 1
+
+
+# ---------------------------------------------------------------------------
+# (d) dismissed_reason: state-aware denominator + FP-oracle
+# ---------------------------------------------------------------------------
+
+def test_dismissed_false_positive_hit_is_precision_penalty():
+    """Finding on a dismissed/false-positive alert -> dismissed_fp_hit + FP."""
+    alerts = [
+        _alert(
+            number=1,
+            rule_id="py/sql-injection",
+            path="synthetic_app/open.py",
+            start_line=5,
+            cwe_ids=("CWE-89",),
+            state="open",
+        ),
+        _alert(
+            number=2,
+            rule_id="py/xss",
+            path="synthetic_app/dismissed.py",
+            start_line=8,
+            cwe_ids=("CWE-79",),
+            state="dismissed",
+            dismissed_reason="false positive",
+        ),
+    ]
+    findings = [
+        _finding(
+            rule_id="py/sql-injection",
+            path="synthetic_app/open.py",
+            line_start=5,
+            cwe_ids=("CWE-89",),
+        ),
+        # We (wrongly) surface a finding on the dismissed false-positive location.
+        _finding(
+            rule_id="py/xss",
+            path="synthetic_app/dismissed.py",
+            line_start=8,
+            cwe_ids=("CWE-79",),
+        ),
+    ]
+
+    result = compare_codescan_alerts_with_findings(
+        repository=REPO,
+        alerts=alerts,
+        findings=findings,
+        normalizer=_normalizer(),
+        config=ParityConfig(line_window=LINE_WINDOW),
+    )
+
+    # The open alert is detected (TP); the dismissed-FP hit is a precision penalty.
+    assert result.detection.true_positive_count == 1
+    assert result.dismissed_fp_hit == 1
+    assert result.detection.false_positive_count == 1
+    assert result.detection.precision == 0.5
+    # The dismissed alert is NOT in the recall denominator.
+    assert result.detection.false_negative_count == 0
+    assert result.detection.recall == 1.0
+
+
+def test_wont_fix_excluded_from_recall_denominator():
+    """A "won't fix" alert we do not detect must NOT punish recall (TP-non-blocking)."""
+    alerts = [
+        _alert(
+            number=1,
+            rule_id="py/sql-injection",
+            path="synthetic_app/open.py",
+            start_line=5,
+            cwe_ids=("CWE-89",),
+            state="open",
+        ),
+        _alert(
+            number=2,
+            rule_id="py/command-injection",
+            path="synthetic_app/wontfix.py",
+            start_line=12,
+            cwe_ids=("CWE-78",),
+            state="dismissed",
+            dismissed_reason="won't fix",
+        ),
+    ]
+    # We only detect the open one; the won't-fix alert is undetected.
+    findings = [
+        _finding(
+            rule_id="py/sql-injection",
+            path="synthetic_app/open.py",
+            line_start=5,
+            cwe_ids=("CWE-89",),
+        )
+    ]
+
+    result = compare_codescan_alerts_with_findings(
+        repository=REPO,
+        alerts=alerts,
+        findings=findings,
+        normalizer=_normalizer(),
+        config=ParityConfig(line_window=LINE_WINDOW),
+    )
+
+    # won't-fix is TP-non-blocking: not in recall denom, not a precision penalty.
+    assert result.detection.true_positive_count == 1
+    assert result.detection.false_negative_count == 0
+    assert result.detection.recall == 1.0
+    assert result.dismissed_fp_hit == 0
+
+
+def test_fixed_alert_in_recall_denominator():
+    """A ``fixed`` alert is positive truth (recall denominator)."""
+    alerts = [
+        _alert(
+            number=1,
+            rule_id="py/ssrf",
+            path="synthetic_app/fetch.py",
+            start_line=7,
+            cwe_ids=("CWE-918",),
+            state="fixed",
+        )
+    ]
+    # We do NOT detect it -> it must be a false negative.
+    result = compare_codescan_alerts_with_findings(
+        repository=REPO,
+        alerts=alerts,
+        findings=[],
+        normalizer=_normalizer(),
+        config=ParityConfig(line_window=LINE_WINDOW),
+    )
+
+    assert result.detection.false_negative_count == 1
+    assert result.detection.recall == 0.0
+
+
+# ---------------------------------------------------------------------------
+# CWE many-to-many: SQLi + XSS at same (file, window) each bind 1:1
+# ---------------------------------------------------------------------------
+
+def test_cwe_many_to_many_binds_one_to_one():
+    """Two different vulns at the same window each bind to the right alert."""
+    alerts = [
+        _alert(
+            number=1,
+            rule_id="py/sql-injection",
+            path="synthetic_app/mixed.py",
+            start_line=20,
+            cwe_ids=("CWE-89",),
+        ),
+        _alert(
+            number=2,
+            rule_id="py/xss",
+            path="synthetic_app/mixed.py",
+            start_line=21,
+            cwe_ids=("CWE-79",),
+        ),
+    ]
+    findings = [
+        _finding(
+            rule_id="py/xss",
+            path="synthetic_app/mixed.py",
+            line_start=21,
+            cwe_ids=("CWE-79",),
+        ),
+        _finding(
+            rule_id="py/sql-injection",
+            path="synthetic_app/mixed.py",
+            line_start=20,
+            cwe_ids=("CWE-89",),
+        ),
+    ]
+
+    result = compare_codescan_alerts_with_findings(
+        repository=REPO,
+        alerts=alerts,
+        findings=findings,
+        normalizer=_normalizer(),
+        config=ParityConfig(line_window=LINE_WINDOW),
+    )
+
+    # Both bind 1:1, no cross-match, no leftover FP/FN.
+    assert result.detection.true_positive_count == 2
+    assert result.detection.false_positive_count == 0
+    assert result.detection.false_negative_count == 0
+    assert result.matched_by_cwe == 2
+
+
+# ---------------------------------------------------------------------------
+# meta-metrics: CWE-deficit + by-rule-token rescue rate
+# ---------------------------------------------------------------------------
+
+def test_meta_metrics_cwe_deficit_and_rescue_rate():
+    """A CWE-absent rule-token rescue raises both meta rates above 0."""
+    alerts = [
+        _alert(
+            number=1,
+            rule_id="py/sql-injection",
+            path="synthetic_app/a.py",
+            start_line=10,
+            cwe_ids=("CWE-89",),
+        ),
+        _alert(
+            number=2,
+            rule_id="js/path-injection",
+            path="synthetic_app/b.js",
+            start_line=20,
+            cwe_ids=(),  # CWE-deficient
+        ),
+    ]
+    findings = [
+        _finding(
+            rule_id="py/sql-injection",
+            path="synthetic_app/a.py",
+            line_start=10,
+            cwe_ids=("CWE-89",),
+        ),
+        _finding(
+            rule_id="javascript.lang.security.audit.path-traversal",
+            path="synthetic_app/b.js",
+            line_start=20,
+            cwe_ids=(),  # CWE-deficient
+        ),
+    ]
+
+    result = compare_codescan_alerts_with_findings(
+        repository=REPO,
+        alerts=alerts,
+        findings=findings,
+        normalizer=_normalizer(),
+        config=ParityConfig(line_window=LINE_WINDOW),
+    )
+
+    assert result.matched_by_cwe == 1
+    assert result.matched_by_rule_token == 1
+    # one of two truth alerts lacks a bridgeable CWE.
+    assert result.cwe_deficit_rate == 0.5
+    # one of two matches was rescued by rule-token.
+    assert result.rule_token_rescue_rate == 0.5
+
+
+# ---------------------------------------------------------------------------
+# fixture: provenance fail-closed + end-to-end adversarial snapshot
+# ---------------------------------------------------------------------------
+
+def test_provenance_marker_required_fail_closed(tmp_path):
+    """A snapshot without source: synthetic must fail closed."""
+    bad = tmp_path / "no-provenance.json"
+    bad.write_text(
+        '{"repoFullName": "synthetic-org/x", "alerts": [], "findings": []}',
+        encoding="utf-8",
+    )
+
+    with pytest.raises(ValueError, match="synthetic"):
+        load_codescan_snapshot(bad)
+
+
+def test_committed_fixture_loads_and_matches():
+    """End-to-end over the committed adversarial snapshot.
+
+    The fixture is engineered so normalization (by-cwe + by-rule-token), the
+    line-window, and the state filter produce a clean, high-recall picture, with
+    one dismissed-FP hit (precision penalty) and one won't-fix alert excluded
+    from the recall denominator.
+    """
+    snapshot = load_codescan_snapshot(FIXTURE)
+    assert snapshot.source == "synthetic"
+    assert snapshot.repo_full_name == REPO
+
+    result = compare_codescan_alerts_with_findings(
+        repository=snapshot.repo_full_name,
+        alerts=snapshot.alerts,
+        findings=snapshot.findings,
+        normalizer=_normalizer(),
+        config=ParityConfig(line_window=LINE_WINDOW),
+    )
+
+    assert isinstance(result.detection, VulnerabilityEvaluationResult)
+    # Positive-truth alerts (open + fixed) are all detected -> perfect recall.
+    assert result.detection.recall == 1.0
+    # by-cwe and by-rule-token tiers both exercised.
+    assert result.matched_by_cwe >= 1
+    assert result.matched_by_rule_token >= 1
+    # one dismissed false-positive location we surfaced -> precision penalty.
+    assert result.dismissed_fp_hit == 1
+    assert result.detection.false_positive_count >= 1
+    assert result.detection.precision < 1.0
+    # meta-metrics exposed.
+    assert 0.0 <= result.cwe_deficit_rate <= 1.0
+    assert 0.0 <= result.rule_token_rescue_rate <= 1.0
+
+
+def test_fixture_states_drive_red_when_filter_disabled():
+    """RED-PROOF over the fixture: counting dismissed alerts as truth drops recall."""
+    snapshot = load_codescan_snapshot(FIXTURE)
+
+    with_filter = compare_codescan_alerts_with_findings(
+        repository=snapshot.repo_full_name,
+        alerts=snapshot.alerts,
+        findings=snapshot.findings,
+        normalizer=_normalizer(),
+        config=ParityConfig(line_window=LINE_WINDOW),
+    )
+    without_filter = compare_codescan_alerts_with_findings(
+        repository=snapshot.repo_full_name,
+        alerts=snapshot.alerts,
+        findings=snapshot.findings,
+        normalizer=_normalizer(),
+        # state filter OFF: every state counts as positive truth.
+        config=ParityConfig(
+            line_window=LINE_WINDOW,
+            positive_truth_states=("open", "fixed", "dismissed"),
+        ),
+    )
+
+    assert with_filter.detection.recall == 1.0
+    assert without_filter.detection.recall < 1.0
diff --git a/tests/test_governance_vuln_parity_slo.py b/tests/test_governance_vuln_parity_slo.py
new file mode 100644
index 0000000..86285f2
--- /dev/null
+++ b/tests/test_governance_vuln_parity_slo.py
@@ -0,0 +1,379 @@
+"""Tests for the M3 vuln code-scanning parity SLO gate (report-only until threshold).
+
+Mirrors ``tests/test_governance_parity_slo.py`` for the code-vulnerability domain.
+Exercises the three documented modes — report-only (no threshold), enforce
+(threshold committed), and stale-degraded (snapshot too old) — plus the
+provenance fail-closed guard on the snapshot input. All mutation fixtures are
+synthetic and written into a tmp dir so the committed corpus is never the subject;
+a separate test asserts the COMMITTED corpus runs clean in report-only.
+"""
+
+from __future__ import annotations
+
+import datetime as dt
+import json
+from pathlib import Path
+
+import pytest
+
+from governance.vuln_parity_slo import (
+    discover_snapshots,
+    evaluate_vuln_parity_slo,
+    load_thresholds,
+    main,
+    render_report,
+)
+
+NOW = dt.datetime(2026, 6, 21, 12, 0, tzinfo=dt.timezone.utc)
+
+
+def _snapshot_dict(
+    *,
+    repo: str = "synthetic-org/synthetic-codescan-repo",
+    fetched_at: str = "2026-06-20T12:00:00+00:00",
+    matched: bool = True,
+    extra_alerts: list[dict] | None = None,
+    extra_findings: list[dict] | None = None,
+) -> dict:
+    # One open SQLi alert (CWE-89) and (when matched) one local finding at the
+    # same location/CWE-class, so per-repo precision/recall = 1.0; when not
+    # matched the finding is dropped so recall drops (used for the enforce-fail
+    # case).
+    findings: list[dict] = []
+    if matched:
+        findings = [
+            {
+                "ruleId": "python.lang.security.audit.sql-injection",
+                "sourceTool": "semgrep",
+                "cweIds": ["CWE-89"],
+                "filePath": "synthetic_app/handlers.py",
+                "lineStart": 10,
+            }
+        ]
+    alerts = [
+        {
+            "alertNumber": 1,
+            "ruleId": "py/sql-injection",
+            "securitySeverityLevel": "high",
+            "cweIds": ["CWE-89"],
+            "state": "open",
+            "filePath": "synthetic_app/handlers.py",
+            "lineStart": 10,
+            "lineEnd": 10,
+        }
+    ]
+    if extra_alerts:
+        alerts.extend(extra_alerts)
+    if extra_findings:
+        findings.extend(extra_findings)
+    return {
+        "schemaVersion": 1,
+        "source": "synthetic",
+        "repoFullName": repo,
+        "fetchedAt": fetched_at,
+        "alerts": alerts,
+        "findings": findings,
+    }
+
+
+def _write_snapshot(
+    directory: Path, data: dict, name: str = "synthetic-snapshot.json"
+) -> Path:
+    directory.mkdir(parents=True, exist_ok=True)
+    path = directory / name
+    path.write_text(json.dumps(data), encoding="utf-8")
+    return path
+
+
+# --------------------------------------------------------------------------- #
+# report-only (no threshold)                                                  #
+# --------------------------------------------------------------------------- #
+
+
+def test_report_only_when_no_threshold_file(tmp_path):
+    snap_dir = tmp_path / "corpus"
+    _write_snapshot(snap_dir, _snapshot_dict())
+
+    result = evaluate_vuln_parity_slo(
+        snapshot_dir=snap_dir,
+        threshold_path=tmp_path / "absent.yml",
+        now=NOW,
+    )
+
+    assert result.mode == "report-only"
+    assert result.passed is True  # report-only NEVER blocks
+    assert result.macro.macro_precision == 1.0
+    assert result.macro.macro_recall == 1.0
+
+
+def test_report_only_passes_even_when_below_would_be_target(tmp_path):
+    # A recall miss (unmatched) in report-only still exits 0: there is no
+    # committed target to enforce yet (measure-first).
+    snap_dir = tmp_path / "corpus"
+    _write_snapshot(snap_dir, _snapshot_dict(matched=False))
+
+    result = evaluate_vuln_parity_slo(
+        snapshot_dir=snap_dir, threshold_path=tmp_path / "absent.yml", now=NOW
+    )
+
+    assert result.mode == "report-only"
+    assert result.macro.macro_recall < 1.0
+    assert result.passed is True
+
+
+def test_empty_threshold_file_is_report_only(tmp_path):
+    snap_dir = tmp_path / "corpus"
+    _write_snapshot(snap_dir, _snapshot_dict())
+    threshold = tmp_path / "thresholds.yml"
+    threshold.write_text("", encoding="utf-8")
+
+    assert load_thresholds(threshold) is None
+    result = evaluate_vuln_parity_slo(
+        snapshot_dir=snap_dir, threshold_path=threshold, now=NOW
+    )
+    assert result.mode == "report-only"
+
+
+# --------------------------------------------------------------------------- #
+# enforce (threshold committed)                                               #
+# --------------------------------------------------------------------------- #
+
+
+def test_enforce_passes_when_macro_meets_threshold(tmp_path):
+    snap_dir = tmp_path / "corpus"
+    _write_snapshot(snap_dir, _snapshot_dict())
+    threshold = tmp_path / "thresholds.yml"
+    threshold.write_text("precision_min: 0.9\nrecall_min: 0.9\n", encoding="utf-8")
+
+    result = evaluate_vuln_parity_slo(
+        snapshot_dir=snap_dir, threshold_path=threshold, now=NOW
+    )
+
+    assert result.mode == "enforce"
+    assert result.passed is True
+    assert result.failures == ()
+
+
+def test_enforce_fails_when_macro_below_threshold(tmp_path):
+    snap_dir = tmp_path / "corpus"
+    _write_snapshot(snap_dir, _snapshot_dict(matched=False))  # recall < 1.0
+    threshold = tmp_path / "thresholds.yml"
+    threshold.write_text("precision_min: 0.9\nrecall_min: 0.99\n", encoding="utf-8")
+
+    result = evaluate_vuln_parity_slo(
+        snapshot_dir=snap_dir, threshold_path=threshold, now=NOW
+    )
+
+    assert result.mode == "enforce"
+    assert result.passed is False
+    assert any("recall" in f for f in result.failures)
+
+
+# --------------------------------------------------------------------------- #
+# stale-degraded (snapshot too old)                                           #
+# --------------------------------------------------------------------------- #
+
+
+def test_stale_in_report_only_warns_but_passes(tmp_path):
+    snap_dir = tmp_path / "corpus"
+    _write_snapshot(
+        snap_dir, _snapshot_dict(fetched_at="2025-01-01T00:00:00+00:00")
+    )
+
+    result = evaluate_vuln_parity_slo(
+        snapshot_dir=snap_dir,
+        threshold_path=tmp_path / "absent.yml",
+        now=NOW,
+        max_age_days=90,
+    )
+
+    assert result.stale is True
+    assert result.mode == "report-only"
+    assert result.passed is True  # stale is surfaced, but report-only never blocks
+
+
+def test_stale_in_enforce_fails_not_silent_pass(tmp_path):
+    # design staleness-passive-only: a stale snapshot must NOT silently satisfy
+    # an enforcing gate even when the numbers look fine.
+    snap_dir = tmp_path / "corpus"
+    _write_snapshot(
+        snap_dir, _snapshot_dict(fetched_at="2025-01-01T00:00:00+00:00")
+    )
+    threshold = tmp_path / "thresholds.yml"
+    threshold.write_text("precision_min: 0.9\nrecall_min: 0.9\n", encoding="utf-8")
+
+    result = evaluate_vuln_parity_slo(
+        snapshot_dir=snap_dir, threshold_path=threshold, now=NOW, max_age_days=90
+    )
+
+    assert result.stale is True
+    assert result.mode == "enforce"
+    assert result.passed is False
+    assert any("stale-degraded" in f for f in result.failures)
+
+
+def test_missing_fetched_at_is_treated_as_stale(tmp_path):
+    snap_dir = tmp_path / "corpus"
+    data = _snapshot_dict()
+    del data["fetchedAt"]
+    _write_snapshot(snap_dir, data)
+
+    result = evaluate_vuln_parity_slo(
+        snapshot_dir=snap_dir, threshold_path=tmp_path / "absent.yml", now=NOW
+    )
+    assert result.stale is True
+
+
+# --------------------------------------------------------------------------- #
+# provenance fail-closed                                                       #
+# --------------------------------------------------------------------------- #
+
+
+def test_non_synthetic_snapshot_fails_closed(tmp_path):
+    snap_dir = tmp_path / "corpus"
+    data = _snapshot_dict()
+    data["source"] = "real"  # not synthetic -> load must fail closed
+    _write_snapshot(snap_dir, data)
+
+    with pytest.raises(Exception):
+        evaluate_vuln_parity_slo(
+            snapshot_dir=snap_dir, threshold_path=tmp_path / "absent.yml", now=NOW
+        )
+
+
+# --------------------------------------------------------------------------- #
+# dismissed_fp_hit and won't-fix surfacing                                     #
+# --------------------------------------------------------------------------- #
+
+
+def test_dismissed_fp_hit_and_wont_fix_surface_in_report(tmp_path):
+    """A finding on a dismissed-FP alert is a precision penalty surfaced in the
+    report; a won't-fix alert is excluded from recall with no penalty.
+    """
+    snap_dir = tmp_path / "corpus"
+    data = _snapshot_dict(
+        extra_alerts=[
+            {
+                "alertNumber": 2,
+                "ruleId": "py/xss",
+                "cweIds": ["CWE-79"],
+                "state": "dismissed",
+                "dismissedReason": "false positive",
+                "filePath": "synthetic_app/legacy.py",
+                "lineStart": 55,
+                "lineEnd": 55,
+            },
+            {
+                "alertNumber": 3,
+                "ruleId": "py/command-injection",
+                "cweIds": ["CWE-78"],
+                "state": "dismissed",
+                "dismissedReason": "won't fix",
+                "filePath": "synthetic_app/ops.py",
+                "lineStart": 70,
+                "lineEnd": 70,
+            },
+        ],
+        extra_findings=[
+            {
+                "ruleId": "py/xss",
+                "sourceTool": "codeql",
+                "cweIds": ["CWE-79"],
+                "filePath": "synthetic_app/legacy.py",
+                "lineStart": 55,
+            }
+        ],
+    )
+    _write_snapshot(snap_dir, data)
+
+    result = evaluate_vuln_parity_slo(
+        snapshot_dir=snap_dir, threshold_path=tmp_path / "absent.yml", now=NOW
+    )
+    report = render_report(result)
+
+    # The dismissed-FP hit is exercised (the codeql/xss finding lands on the
+    # dismissed false-positive alert) -> precision penalty + dismissed_fp_hit.
+    assert result.total_dismissed_fp_hit >= 1
+    assert "Dismissed-FP hit" in report
+    # won't-fix alert is excluded from the recall denominator: recall stays 1.0
+    # (the open SQLi alert is still matched).
+    assert result.macro.macro_recall == 1.0
+    assert result.macro.macro_precision < 1.0
+
+
+# --------------------------------------------------------------------------- #
+# CLI exit codes + committed corpus                                           #
+# --------------------------------------------------------------------------- #
+
+
+def test_cli_check_report_only_exits_zero(tmp_path, capsys):
+    snap_dir = tmp_path / "corpus"
+    _write_snapshot(snap_dir, _snapshot_dict())
+
+    code = main(
+        [
+            "--check",
+            "--snapshot-dir",
+            str(snap_dir),
+            "--threshold-path",
+            str(tmp_path / "absent.yml"),
+        ]
+    )
+    out = capsys.readouterr().out
+    assert code == 0
+    assert "report-only" in out
+
+
+def test_cli_json_report_only(tmp_path, capsys):
+    snap_dir = tmp_path / "corpus"
+    _write_snapshot(snap_dir, _snapshot_dict())
+
+    code = main(
+        [
+            "--check",
+            "--json",
+            "--snapshot-dir",
+            str(snap_dir),
+            "--threshold-path",
+            str(tmp_path / "absent.yml"),
+        ]
+    )
+    out = capsys.readouterr().out
+    assert code == 0
+    payload = json.loads(out)
+    assert payload["mode"] == "report-only"
+    assert payload["passed"] is True
+
+
+def test_committed_corpus_runs_report_only(tmp_path):
+    # The committed eval/codescan-parity-corpus snapshot must drive the gate in
+    # report-only with no committed thresholds (autonomous layer is always
+    # report-only).
+    result = evaluate_vuln_parity_slo(threshold_path=tmp_path / "absent.yml", now=NOW)
+    assert result.mode == "report-only"
+    assert result.snapshot_count >= 1
+    assert result.passed is True
+
+
+def test_committed_corpus_cli_check_exits_zero():
+    # The autonomous acceptance check: ``--check`` on the committed corpus exits 0
+    # in report-only with no thresholds present.
+    code = main(["--check"])
+    assert code == 0
+
+
+def test_committed_corpus_exercises_dismissed_fp_hit(tmp_path):
+    # Post-M2 review #3: dismissed_fp_hit must not be dark-launched. The committed
+    # fixture has a dismissed false-positive alert that a finding lands on.
+    result = evaluate_vuln_parity_slo(threshold_path=tmp_path / "absent.yml", now=NOW)
+    assert result.total_dismissed_fp_hit >= 1
+    assert "Dismissed-FP hit" in render_report(result)
+
+
+def test_discover_snapshots_is_deterministic(tmp_path):
+    snap_dir = tmp_path / "corpus"
+    _write_snapshot(snap_dir, _snapshot_dict(), name="b-snapshot.json")
+    _write_snapshot(snap_dir, _snapshot_dict(), name="a-snapshot.json")
+
+    found = discover_snapshots(snap_dir)
+    assert [p.name for p in found] == ["a-snapshot.json", "b-snapshot.json"]
diff --git a/tests/test_vulnerability_corpus_normalized.py b/tests/test_vulnerability_corpus_normalized.py
new file mode 100644
index 0000000..3edc7aa
--- /dev/null
+++ b/tests/test_vulnerability_corpus_normalized.py
@@ -0,0 +1,243 @@
+"""M2 synthetic 5-class corpus + normalization-aware evaluation path tests.
+
+Two boundaries are under test:
+
+1. **5-class corpus (design VD-07)** — ``eval/synthetic-code-vuln/`` covers
+   SQLi / XSS / path-traversal / command-injection / SSRF, each with a
+   vulnerable case (expected finding) AND a safe case (must NOT be flagged so it
+   exercises precision). Evaluating the corpus' own findings against its own
+   expected list yields recall>=0.99 and precision>=0.90 (the existing
+   ``VulnerabilityEvaluationThresholds``).
+
+2. **Normalization-aware path (design §E)** — the NEW function
+   ``evaluate_vulnerability_findings_normalized`` reuses the M1
+   ``RuleClassNormalizer`` + line-window so a CodeQL-style ``ruleId`` and a
+   Semgrep-style ``ruleId`` for the SAME class still match. CRUCIALLY, the existing
+   ``evaluate_vulnerability_findings`` exact-key behavior is untouched, proven by
+   a contrast test where the exact-key path splits the same pair into FP+FN.
+
+The adversarial out-of-rule pair (a CWE class with no bridge + no rule-token)
+intentionally goes recall<1 on the normalized path — that red is CORRECT and is
+asserted as an expected-fail so a future silent normalization regression is
+caught.
+"""
+
+from __future__ import annotations
+
+import json
+from pathlib import Path
+
+from security_scanner.core.vulnerability.codescan import RuleClassNormalizer
+from security_scanner.core.vulnerability.evaluation import (
+    VulnerabilityEvaluationThresholds,
+    evaluate_vulnerability_findings,
+    evaluate_vulnerability_findings_normalized,
+    evaluate_vulnerability_gate,
+    load_vulnerability_corpus_normalized,
+)
+from security_scanner.core.vulnerability.model import (
+    VulnerabilityFinding,
+    VulnerabilityLocation,
+    compute_vulnerability_finding_id,
+)
+
+CORPUS = (
+    Path(__file__).resolve().parents[1]
+    / "eval"
+    / "synthetic-code-vuln"
+    / "corpus-snapshot.json"
+)
+
+FIVE_CLASSES = {
+    "sql-injection",
+    "xss",
+    "path-traversal",
+    "command-injection",
+    "ssrf",
+}
+
+
+def _load_corpus() -> dict:
+    return json.loads(CORPUS.read_text(encoding="utf-8"))
+
+
+def _finding_from_dict(item: dict) -> VulnerabilityFinding:
+    file_path = str(item["filePath"])
+    line_start = int(item["lineStart"])
+    rule_id = str(item["ruleId"])
+    source_tool = str(item.get("sourceTool", "semgrep"))
+    cwe_ids = tuple(str(c) for c in item.get("cweIds", []))
+    finding_id = compute_vulnerability_finding_id(
+        source_tool=source_tool,
+        rule_id=rule_id,
+        partial_fingerprints=None,
+        file_path=file_path,
+        line_start=line_start,
+        message="synthetic finding",
+    )
+    return VulnerabilityFinding(
+        finding_id=finding_id,
+        rule_id=rule_id,
+        message="synthetic finding",
+        primary_location=VulnerabilityLocation(
+            file_path=file_path,
+            line_start=line_start,
+            line_end=item.get("lineEnd"),
+        ),
+        source_tool=source_tool,
+        cwe_ids=cwe_ids,
+    )
+
+
+# ---------------------------------------------------------------------------
+# Corpus shape: provenance + 5 classes + safe cases
+# ---------------------------------------------------------------------------
+
+
+def test_corpus_provenance_is_synthetic():
+    data = _load_corpus()
+    assert str(data["source"]).strip().lower() == "synthetic"
+
+
+def test_corpus_covers_five_cwe_classes():
+    data = _load_corpus()
+    classes = {str(c["vulnClass"]) for c in data["expectedFindings"]}
+    assert classes == FIVE_CLASSES
+
+
+def test_corpus_has_safe_cases_per_class():
+    """Each class has at least one safe case that must NOT be flagged."""
+    data = _load_corpus()
+    safe_classes = {str(c["vulnClass"]) for c in data["safeCases"]}
+    assert FIVE_CLASSES <= safe_classes
+
+
+def test_corpus_paths_are_synthetic():
+    data = _load_corpus()
+    paths = [c["filePath"] for c in data["expectedFindings"]]
+    paths += [c["filePath"] for c in data["safeCases"]]
+    paths += [c["filePath"] for c in data["actualFindings"]]
+    assert all(p.startswith("synthetic_app/") for p in paths)
+
+
+# ---------------------------------------------------------------------------
+# Normalized corpus evaluation: recall>=0.99 / precision>=0.90
+# ---------------------------------------------------------------------------
+
+
+def test_normalized_corpus_meets_recall_and_precision_slo():
+    """The corpus' own findings hit recall>=0.99 + precision>=0.90 (default gate).
+
+    The actual findings deliberately use a DIFFERENT tool dialect than expected
+    for at least one class, so only the normalization-aware path can match them.
+    No safe case is flagged by the actual findings, so precision stays high.
+    """
+    expected, actual = load_vulnerability_corpus_normalized(CORPUS)
+    result = evaluate_vulnerability_findings_normalized(
+        expected, actual, normalizer=RuleClassNormalizer()
+    )
+    gate = evaluate_vulnerability_gate(result, VulnerabilityEvaluationThresholds())
+    assert gate.passed, gate.reason
+    assert result.recall >= 0.99
+    assert result.precision >= 0.90
+
+
+def test_normalized_path_matches_cross_dialect_pair():
+    """CodeQL-style actual ruleId matches a Semgrep-style expected ruleId."""
+    expected = load_vulnerability_corpus_normalized(CORPUS)[0]
+    # A CodeQL-style finding for SQLi at the canonical line.
+    actual = [
+        _finding_from_dict(
+            {
+                "filePath": "synthetic_app/handlers.py",
+                "lineStart": 42,
+                "ruleId": "py/sql-injection",
+                "sourceTool": "codeql",
+                "cweIds": ["CWE-89"],
+            }
+        )
+    ]
+    result = evaluate_vulnerability_findings_normalized(
+        expected, actual, normalizer=RuleClassNormalizer()
+    )
+    # At least the SQLi pair is a true positive via the normalizer.
+    assert result.true_positive_count >= 1
+
+
+def test_exact_key_path_splits_cross_dialect_pair():
+    """CONTRAST: the EXISTING exact-key path does NOT match the cross-dialect pair.
+
+    This pins the boundary: the normalization-aware path is additive; the legacy
+    ``evaluate_vulnerability_findings`` exact-key behavior is unchanged (a
+    different ruleId => FP + FN, not a TP).
+    """
+    from security_scanner.core.vulnerability.evaluation import (
+        VulnerabilityExpectedFinding,
+    )
+
+    expected = [
+        VulnerabilityExpectedFinding(
+            file_path="synthetic_app/handlers.py",
+            line_start=42,
+            rule_id="python.lang.security.audit.sql-injection",
+        )
+    ]
+    actual = [
+        _finding_from_dict(
+            {
+                "filePath": "synthetic_app/handlers.py",
+                "lineStart": 42,
+                "ruleId": "py/sql-injection",
+                "sourceTool": "codeql",
+                "cweIds": ["CWE-89"],
+            }
+        )
+    ]
+    result = evaluate_vulnerability_findings(expected, actual)
+    # Exact-key mismatch: ruleId differs => no TP, one FP and one FN.
+    assert result.true_positive_count == 0
+    assert result.false_positive_count == 1
+    assert result.false_negative_count == 1
+
+
+# ---------------------------------------------------------------------------
+# Adversarial out-of-rule pair: recall<1 is INTENDED (kept as a red guard)
+# ---------------------------------------------------------------------------
+
+
+def test_out_of_rule_class_recall_below_one_is_intended():
+    """An out-of-rule CWE (no bridge, no rule-token) is an INTENDED miss.
+
+    design §F: an independently-authored adversarial vuln whose class is NOT in
+    the normalizer must NOT be rescued — recall<1 here is the correct red and is
+    asserted so a future over-broad normalizer (silently matching it) is caught.
+    """
+    from security_scanner.core.vulnerability.evaluation import (
+        VulnerabilityExpectedFinding,
+    )
+
+    # An out-of-rule class: deserialization (CWE-502) — no bridge row, opaque token.
+    expected = [
+        VulnerabilityExpectedFinding(
+            file_path="synthetic_app/out_of_rule.py",
+            line_start=7,
+            rule_id="py/unsafe-deserialization",
+        )
+    ]
+    actual = [
+        _finding_from_dict(
+            {
+                "filePath": "synthetic_app/out_of_rule.py",
+                "lineStart": 7,
+                "ruleId": "python.lang.security.audit.pickle-load",
+                "sourceTool": "semgrep",
+                "cweIds": ["CWE-502"],
+            }
+        )
+    ]
+    result = evaluate_vulnerability_findings_normalized(
+        expected, actual, normalizer=RuleClassNormalizer()
+    )
+    # No bridge + non-matching rule tokens => the pair does NOT match.
+    assert result.recall < 1.0
+    assert result.false_negative_count == 1
diff --git a/tests/test_vulnerability_gate_tier.py b/tests/test_vulnerability_gate_tier.py
new file mode 100644
index 0000000..46c5ef5
--- /dev/null
+++ b/tests/test_vulnerability_gate_tier.py
@@ -0,0 +1,220 @@
+"""M2 inline cheap FP-suppression tier tests (gate-layer ONLY).
+
+The #1 acceptance constraint (design §K, stop-condition
+``existing-secret-default-behavior-change``) is that the EXISTING default gate
+behavior must not change. So the first test is the default-invariance canary:
+``evaluate_vulnerability_gate_policy`` with default thresholds (all new opt-in
+flags OFF) produces EXACTLY today's verdict.
+
+The inline tier adds two opt-in signals to ``VulnerabilityGateThresholds``, both
+DEFAULT OFF:
+
+- ``require_trace`` — a finding with ``code_flow_count == 0`` (no data-flow
+  reachability evidence) is treated as non-blocking. A finding WITH a trace
+  keeps blocking.
+- ``suppress_rules`` — a frozenset of canonical vuln *classes* (reusing the M1
+  ``RuleClassNormalizer``) whose findings are treated as non-blocking
+  (low-confidence rule suppression).
+
+When both flags are off, the policy's verdict is identical to today's: nothing
+new is enabled by default, so no currently-blocking finding can flip to
+non-blocking. The suppression-rate regression test proves a canary TP is never
+suppressed by anything default-on.
+"""
+
+from __future__ import annotations
+
+from security_scanner.core.vulnerability.gate import (
+    VulnerabilityGateThresholds,
+    evaluate_vulnerability_gate_policy,
+)
+from security_scanner.core.vulnerability.model import (
+    VulnerabilityFinding,
+    VulnerabilityLocation,
+)
+
+
+def _finding(**overrides) -> VulnerabilityFinding:
+    defaults = dict(
+        finding_id="vuln_canary",
+        source_tool="semgrep",
+        rule_id="python.lang.security.audit.sql-injection",
+        message="Potential SQL injection.",
+        severity="HIGH",
+        precision="HIGH",
+        cwe_ids=("CWE-89",),
+        code_flow_count=1,
+        primary_location=VulnerabilityLocation(
+            file_path="synthetic_app/handlers.py",
+            line_start=42,
+        ),
+    )
+    defaults.update(overrides)
+    return VulnerabilityFinding(**defaults)
+
+
+# ---------------------------------------------------------------------------
+# Default-invariance canary (MUST stay green — write FIRST)
+# ---------------------------------------------------------------------------
+
+
+def test_default_thresholds_block_high_high_finding_unchanged():
+    """A HIGH/HIGH finding still blocks under the existing default policy."""
+    result = evaluate_vulnerability_gate_policy([_finding()])
+    assert result.passed is False
+    assert result.blocking_count == 1
+
+
+def test_default_thresholds_nonblock_info_low_unchanged():
+    """INFO/UNKNOWN and LOW/LOW findings stay non-blocking (existing default)."""
+    findings = [
+        _finding(finding_id="v_info", severity="INFO", precision="UNKNOWN"),
+        _finding(finding_id="v_low", severity="LOW", precision="LOW"),
+    ]
+    result = evaluate_vulnerability_gate_policy(findings)
+    assert result.passed is True
+    assert result.blocking_count == 0
+
+
+def test_default_ignores_code_flow_count_and_rule_class():
+    """With flags OFF, a HIGH/HIGH finding blocks regardless of trace count.
+
+    Proves the new signals are inert by default — a HIGH/HIGH finding with NO
+    trace (``code_flow_count == 0``) still blocks under the default policy, so
+    no default-on change can have silently suppressed it.
+    """
+    no_trace = _finding(finding_id="v_no_trace", code_flow_count=0)
+    result = evaluate_vulnerability_gate_policy([no_trace])
+    assert result.passed is False
+    assert result.blocking_count == 1
+
+
+def test_explicit_default_thresholds_equal_implicit():
+    """Constructing default thresholds explicitly equals passing None."""
+    findings = [_finding(), _finding(finding_id="v2", severity="LOW")]
+    implicit = evaluate_vulnerability_gate_policy(findings)
+    explicit = evaluate_vulnerability_gate_policy(
+        findings, VulnerabilityGateThresholds()
+    )
+    assert implicit == explicit
+
+
+def test_new_flags_default_off():
+    """The new opt-in flags default OFF on the dataclass."""
+    policy = VulnerabilityGateThresholds()
+    assert policy.require_trace is False
+    assert policy.suppress_rules == frozenset()
+
+
+# ---------------------------------------------------------------------------
+# Inline tier (gated): require_trace
+# ---------------------------------------------------------------------------
+
+
+def test_require_trace_suppresses_high_finding_with_no_trace():
+    """With ``require_trace`` ON, a HIGH finding with no trace is non-blocking."""
+    no_trace = _finding(finding_id="v_no_trace", code_flow_count=0)
+    gated = evaluate_vulnerability_gate_policy(
+        [no_trace], VulnerabilityGateThresholds(require_trace=True)
+    )
+    assert gated.passed is True
+    assert gated.blocking_count == 0
+
+
+def test_require_trace_keeps_high_finding_with_trace_blocking():
+    """``require_trace`` does NOT suppress a finding that HAS a data-flow trace."""
+    with_trace = _finding(finding_id="v_trace", code_flow_count=2)
+    gated = evaluate_vulnerability_gate_policy(
+        [with_trace], VulnerabilityGateThresholds(require_trace=True)
+    )
+    assert gated.passed is False
+    assert gated.blocking_count == 1
+
+
+def test_require_trace_off_keeps_no_trace_finding_blocking():
+    """Flag OFF (default): the no-trace HIGH finding still blocks."""
+    no_trace = _finding(finding_id="v_no_trace", code_flow_count=0)
+    default = evaluate_vulnerability_gate_policy([no_trace])
+    assert default.passed is False
+    assert default.blocking_count == 1
+
+
+# ---------------------------------------------------------------------------
+# Inline tier (gated): suppress_rules (rule-class via M1 normalizer)
+# ---------------------------------------------------------------------------
+
+
+def test_suppress_rules_suppresses_matching_class():
+    """A finding whose canonical class is suppressed is non-blocking when ON."""
+    finding = _finding(rule_id="py/sql-injection", cwe_ids=("CWE-89",))
+    gated = evaluate_vulnerability_gate_policy(
+        [finding],
+        VulnerabilityGateThresholds(suppress_rules=frozenset({"sql-injection"})),
+    )
+    assert gated.passed is True
+    assert gated.blocking_count == 0
+
+
+def test_suppress_rules_canonicalizes_across_tool_dialects():
+    """Both CodeQL- and Semgrep-style rule.ids fold onto the same class.
+
+    Suppressing ``sql-injection`` must catch BOTH ``py/sql-injection`` and
+    ``python.lang.security.audit.sql-injection`` because they normalize via the
+    shared M1 ``RuleClassNormalizer`` onto one class.
+    """
+    codeql = _finding(finding_id="v_ql", rule_id="py/sql-injection", cwe_ids=())
+    semgrep = _finding(
+        finding_id="v_sg",
+        rule_id="python.lang.security.audit.sql-injection",
+        cwe_ids=(),
+    )
+    policy = VulnerabilityGateThresholds(suppress_rules=frozenset({"sql-injection"}))
+    gated = evaluate_vulnerability_gate_policy([codeql, semgrep], policy)
+    assert gated.passed is True
+    assert gated.blocking_count == 0
+
+
+def test_suppress_rules_does_not_touch_other_classes():
+    """Suppressing one class does not suppress a different class."""
+    xss = _finding(finding_id="v_xss", rule_id="py/xss", cwe_ids=("CWE-79",))
+    policy = VulnerabilityGateThresholds(suppress_rules=frozenset({"sql-injection"}))
+    gated = evaluate_vulnerability_gate_policy([xss], policy)
+    assert gated.passed is False
+    assert gated.blocking_count == 1
+
+
+def test_suppress_rules_off_keeps_finding_blocking():
+    """Flag OFF (default empty set): nothing suppressed."""
+    finding = _finding(rule_id="py/sql-injection", cwe_ids=("CWE-89",))
+    default = evaluate_vulnerability_gate_policy([finding])
+    assert default.passed is False
+    assert default.blocking_count == 1
+
+
+# ---------------------------------------------------------------------------
+# Safe-code finding stays non-blocking; canary TP preserved
+# ---------------------------------------------------------------------------
+
+
+def test_safe_code_finding_stays_non_blocking_in_all_modes():
+    """A LOW/UNKNOWN 'safe-code' finding is non-blocking with or without flags."""
+    safe = _finding(finding_id="v_safe", severity="LOW", precision="UNKNOWN")
+    for policy in (
+        VulnerabilityGateThresholds(),
+        VulnerabilityGateThresholds(require_trace=True),
+        VulnerabilityGateThresholds(suppress_rules=frozenset({"sql-injection"})),
+    ):
+        result = evaluate_vulnerability_gate_policy([safe], policy)
+        assert result.passed is True
+        assert result.blocking_count == 0
+
+
+def test_canary_true_positive_never_suppressed_by_default_on():
+    """A core canary TP (HIGH/HIGH, has a trace) blocks under the default policy.
+
+    This is the suppression-rate regression assertion: a default-on change must
+    not raise the suppression rate of canary TPs. Since the default policy is
+    unchanged (no default-on suppression), the canary keeps blocking.
+    """
+    canary = _finding(finding_id="v_canary_tp", code_flow_count=3)
+    assert evaluate_vulnerability_gate_policy([canary]).blocking_count == 1
diff --git a/tests/test_vulnerability_synthetic_regression_gate.py b/tests/test_vulnerability_synthetic_regression_gate.py
new file mode 100644
index 0000000..ed36f66
--- /dev/null
+++ b/tests/test_vulnerability_synthetic_regression_gate.py
@@ -0,0 +1,127 @@
+"""M3 synthetic regression gate — ENFORCE (recall>=0.99 / precision>=0.90).
+
+This is the CI-enforced regression guard for the autonomous vuln-parity goal.
+``uv run pytest`` (CI job ``ci/pytest``) runs these, so the gate is enforced as a
+real test, not a report-only artifact:
+
+* the GREEN guard loads the committed 5-class synthetic corpus, runs the M2
+  normalization-aware evaluation, and asserts the default
+  :class:`VulnerabilityEvaluationThresholds` gate (recall>=0.99, precision>=0.90)
+  PASSES. A regression that drops a true positive or adds a false positive turns
+  this red.
+* the RED canary proves the enforcement is NOT vacuous: dropping one actual true
+  positive from the corpus makes the SAME gate FAIL (recall falls below 0.99), so
+  we know the gate would actually catch a real recall regression.
+
+Computation reuse: this exercises ONLY M2's
+:func:`evaluate_vulnerability_findings_normalized` +
+:func:`evaluate_vulnerability_gate` over the committed corpus. There is no new
+precision/recall code here.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+from security_scanner.core.vulnerability.codescan import RuleClassNormalizer
+from security_scanner.core.vulnerability.evaluation import (
+    VulnerabilityEvaluationThresholds,
+    evaluate_vulnerability_findings_normalized,
+    evaluate_vulnerability_gate,
+    load_vulnerability_corpus_normalized,
+)
+
+CORPUS = (
+    Path(__file__).resolve().parents[1]
+    / "eval"
+    / "synthetic-code-vuln"
+    / "corpus-snapshot.json"
+)
+
+
+def test_synthetic_regression_gate_enforces_recall_and_precision_slo():
+    """GREEN: the committed corpus passes the default enforce gate.
+
+    recall>=0.99 and precision>=0.90 over the normalization-aware path. This is
+    the regression guard CI enforces via ``uv run pytest``.
+    """
+    expected, actual = load_vulnerability_corpus_normalized(CORPUS)
+    result = evaluate_vulnerability_findings_normalized(
+        expected, actual, normalizer=RuleClassNormalizer()
+    )
+
+    gate = evaluate_vulnerability_gate(result, VulnerabilityEvaluationThresholds())
+
+    assert gate.passed, gate.reason
+    assert result.recall >= 0.99
+    assert result.precision >= 0.90
+    assert result.false_negative_count == 0
+
+
+def test_synthetic_regression_gate_is_not_vacuous_red_canary():
+    """RED canary: drop one actual TP -> the SAME gate FAILS (recall regression).
+
+    Proves the enforce gate above is real. If a future change silently dropped a
+    detector finding (or weakened the matcher), recall would fall below 0.99 and
+    the gate would block — exactly what this canary demonstrates by construction.
+    """
+    expected, actual = load_vulnerability_corpus_normalized(CORPUS)
+    assert len(actual) >= 1
+
+    # Simulate a regression: the last true-positive finding is no longer emitted.
+    regressed_actual = actual[:-1]
+
+    result = evaluate_vulnerability_findings_normalized(
+        expected, regressed_actual, normalizer=RuleClassNormalizer()
+    )
+
+    gate = evaluate_vulnerability_gate(result, VulnerabilityEvaluationThresholds())
+
+    assert gate.passed is False
+    assert result.recall < 0.99
+    assert result.false_negative_count >= 1
+
+
+def test_synthetic_regression_gate_catches_false_positive_precision_regression():
+    """RED canary (precision): an extra unmatched finding drops precision < 0.90.
+
+    Complements the recall canary: a single spurious finding that matches no
+    expected finding is a false positive, and with five expected TPs one extra FP
+    takes precision to 5/6 ~= 0.833 < 0.90, so the gate blocks.
+    """
+    from security_scanner.core.vulnerability.model import (
+        VulnerabilityFinding,
+        VulnerabilityLocation,
+        compute_vulnerability_finding_id,
+    )
+
+    expected, actual = load_vulnerability_corpus_normalized(CORPUS)
+
+    spurious = VulnerabilityFinding(
+        finding_id=compute_vulnerability_finding_id(
+            source_tool="semgrep",
+            rule_id="py/sql-injection",
+            partial_fingerprints=None,
+            file_path="synthetic_app/spurious.py",
+            line_start=999,
+            message="synthetic finding",
+        ),
+        rule_id="py/sql-injection",
+        message="synthetic finding",
+        primary_location=VulnerabilityLocation(
+            file_path="synthetic_app/spurious.py",
+            line_start=999,
+        ),
+        source_tool="semgrep",
+        cwe_ids=("CWE-89",),
+    )
+
+    result = evaluate_vulnerability_findings_normalized(
+        expected, [*actual, spurious], normalizer=RuleClassNormalizer()
+    )
+
+    gate = evaluate_vulnerability_gate(result, VulnerabilityEvaluationThresholds())
+
+    assert gate.passed is False
+    assert result.precision < 0.90
+    assert result.false_positive_count >= 1