source-security-dev · pureliture · Jun 21, 2026
diff --git a/CURRENT.md b/CURRENT.md
@@ -4,7 +4,7 @@
 
 - Project: `security-scanner`
 - Merge mode: `guarded-auto-merge`
-- Active goal: `personal-prod-deploy`
+- Active goal: `ghas-quality-vuln-parity`
 - Last auto merge: `ledger:20260617T003405Z-autopilot-3236f4`
 - Ledger entries: `4`
 - Ledger index hash: `sha256:e1893a649a1101b74a087b5eaaa275813a85708c5bb46c4ae70c24e10a111050`

diff --git a/docs/workbench/agentic-workflows/2026-06-21-ghas-quality-vuln-parity-goal.md b/docs/workbench/agentic-workflows/2026-06-21-ghas-quality-vuln-parity-goal.md
@@ -0,0 +1,157 @@
+# Agentic Workflow: GHAS-grade vuln/SAST detection quality (CodeQL parity SLO)
+
+**Status:** Ready for long single-goal execution
+**Date:** 2026-06-21
+**Goal ID:** `ghas-quality-vuln-parity`
+**Spec:** `docs/workbench/specs/ghas-quality-vuln-subtrack/{requirements,design,review}.md`
+**Merge flow:** pull request
+
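+Execution packet for a long-running single goal: build a measurement harness and an FP-suppression quality machine that
+align vuln/SAST detection with a **GHAS code-scanning (CodeQL) parity SLO**. The proven two-layer structure of the secret
+subtrack (PR #58) carries over 1:1, except that **for vuln, durable disposition moves out of the autonomous layer into the
+H-track** (VulnerabilityFinding is not loaded into the durable store, and `set_finding_disposition` raises ValueError when
+FINDING_STATE is absent → storage-projection stop-condition). Real code-scanning live-fetch is a stop-condition; commits are synthetic-or-redacted-only.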
+
+## Goal
+
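+Complete, test-first against synthetic fixtures, the per-repo 1:1 CodeQL parity measurement harness for vuln/SAST, the inline
+FP-suppression tier, synthetic regression gate enforcement, and the report-only parity gate wiring, then close it through PR/CI/merge.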
+
+**Completion criteria (autonomous goal done = M3):**
+
+- **M1**: code-scanning domain model `CodeScanAlertRecord` (redacted) + matcher
+  `compare_codescan_alerts_with_findings` (three CWE-intersection tiers: matched-by-cwe/by-rule-token/unmatched) +
+  adversarial fixtures. **Precision/recall reuse `core/vulnerability/evaluation.py` (zero lines of new computation code)**,
+  reached only through a `CodeScanAlertRecord→VulnerabilityEvaluationKey` adapter. The line-window is a true `|alert−finding|≤N`;
+  the recall denominator counts only open+fixed alerts, and the precision penalty tracks dismissed alerts separately. Zero network access.
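+- **M2**: inline cheap tier (scan-vuln post-processing: code_flow_count, severity floor, low-confidence rule suppression). Only the
+  parts that are deterministic, metadata-only, and guaranteed by suppression-rate regression are default-on; new suppression that
+  changes behavior is gated. Extend the synthetic corpus to five classes (SQLi/XSS/path-traversal/command-injection/SSRF) and apply
+  rule-class normalization. Existing scan-vuln default output stays unchanged (canary TPs preserved).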
+- **M3 (autonomous done)**: enforce the synthetic regression gate (evaluate precision≥0.90/recall≥0.99) + wire the report-only
+  parity gate `governance.vuln_parity_slo --check` (threshold yml absent→report-only, compared against the frozen synthetic
+  snapshot, age>threshold→stale-degraded). Prove deterministic reproducibility without a real snapshot.
+- Existing Gitleaks-first secret path and existing vuln scan/import/report/gate default paths stay unchanged.
+- No GHAS trigger/upload/alert mutation/**live-fetch**. Architecture review (pre/post-M2/post-M3/final)
+  has 0 blocking findings. PR CI + local governance gate pass.
+
+**H-track (outside the autonomous loop, stop-condition PRs):** H1 acquire a real code-scanning snapshot → H2 baseline +
+fixture-vs-real divergence → H3 finalize targets + parity enforce → **H4 wire vuln verdict durable disposition (storage
+projection)**.
+
+## Execution Contract
+
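+- Run M1~M3 to the end as a single long goal. No intermediate approvals. Human intervention only on a stop-condition.
+- Use subagents actively (implementation worker gpt-5.5/high; helpers per repo policy). Open a PR and continue until it is mergeable after CI passes.
+- Never commit real endpoints/hosts/credentials/private paths/real SARIF/real code-scanning exports/real findings.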
+
+## Fixed Decisions
+
+- Scope: vuln autonomous M1~M3 (synthetic-only). Real fetch, baseline, enforce, and durable disposition are H-track.
+- Measurement: CodeQL code-scanning alert oracle, per-repo 1:1, snapshot=ground-truth (frozen synthetic). Computation reuses
+  `core/vulnerability/evaluation.py` (creating a fourth engine is forbidden). Synthetic evaluate and the parity matcher share one computation core.
+- Matching: rule-class normalization + line-window apply with identical semantics to both the synthetic gate and parity (consistent with VFR8).
+- Inline tier: deterministic, metadata-only parts are default-on; behavior-changing parts are gated. No validity-check analog.
+- **Durable disposition forbidden (autonomous)**: in v1 autonomous, vuln verdicts stay in the existing throwaway JSONL. Durable
+  persistence needs a storage projection → `storage-projection-or-schema-migration-required` stop → H4.
+- Snapshot: commit only synthetic redacted fixtures (`source: synthetic` marker required; fail-closed if missing). Real
+  snapshots are blocked twice: `.gitignore` + exclusion from allowed_writes.
+- **No autonomous edits to core governance**: allowed_writes covers only `governance/vuln_parity_slo.py` (a separate file from
+  the secret subtrack's `governance/parity_slo.py`). If `autopilot_goal.yml`, `autopilot_gate.py`, or `public_safety.py`
+  needs a change, stop (scope-expansion) → human PR.
+- Slot: autonomous code merges without the active_goal slot (at merge, the three governance files take main (theirs)). The actual slot switch is the user's decision.
+
+## Required Architecture Review Gate
+
+Mandatory blocking. pre-implementation / post-M2 / post-M3 / final. Halt only when an SoT change, scope expansion, unsafe
+data, or a change to existing defaults is required; fix everything else in-goal.
+
+## Multi-agent Execution Model
+
+Subagents with disjoint responsibilities (matcher/model Worker A, inline tier Worker B, synthetic gate + parity_slo Worker C,
+architecture/security reviewer read-only, code_simplifier). The main agent integrates and makes the final call.
+
+## Allowed Write Surface
+
+`allowed_writes` in `governance/autopilot_goal.yml` is authoritative. Summary: the promoted spec, this workflow document,
+src/tests/eval/examples, `governance/vuln_parity_slo.py` (new gate only), ledger, CURRENT.md. **This is not a blanket
+`governance/**` grant**; any other governance change is a scope-expansion stop.
+
+## Suggested Work Plan
+
+### Readiness (M0 = goal-setup, already performed by the orchestrator)
+goal-setup (spec promotion + autopilot_goal.yml goal_id + current.yml active_goal + CURRENT.md in one atomic commit) has been
+completed by the orchestrator. You start from the pre-implementation architecture review.
+
+### M1 measurement substrate
+1. red-first: matcher CWE/rule-token/line-window/dismissed scoring; on adversarial fixtures (missing CWE/line drift/
+   different CodeQL↔Semgrep rule.id/dismissed), a missing normalization, window, or filter goes red; precision/recall is
+   produced by `core/vulnerability/evaluation.py`; denominators are state-aware.
+2. Implementation: CodeScanAlertRecord, adapter, matcher (zero lines of new precision/recall computation). Pin line-window N with a fixture.
+
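+### M2 inline tier + synthetic hardening
+1. red-first: FP suppression on safe code + recall retained on vulnerable code (evaluate gate), default-on does not break
+   recall≥0.99, existing default output unchanged, independent adversarial-pair regression.
+2. Implementation: inline gating (default-on/gated boundary), five-class synthetic corpus + rule-class normalization. post-M2 review.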
+
+### M3 synthetic gate + parity_slo (autonomous done)
+1. red-first: synthetic regression gate enforce; `governance/vuln_parity_slo.py` report-only (threshold absent),
+   comparison against the frozen synthetic snapshot, stale-degraded.
+2. Implementation: vuln_parity_slo.py. final review → PR. Record "parity SLO enforce not yet achieved, awaiting H-track" in CURRENT.md.
+
+## Required Local Checks
+
+```bash
+uv run pytest
+uv run python -m governance.render --validate
+uv run python -m governance.render --check
+uv run python -m governance.rebuild_ledger_index --check
+uv run python -m governance.render_github_ruleset --output governance/main_ruleset.json --check
+uv run python -m governance.public_safety --diff origin/main...HEAD
+uv run python -m governance.public_safety --path docs/workbench/specs/ghas-quality-vuln-subtrack
+uv run python -m governance.vuln_parity_slo --check
+uv run python -m governance.autopilot_gate --base origin/main
+```
+
+## Stop Conditions
+
+`stop_conditions` in `governance/autopilot_goal.yml` (canonical list of 16). Key ones: ghas-live-fetch-or-mutation-required
+(H1 real fetch), **storage-projection-or-schema-migration-required** (durable disposition, durable snapshot →
+H4), existing-secret-default-behavior-change, architecture-review-blocking-finding, public-safety-hit,
+scope-expansion (editing core governance files), same-blocker-three-times, break-glass.
+
+## Resume Prompt
+
+```text
+Goal: complete `ghas-quality-vuln-parity` in the security-scanner repo through a PR.
+
+Read first:
+- AGENTS.md
+- governance/autopilot_goal.yml
+- docs/workbench/agentic-workflows/2026-06-21-ghas-quality-vuln-parity-goal.md
+- docs/workbench/specs/ghas-quality-vuln-subtrack/{requirements,design,review}.md
+- src/security_scanner/core/vulnerability/{evaluation,model}.py
+- src/security_scanner/baseline/ghas_api/__init__.py
+- src/security_scanner/runtime/vulnerability_verify_artifact.py
+- src/security_scanner/cli/commands (import-sarif/scan-vuln/report/gate/evaluate)
+
+Implement M1~M3 (autonomous, synthetic fixtures only, no real GHAS/code-scanning):
+M1 CodeScanAlertRecord + compare_codescan_alerts_with_findings matcher (CWE 3-tier) + adversarial
+   fixtures. Reuse core/vulnerability/evaluation.py (zero new precision/recall code). True |line|<=N
+   window, state-aware denominators.
+M2 inline cheap tier (metadata-only default-on / gated for behavior change), synthetic corpus 5 CWE
+   classes + rule-class normalization. Existing scan-vuln default output unchanged.
+M3 synthetic regression gate enforce + report-only parity gate governance.vuln_parity_slo --check.
+
+Do NOT: durable-persist vuln verdict (storage projection -> H4 human-gated), call/fetch GHAS
+code-scanning, commit real SARIF/findings, modify governance/autopilot_goal.yml | autopilot_gate.py |
+public_safety.py (allowed_writes = governance/vuln_parity_slo.py only), change existing secret/vuln
+scan defaults. Real snapshot fetch, baseline, enforce, durable disposition are human-gated H1~H4.
+Use multi-agent. Mandatory architecture gates: pre-implementation, post-M2, post-M3, final. Finish
+by opening a PR, waiting for CI, merge when green. Autonomous done = M3; record "parity SLO enforce
+pending H-track" in CURRENT.md.
+
+Required checks: pytest; render --validate/--check; rebuild_ledger_index --check;
+render_github_ruleset --check; public_safety --diff and
+--path docs/workbench/specs/ghas-quality-vuln-subtrack; vuln_parity_slo --check; autopilot_gate --base origin/main.
+```
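
The M1 matching rule in the workflow document above (CWE intersection first, rule-token fallback, a true `|alert − finding| ≤ N` line window, and state-aware denominators) can be sketched roughly as follows. This is an illustration only, not the repository's code: `Alert`, `Finding`, `match_tier`, `best_tier`, and `recall_denominator` are hypothetical names, the field layout is assumed, and the real `CodeScanAlertRecord` and `compare_codescan_alerts_with_findings` may look quite different. It deliberately computes no precision or recall, since the document assigns that to `core/vulnerability/evaluation.py`.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Alert:
    """Hypothetical stand-in for a redacted code-scanning alert record."""
    path: str
    line: int
    cwes: FrozenSet[str]
    rule_token: str  # assumed: rule id already normalized to a rule class
    state: str       # "open", "fixed", or "dismissed"


@dataclass(frozen=True)
class Finding:
    """Hypothetical stand-in for a scanner finding."""
    path: str
    line: int
    cwes: FrozenSet[str]
    rule_token: str


def match_tier(alert: Alert, finding: Finding, window: int) -> str:
    """Classify one alert/finding pair into one of the three tiers."""
    # True line window: absolute distance, not a one-sided offset.
    if alert.path != finding.path or abs(alert.line - finding.line) > window:
        return "unmatched"
    if alert.cwes & finding.cwes:
        return "matched-by-cwe"
    # Fallback for alerts or findings that carry no CWE tag.
    if alert.rule_token and alert.rule_token == finding.rule_token:
        return "matched-by-rule-token"
    return "unmatched"


def best_tier(alert: Alert, findings: List[Finding], window: int) -> str:
    """Return the strongest tier any finding reaches for this alert."""
    tiers = {match_tier(alert, f, window) for f in findings}
    for tier in ("matched-by-cwe", "matched-by-rule-token"):
        if tier in tiers:
            return tier
    return "unmatched"


def recall_denominator(alerts: List[Alert]) -> List[Alert]:
    """Only open and fixed alerts count toward recall; dismissed are excluded."""
    return [a for a in alerts if a.state in ("open", "fixed")]


# Synthetic sample data, invented for this sketch.
alerts = [
    Alert("app/db.py", 40, frozenset({"CWE-89"}), "sqli", "open"),
    Alert("app/view.py", 12, frozenset(), "xss", "fixed"),
    Alert("app/io.py", 7, frozenset({"CWE-22"}), "path-traversal", "dismissed"),
]
findings = [
    Finding("app/db.py", 42, frozenset({"CWE-89"}), "sql-injection"),
    Finding("app/view.py", 12, frozenset(), "xss"),
]

for alert in alerts:
    print(alert.path, alert.state, best_tier(alert, findings, window=3))
```

With `window=3` the first alert matches by CWE despite a two-line drift and a different rule token, the second falls back to the rule token because neither side has a CWE, and the dismissed alert stays out of the recall denominator. Shrinking the window to 1 turns the first pair into `unmatched`, which is the kind of line-drift case the adversarial fixtures are meant to pin down.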