feat(vuln): GHAS-grade vuln/SAST quality: code-scanning parity substrate + inline tier + report-only SLO gate (autonomous M1~M3) #59
…l/current.yml/CURRENT.md atomic sync
- Promote docs/workbench/specs/ghas-quality-vuln-subtrack/{requirements,design(v2),review}.md
- docs/workbench/agentic-workflows/2026-06-21-ghas-quality-vuln-parity-goal.md (execution packet)
- governance: atomically update goal_id + active_goal + CURRENT.md to ghas-quality-vuln-parity
(broad governance/** writes forbidden, vuln_parity_slo.py only; M3 durable→H-track; 16 canonical stop_conditions)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TwGs78e6Rb7P5BDe2ezQEh
… matcher + adversarial fixture
A pure-logic measurement substrate that represents GHAS code-scanning (CodeQL) alerts as redacted CodeScanAlertRecord objects and matches them against our VulnerabilityFinding. It transfers the proven structure of the secrets track's parity.py one-to-one into the vuln domain.
- CodeScanAlertRecord: redacted leaf value object (core/vulnerability/, zero store coupling).
- RuleClassNormalizer: three tiers, CWE bridge (by-cwe) > exact rule-token set match (by-rule-token) > unmatched. Partial overlap is forbidden (path-traversal != open-redirect).
- compare_codescan_alerts_with_findings: a true |alert-finding|<=N (N=2) line window, 1:1 greedy (_AlertSlot.consumed), state-aware denominators (recall counts only open+fixed; the precision penalty counts dismissed fp/used-in-tests separately; won't-fix is excluded as TP-non-blocking). precision/recall reuse core/vulnerability/evaluation.py (0 new computation lines): canonical keys are synthesized and then converge on evaluate_vulnerability_findings.
- load_codescan_snapshot: source: synthetic fail-closed (real snapshots are blocked).
- Adversarial fixture (eval/codescan-parity-corpus): CWE-absent rule-token-only, line drift, CodeQL<->Semgrep differing rule.id, and dismissed_reason cases. It goes red if normalization, the window, or the filter is missing.
- cwe_deficit_rate / rule_token_rescue_rate are exposed as metadata.
Zero network. 16 adversarial tests green, full suite with no regression.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TwGs78e6Rb7P5BDe2ezQEh
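The three-tier normalization and the line-window 1:1 greedy matching described in this commit can be sketched as follows. This is a minimal illustration, not the code in the PR: the `Item` class, the `_CWE_RE` pattern, the last-segment tokenization, and the first-fit pairing order are all assumptions made for the example.

```python
import re
from dataclasses import dataclass

_CWE_RE = re.compile(r"CWE-?(\d+)", re.IGNORECASE)  # assumed pattern


@dataclass(frozen=True)
class Item:
    """Hypothetical stand-in for both an alert and a finding."""
    path: str
    line: int
    rule_id: str
    cwes: tuple[str, ...] = ()


def _cwes(item: Item) -> set[str]:
    # Canonicalize "CWE-089" / "cwe-89" to "CWE-89".
    return {f"CWE-{int(m.group(1))}" for c in item.cwes if (m := _CWE_RE.search(c))}


def _tokens(rule_id: str) -> frozenset[str]:
    # Assumed tokenization: last dotted/slashed segment, split on non-alphanumerics.
    segment = re.split(r"[./]", rule_id)[-1].lower()
    return frozenset(t for t in re.split(r"[^a-z0-9]+", segment) if t)


def match_tier(alert: Item, finding: Item) -> str:
    if _cwes(alert) & _cwes(finding):
        return "by-cwe"
    a, f = _tokens(alert.rule_id), _tokens(finding.rule_id)
    if a and a == f:  # exact set equality only; partial overlap does not count
        return "by-rule-token"
    return "unmatched"


def greedy_match(alerts: list[Item], findings: list[Item], window: int = 2):
    """Pair each finding with the first unconsumed alert inside the line window."""
    consumed = [False] * len(alerts)
    pairs = []
    for j, finding in enumerate(findings):
        for i, alert in enumerate(alerts):
            if consumed[i] or alert.path != finding.path:
                continue
            if abs(alert.line - finding.line) > window:
                continue
            tier = match_tier(alert, finding)
            if tier != "unmatched":
                consumed[i] = True  # 1:1, an alert is never reused
                pairs.append((i, j, tier))
                break
    return pairs
```

In this sketch a finding two lines away from an alert with the same CWE pairs as `by-cwe`, while `py/path-injection` and `py/sql-injection` stay `unmatched` because they share only one token.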
design §K default-on/gated boundary: the byte-identical default output of the existing scan-vuln gate is the #1 invariant. The inline FP-suppression tier lives only in the gate layer (emission untouched), and all new suppression is opt-in/default-off.
- gate.py: adds two flags to VulnerabilityGateThresholds, require_trace (code_flow_count==0 means no reachability, so non-blocking) and suppress_rules (canonical vuln-class suppression), both default OFF. base_blocking (the existing severity x precision) is preserved; the new branches only remove entries from the blocking set. With defaults, the blocking set and the reason string are unchanged (pinned by a canary test). suppress_rules reuses the M1 RuleClassNormalizer (0 duplicate normalization).
- evaluation.py: adds a normalization-aware evaluation path (evaluate_vulnerability_findings_normalized, load_vulnerability_corpus_normalized, NormalizedExpectedFinding). It applies the same RuleClassNormalizer as M1 plus line-window=2 fuzzy 1:1 greedy matching, then reuses the existing evaluate_vulnerability_findings (0 new precision/recall lines). The existing exact-key Counter path stays byte-identical (VFR8 consistency is achieved by the new path sharing normalization; the naive keys are unchanged).
- eval/synthetic-code-vuln/corpus-snapshot.json: vulnerable and safe cases for five classes, SQLi/XSS/path-traversal/command-injection/SSRF (CWE-89/79/22/78/918), source: synthetic fail-closed, cross-dialect rule.id normalized matching. An intentional out-of-rule (CWE-502) recall<1 case is documented.
Existing scan-vuln/gate/evaluate/report defaults are unchanged. emission/store/governance are untouched. 22 new tests green, existing CLI canary green, full suite 1291 passed with no regression.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TwGs78e6Rb7P5BDe2ezQEh
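The "default-off, removal-only" behavior of the two gate flags can be illustrated with a small sketch. The `Finding` and `Thresholds` classes and the `blocking_keys` function below are hypothetical stand-ins; only the flag names `require_trace` and `suppress_rules` and the `code_flow_count` field come from the commit message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical, reduced finding record."""
    key: str
    vuln_class: str        # canonical class after normalization (assumed field)
    code_flow_count: int   # 0 means no reachability trace
    base_blocking: bool    # result of the pre-existing severity x precision rule


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical stand-in for VulnerabilityGateThresholds."""
    require_trace: bool = False                 # default OFF
    suppress_rules: frozenset[str] = frozenset()  # default OFF (empty)


def blocking_keys(findings: list[Finding], thresholds: Thresholds = Thresholds()) -> list[str]:
    # Start from the unchanged base decision, then only ever remove entries.
    blocking = [f for f in findings if f.base_blocking]
    if thresholds.require_trace:
        blocking = [f for f in blocking if f.code_flow_count > 0]
    if thresholds.suppress_rules:
        blocking = [f for f in blocking if f.vuln_class not in thresholds.suppress_rules]
    return [f.key for f in blocking]
```

Because both branches filter the base list, any opt-in result is a subset of the default result, which is what keeps the default output unchanged when neither flag is set.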
… done)
Endpoint of the autonomous goal. Enforces the synthetic regression gate and deterministically reproduces the governance/vuln_parity_slo.py report-only wiring with synthetic fixtures. A one-to-one transfer of the secrets track's governance/parity_slo.py (separate file).
- governance/vuln_parity_slo.py (the only new governance file in allowed_writes): measures macro parity against a frozen synthetic code-scanning snapshot. Threshold yml absent or empty -> report-only (always exit 0); present -> enforce (after the H-track baseline). Age > threshold -> stale-degraded (a warning in report-only, a hard failure in enforce; an unparseable fetched_at counts as stale). Per-repo precision/recall reuse the M1 matcher (.detection) (0 new computation lines); the aggregate only averages. load_codescan_snapshot is fail-closed and blocks real exports. Staleness is always exposed. The report is public-safe and aggregate-only. --check/--json/--max-age-days and so on are isomorphic to the secrets gate.
- codescan_parity.py: adds MacroCodeScanParityResult + aggregate_codescan_parity (parallel to the secrets aggregate_repo_parity; macro average plus sums of tier/dismissed-fp/cwe-deficit).
- Synthetic regression gate enforce: test_vulnerability_synthetic_regression_gate.py enforces recall>=0.99/precision>=0.90 PASS through ci/pytest, using normalized evaluation of the five-class corpus. Two red canaries (TP drop -> recall<0.99 FAIL, spurious FP -> precision<0.90 FAIL) prove the gate is non-vacuous.
- Exercises report surfacing of dismissed_fp_hit (precision penalty) and won't-fix (recall exclusion).
The threshold yml is not created (absence = report-only is the autonomous state). Reaching the real parity SLO comes after the H-track. Zero network. 19 new tests green, full suite 1310 passed with no regression.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TwGs78e6Rb7P5BDe2ezQEh
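The report-only / enforce / stale-degraded decision described above could look roughly like this. It is a sketch under assumptions: the function name `evaluate_slo`, the threshold keys `precision` and `recall`, the 30-day default, and the result dictionary are all invented for illustration, and an unparseable `fetched_at` is modeled as `None`.

```python
import datetime as dt
from typing import Optional


def evaluate_slo(
    thresholds: Optional[dict],
    precision: float,
    recall: float,
    fetched_at: Optional[dt.datetime],
    now: dt.datetime,
    max_age_days: int = 30,  # assumed default, not taken from the PR
) -> dict:
    # An unparseable timestamp (None here) is treated as stale.
    stale = fetched_at is None or (now - fetched_at) > dt.timedelta(days=max_age_days)
    if not thresholds:
        # Threshold file absent or empty: report only, never fail the build,
        # but still expose staleness in the result.
        return {"mode": "report-only", "stale": stale, "exit_code": 0}
    met = precision >= thresholds["precision"] and recall >= thresholds["recall"]
    # In enforce mode a stale snapshot is a hard failure.
    failed = stale or not met
    return {"mode": "enforce", "stale": stale, "exit_code": 1 if failed else 0}
```

The point of the design is that staleness is reported in both modes and only changes the exit code once thresholds exist.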
Code Review
This pull request implements the autonomous phase (M1–M3) of the GHAS code-scanning parity SLO goal (ghas-quality-vuln-parity). It introduces a redacted code-scanning alert domain model, a rule-class normalizer, a fuzzy line-window parity matcher, and an inline cheap FP-suppression tier, alongside synthetic adversarial corpus fixtures and comprehensive test suites. The review feedback focuses on enhancing robustness during JSON parsing and timestamp processing by adding defensive type validation to prevent potential TypeError or AttributeError when encountering unexpected input types.
    def extract_cwe_ids(values: Iterable[str]) -> tuple[str, ...]:
        """Normalize arbitrary CWE-bearing tokens into ``CWE-NNN`` ids (sorted, unique)."""
        found: set[str] = set()
        for value in values:
            match = _CWE_RE.search(str(value))
            if match:
                found.add(f"CWE-{int(match.group(1))}")
        return tuple(sorted(found))
The extract_cwe_ids function processes values which is parsed from JSON fields (e.g., item.get("cweIds", [])). If cweIds is explicitly null or a non-iterable type in the JSON, this function will raise a TypeError. We should defensively validate that values is a list, tuple, or set before iterating over it, and handle non-string elements gracefully. Additionally, ensure robustness by adding comprehensive unit tests that cover various unexpected types, including None, numbers, booleans, lists, and dictionaries.
    -def extract_cwe_ids(values: Iterable[str]) -> tuple[str, ...]:
    -    """Normalize arbitrary CWE-bearing tokens into ``CWE-NNN`` ids (sorted, unique)."""
    -    found: set[str] = set()
    -    for value in values:
    -        match = _CWE_RE.search(str(value))
    -        if match:
    -            found.add(f"CWE-{int(match.group(1))}")
    -    return tuple(sorted(found))
    +def extract_cwe_ids(values: Iterable[str] | None) -> tuple[str, ...]:
    +    """Normalize arbitrary CWE-bearing tokens into CWE-NNN ids (sorted, unique)."""
    +    if not isinstance(values, (list, tuple, set)):
    +        return ()
    +    found: set[str] = set()
    +    for value in values:
    +        if value is None:
    +            continue
    +        match = _CWE_RE.search(str(value))
    +        if match:
    +            found.add(f"CWE-{int(match.group(1))}")
    +    return tuple(sorted(found))
References
- When implementing defensive type validation for parsed JSON fields (e.g., verifying a field is a string), ensure robustness by adding comprehensive unit tests that cover various unexpected types, including None, numbers, booleans, lists, and dictionaries.
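The unit tests this comment asks for might be sketched as below. The function body follows the suggestion above; the `_CWE_RE` pattern is an assumption, since its definition is not shown in the review, and `Optional` is used in place of `| None` only to keep the sketch runnable on older Python versions.

```python
import re
from typing import Iterable, Optional

_CWE_RE = re.compile(r"CWE-?(\d+)", re.IGNORECASE)  # assumed; not shown in the PR


def extract_cwe_ids(values: Optional[Iterable[str]]) -> tuple[str, ...]:
    """Normalize arbitrary CWE-bearing tokens into CWE-NNN ids (sorted, unique)."""
    if not isinstance(values, (list, tuple, set)):
        return ()
    found: set[str] = set()
    for value in values:
        if value is None:
            continue
        match = _CWE_RE.search(str(value))
        if match:
            found.add(f"CWE-{int(match.group(1))}")
    return tuple(sorted(found))


# Non-container inputs, including a bare string, yield an empty tuple.
for bad in (None, 89, 8.9, True, "CWE-89", {"cweIds": ["CWE-89"]}):
    assert extract_cwe_ids(bad) == ()

# Non-string elements are skipped or coerced with str() without raising.
assert extract_cwe_ids([None, 79, True, "cwe-089", "CWE-79", "no id"]) == ("CWE-79", "CWE-89")

# str() coercion means ids nested inside a list or dict element are still found.
assert extract_cwe_ids([["CWE-22"], {"id": "CWE-78"}]) == ("CWE-22", "CWE-78")
```

One side effect of the `isinstance` check as suggested is that other iterables, such as generators or frozensets, are also rejected and return `()`.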
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    source = str(data.get("source", "")).strip().lower()
If the parsed JSON data is not a dictionary (e.g., a list or primitive), calling data.get will raise an AttributeError. We should defensively validate that data is a dictionary before calling .get(). Additionally, ensure robustness by adding comprehensive unit tests that cover various unexpected types, including None, numbers, booleans, lists, and dictionaries.
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("Parsed JSON is not a dictionary")
    source = str(data.get("source", "")).strip().lower()
References
- When implementing defensive type validation for parsed JSON fields (e.g., verifying a field is a string), ensure robustness by adding comprehensive unit tests that cover various unexpected types, including None, numbers, booleans, lists, and dictionaries.
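A self-contained way to exercise this check against unexpected top-level JSON types is sketched below. The names `load_synthetic_snapshot` and `outcome` are hypothetical, and the `source != "synthetic"` rejection is an assumed reading of the "source: synthetic fail-closed" rule mentioned in the commit messages.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def load_synthetic_snapshot(path) -> dict[str, Any]:
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("Parsed JSON is not a dictionary")
    source = str(data.get("source", "")).strip().lower()
    if source != "synthetic":
        # Fail closed: anything not explicitly marked synthetic is refused.
        raise ValueError("only 'source: synthetic' snapshots are accepted")
    return data


def outcome(payload: Any) -> str:
    """Write payload as JSON to a temp file and report how the loader reacts."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "snapshot.json"
        path.write_text(json.dumps(payload), encoding="utf-8")
        try:
            load_synthetic_snapshot(path)
        except ValueError:
            return "rejected"
    return "accepted"


# None, numbers, booleans, strings, and lists are rejected instead of raising AttributeError.
for payload in (None, 3, True, "synthetic", ["synthetic"], {}, {"source": "live"}):
    assert outcome(payload) == "rejected"
assert outcome({"source": " Synthetic "}) == "accepted"
```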
    def _parse_timestamp(value: str) -> dt.datetime | None:
        text = value.strip()
If value is not a string (e.g., an integer, boolean, or None from parsed JSON), calling value.strip() will raise an AttributeError. We should defensively validate that value is a string before processing it. Additionally, ensure robustness by adding comprehensive unit tests that cover various unexpected types, including None, numbers, booleans, lists, and dictionaries.
    -def _parse_timestamp(value: str) -> dt.datetime | None:
    -    text = value.strip()
    +def _parse_timestamp(value: Any) -> dt.datetime | None:
    +    if not isinstance(value, str):
    +        return None
    +    text = value.strip()
References
- When implementing defensive type validation for parsed JSON fields (e.g., verifying a field is a string), ensure robustness by adding comprehensive unit tests that cover various unexpected types, including None, numbers, booleans, lists, and dictionaries.
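A complete version of such a parser, with the type guard in place, might look like the following. Everything after `text = value.strip()` is an assumption, because the rest of `_parse_timestamp` is not shown in the review: the sketch assumes ISO-8601 input, maps a trailing `Z` to UTC, and treats naive timestamps as UTC.

```python
import datetime as dt
from typing import Any, Optional


def parse_timestamp(value: Any) -> Optional[dt.datetime]:
    if not isinstance(value, str):
        return None  # None, numbers, booleans, lists, dicts
    text = value.strip()
    if not text:
        return None
    if text.endswith("Z"):
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        text = text[:-1] + "+00:00"
    try:
        parsed = dt.datetime.fromisoformat(text)
    except ValueError:
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=dt.timezone.utc)  # assumed: naive means UTC
    return parsed


for bad in (None, 0, 1.5, True, [], {}, "", "   ", "not-a-date"):
    assert parse_timestamp(bad) is None
```

Returning `None` for every unparseable input fits the rule stated in the M3 commit that an unparseable `fetched_at` counts as stale.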
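Summary
Completes the autonomous layer M1~M3 of the ghas-quality-vuln-parity long-single-goal with TDD red-green. Builds a measurement harness plus FP-suppression quality machinery that aligns vuln/SAST detection with the GHAS code-scanning (CodeQL) parity SLO, using synthetic fixtures only. Transfers the proven two-layer structure of the secrets subtrack (PR #58) one-to-one, but splits durable disposition out into the H-track (H4) because of a vuln-specific storage-projection constraint.
Autonomous goal done = M3. No contact with real GHAS; committed artifacts are synthetic and redacted only.
Milestones (all passed the architecture gate: pre/post-M2/post-M3/final, 0 blocking)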
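- M1 (305de47): CodeScanAlertRecord (redacted leaf, zero store coupling) + compare_codescan_alerts_with_findings matcher. Three CWE tiers (by-cwe > by-rule-token > unmatched), a true |alert−finding|≤N (N=2) line window, 1:1 greedy, state-aware denominators (recall counts only open+fixed; the precision penalty counts dismissed fp/used-in-tests separately; won't-fix is excluded as TP-non-blocking). precision/recall reuse core/vulnerability/evaluation.py (0 new computation lines): canonical keys are synthesized and then converge on evaluate_vulnerability_findings. The adversarial fixtures (CWE-absent / line drift / CodeQL↔Semgrep differing rule.id / dismissed) go red if normalization, the window, or the filter is missing.
- M2 (c1716e8): inline cheap tier (gate.py: two flags, require_trace and suppress_rules, gated default-off; base_blocking preserved; removal from the blocking set only). The default output of the existing scan-vuln/gate stays byte-identical (pinned by a canary test, emission untouched). Five-class synthetic corpus (SQLi/XSS/path-traversal/command-injection/SSRF, CWE-89/79/22/78/918). Adds a normalization-aware evaluation path (the existing exact-key path is unchanged; sharing the same normalization + line window as M1 gives VFR8 consistency).
- M3 (75550d0, autonomous done): synthetic regression gate enforce (recall≥0.99/precision≥0.90, enforced through ci/pytest, with two red canaries proving it is non-vacuous) + governance/vuln_parity_slo.py (a one-to-one transfer of the secrets parity_slo.py, separate file). Threshold yml absent → report-only (always exit 0), present → enforce (after the H-track), age > threshold → stale-degraded (silent staleness is forbidden). Deterministically reproduced against a frozen synthetic snapshot; exercises dismissed_fp_hit/won't-fix surfacing.
Governance/slot handling (pattern proven on the secrets track)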
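- Commit 0d779f9: goal_id + active_goal + CURRENT.md sync + spec promotion + goal packet. This is the activation baseline, not an autonomous change.
- --base 0d779f9 evaluates only the M1~M3 changes against allowed_writes (green).
- In .github/workflows/ci.yml, ci/autopilot-gate exempts claude/* (skip→Success).
- governance/** is not broadly modified; the only governance entry in allowed_writes is governance/vuln_parity_slo.py.
H-track (outside the autonomous loop, stop-condition follow-ups; out of scope for this PR)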
- ghas-live-fetch-or-mutation-required: human-PR.
- storage-projection-or-schema-migration-required (VulnerabilityFinding is not loaded into the durable store; set_finding_disposition raises ValueError when FINDING_STATE is absent). The v1 autonomous scope keeps the existing throwaway JSONL behavior.
Verification
- uv run pytest: 1310 passed, 4 skipped (existing live-run guard, unrelated). No regression (baseline 1269 → 1310, +41).
- vuln_parity_slo --check: exit 0 (report-only).
- autopilot_gate --base 0d779f9: exit 0.
- public_safety --diff origin/main...HEAD and --path docs/workbench/specs/ghas-quality-vuln-subtrack: exit 0.
- render --validate/--check, rebuild_ledger_index --check, render_github_ruleset --check: all pass.
- source: synthetic fail-closed.
🤖 Generated with Claude Code
https://claude.ai/code/session_01TwGs78e6Rb7P5BDe2ezQEh