feat(vuln): GHAS-grade vuln/SAST quality: code-scanning parity substrate + inline tier + report-only SLO gate (autonomous M1~M3) #59

Merged
pureliture merged 4 commits into main from claude/ghas-quality-vuln
Jun 21, 2026

@pureliture

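Summary

Completes the autonomous layer (M1~M3) of the ghas-quality-vuln-parity long-single-goal using TDD red-green. Builds, from synthetic fixtures only, a measurement harness that aligns vuln/SAST detection with the GHAS code-scanning (CodeQL) parity SLO, plus an FP-suppression quality machine. Carries over the proven two-layer structure of the secrets subtrack (PR #58) 1:1, but splits durable disposition out to the H-track (H4) because of a vuln-specific storage-projection constraint.

Autonomous goal done = M3. No contact with real GHAS; committed artifacts are synthetic and redacted only.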

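Milestones (all passed the architecture gate: pre/post-M2/post-M3/final, 0 blocking)

  • M1 (305de47): CodeScanAlertRecord (redacted leaf, zero store coupling) + the compare_codescan_alerts_with_findings matcher. Three CWE tiers (by-cwe > by-rule-token > unmatched), a true |alert−finding| ≤ N (N=2) line window, 1:1 greedy matching, and state-aware denominators (recall counts open+fixed only; the precision penalty counts dismissed fp/used-in-tests separately; won't-fix is excluded as TP-non-blocking). Precision/recall reuse core/vulnerability/evaluation.py (0 lines of new computation): canonical keys are synthesized and then funneled into evaluate_vulnerability_findings. The adversarial fixtures (missing CWE / line drift / differing CodeQL↔Semgrep rule.id / dismissed) go red if normalization, the window, or the filter is missing.
  • M2 (c1716e8): Inline cheap tier (gate.py: two flags, require_trace and suppress_rules, gated default-off; base_blocking preserved; the new branches only remove entries from the blocking set). The default output of the existing scan-vuln/gate stays byte-identical (pinned by canary tests; emission untouched). Five synthetic corpus classes (SQLi/XSS/path-traversal/command-injection/SSRF, CWE-89/79/22/78/918). Adds a new normalization-aware evaluation path (the existing exact-key path is unchanged; it shares the same normalization + line window as M1, which satisfies VFR8).
  • M3 (75550d0, autonomous done): Enforces the synthetic regression gate (recall≥0.99/precision≥0.90, enforced via ci/pytest, with two red canaries proving the gate is non-vacuous) + governance/vuln_parity_slo.py (a 1:1 carry-over of the secrets parity_slo.py, in a separate file). Threshold yml absent → report-only (always exit 0); present → enforce (after the H-track); age > limit → stale-degraded (silent staleness is forbidden). Deterministically reproducible against a frozen synthetic snapshot, and exercises dismissed_fp_hit/won't-fix surfacing.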
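
The state-aware denominators described for M1 can be illustrated with a small sketch. It is not the PR's implementation: the `Alert` class, the `summarize` helper, and the state and reason labels are assumptions chosen to resemble GitHub code-scanning alert fields.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set

# Assumed labels, modeled on code-scanning alert states and dismissal reasons.
RECALL_STATES = {"open", "fixed"}
PENALTY_REASONS = {"false positive", "used in tests"}


@dataclass(frozen=True)
class Alert:  # hypothetical stand-in for CodeScanAlertRecord
    alert_id: str
    state: str
    dismissed_reason: Optional[str] = None


def summarize(alerts: Iterable[Alert], matched_ids: Set[str]) -> dict:
    """Score only the pools that should count toward each metric."""
    alerts = list(alerts)
    # Recall denominator: alerts that were (or still are) real issues.
    recall_pool = [a for a in alerts if a.state in RECALL_STATES]
    # Precision penalty pool: alerts a human dismissed as noise.
    penalty_pool = [
        a for a in alerts
        if a.state == "dismissed" and a.dismissed_reason in PENALTY_REASONS
    ]
    hits = sum(a.alert_id in matched_ids for a in recall_pool)
    return {
        "recall": hits / len(recall_pool) if recall_pool else None,
        "dismissed_fp_hit": sum(a.alert_id in matched_ids for a in penalty_pool),
        # Everything else (for example won't-fix) neither helps nor hurts.
        "excluded": len(alerts) - len(recall_pool) - len(penalty_pool),
    }
```

In this sketch a finding that matches a won't-fix alert changes neither recall nor the penalty count, which is one reading of "excluded as TP-non-blocking".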

Governance/slot handling (pattern proven on the secrets track)

  • goal-setup bundled (0d779f9): goal_id + active_goal + CURRENT.md sync + spec promotion + goal packet. This is the activation baseline, not an autonomous change.
  • autopilot-gate base = activation: the acceptance check uses --base 0d779f9 so that only the M1~M3 changes are evaluated against allowed_writes (green).
  • claude/* CI exempt: in .github/workflows/ci.yml, ci/autopilot-gate exempts claude/* (skip→Success).
  • active_goal slot kept (personal-prod): the autonomous code merges without the active_goal slot. The three governance files (autopilot_goal.yml, current.yml, CURRENT.md) are unchanged; switching actual slot occupancy is the user's decision. No broad governance/** modification; the only governance entry in allowed_writes is governance/vuln_parity_slo.py.

H-track (outside the autonomous loop, stop-condition follow-ups; out of scope for this PR)

  • H1 Acquire a real code-scanning (CodeQL) snapshot (ghas-live-fetch-or-mutation-required, human-PR).
  • H2 Measure the parity baseline + fixture-vs-real divergence + finalize targets (measure-first).
  • H3 Switch parity to enforce (commit the threshold yml).
  • H4 Wire up durable disposition for vuln verdicts (storage-projection-or-schema-migration-required: VulnerabilityFinding is not loaded into the durable store, and set_finding_disposition raises ValueError when FINDING_STATE is absent). The v1 autonomous scope keeps the existing throwaway JSONL behavior.

Parity SLO enforcement is not yet achieved and is waiting on the H-track. Reaching the real parity SLO (measure-first v1 done) comes after H1~H4 are complete. CURRENT.md is a generated file (gated by render --check) and current.yml is outside allowed_writes, so this note is recorded in the PR body.

Verification

  • uv run pytest: 1310 passed, 4 skipped (existing live-run guards, unrelated). No regressions (baseline 1269 → 1310, +41).
  • vuln_parity_slo --check: exit 0 (report-only). autopilot_gate --base 0d779f9: exit 0. public_safety --diff origin/main...HEAD --path docs/workbench/specs/ghas-quality-vuln-subtrack: exit 0. render --validate/--check, rebuild_ledger_index --check, render_github_ruleset --check: all pass.
  • Zero network access; committed artifacts are synthetic-or-redacted-only (source: synthetic, fail-closed).

🤖 Generated with Claude Code

https://claude.ai/code/session_01TwGs78e6Rb7P5BDe2ezQEh

pureliture and others added 4 commits June 21, 2026 17:59
…l/current.yml/CURRENT.md atomic sync

- Promote docs/workbench/specs/ghas-quality-vuln-subtrack/{requirements,design(v2),review}.md
- docs/workbench/agentic-workflows/2026-06-21-ghas-quality-vuln-parity-goal.md (execution packet)
- governance: atomically update goal_id + active_goal + CURRENT.md to ghas-quality-vuln-parity
  (broad governance/** writes forbidden, vuln_parity_slo.py only; M3 durable → H-track; 16 canonical stop_conditions)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TwGs78e6Rb7P5BDe2ezQEh
… matcher + adversarial fixtures

A pure-logic measurement substrate that represents GHAS code-scanning (CodeQL) alerts as
redacted CodeScanAlertRecord values and matches them against our VulnerabilityFinding.
Carries the proven structure of the secrets track's parity.py over to the vuln domain 1:1.

- CodeScanAlertRecord: redacted leaf value object (core/vulnerability/, zero store coupling).
- RuleClassNormalizer: three tiers, CWE bridge (by-cwe) > exact rule-token set match (by-rule-token)
  > unmatched. Partial overlap is not allowed (path-traversal != open-redirect).
- compare_codescan_alerts_with_findings: a true |alert-finding|<=N (N=2) line window,
  1:1 greedy (_AlertSlot.consumed), state-aware denominators (recall counts open+fixed only;
  the precision penalty counts dismissed fp/used-in-tests separately; won't-fix is excluded as
  TP-non-blocking). Precision/recall reuse core/vulnerability/evaluation.py (0 lines of new
  computation): canonical keys are synthesized, then funneled into evaluate_vulnerability_findings.
- load_codescan_snapshot: source: synthetic, fail-closed (real snapshots are blocked).
- Adversarial fixtures (eval/codescan-parity-corpus): CWE-less rule-token-only, line drift,
  differing CodeQL<->Semgrep rule.id, and dismissed_reason cases. Red if normalization, the window, or the filter is missing.
- Exposes cwe_deficit_rate / rule_token_rescue_rate as metadata.

Zero network access. 16 adversarial tests green, full suite with no regressions.
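
A minimal sketch of how the three tiers and the line-window 1:1 greedy matching described in this commit message could fit together. The `Item` class, the `rule_tokens` tokenizer, its stop list of dialect prefixes, and the `_CWE`-free tier logic are all assumptions made for illustration; the real RuleClassNormalizer and _AlertSlot are not shown in this PR page.

```python
import re
from dataclasses import dataclass

LINE_WINDOW = 2  # N=2, as stated in the commit message
# Hypothetical dialect prefixes to drop before comparing rule tokens.
_STOP = {"py", "python", "js", "java", "semgrep", "codeql", "lang", "security"}


@dataclass(frozen=True)
class Item:  # stands in for both an alert and a finding
    path: str
    line: int
    rule_id: str
    cwe_ids: frozenset = frozenset()


def rule_tokens(rule_id: str) -> frozenset:
    parts = re.split(r"[^a-z0-9]+", rule_id.lower())
    return frozenset(p for p in parts if p and p not in _STOP)


def match_tier(alert: Item, finding: Item) -> str:
    if alert.cwe_ids & finding.cwe_ids:
        return "by-cwe"
    tokens = rule_tokens(alert.rule_id)
    # Exact set equality only: partial overlap does not count.
    if tokens and tokens == rule_tokens(finding.rule_id):
        return "by-rule-token"
    return "unmatched"


def greedy_match(alerts, findings):
    """Return (alert_index, finding_index, tier); each alert is used once."""
    consumed = [False] * len(alerts)
    pairs = []
    for fi, finding in enumerate(findings):
        best = None
        for ai, alert in enumerate(alerts):
            distance = abs(alert.line - finding.line)
            if consumed[ai] or alert.path != finding.path or distance > LINE_WINDOW:
                continue
            tier = match_tier(alert, finding)
            if tier == "unmatched":
                continue
            rank = (0 if tier == "by-cwe" else 1, distance)
            if best is None or rank < best[0]:
                best = (rank, ai, tier)
        if best is not None:
            consumed[best[1]] = True
            pairs.append((best[1], fi, best[2]))
    return pairs
```

Preferring the by-cwe tier and then the smallest line distance is one possible tie-break; the source only states that matching is greedy and 1:1.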

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TwGs78e6Rb7P5BDe2ezQEh
design §K default-on/gated boundary: keeping the default output of the existing scan-vuln gate
byte-identical is the #1 invariant. The inline FP-suppression tier lives only in the gate layer
(emission untouched), and every new suppression is opt-in/default-off.

- gate.py: adds two flags to VulnerabilityGateThresholds, require_trace (code_flow_count==0 means
  no reachability, so non-blocking) and suppress_rules (suppression by canonical vuln class), both default OFF.
  base_blocking (the existing severity x precision) is preserved; the new branches only remove entries
  from the blocking set. With defaults, the blocking set and reason string are unchanged (pinned by canary tests).
  suppress_rules reuses the M1 RuleClassNormalizer (no duplicated normalization).
- evaluation.py: adds a new normalization-aware evaluation path (evaluate_vulnerability_findings_normalized,
  load_vulnerability_corpus_normalized, NormalizedExpectedFinding). It applies the same
  RuleClassNormalizer + line-window=2 fuzzy 1:1 greedy matching as M1 and then reuses the existing
  evaluate_vulnerability_findings (0 new lines of precision/recall). The existing exact-key Counter path
  stays byte-identical (VFR8 conformance is achieved by the new path sharing the normalization; the naive key is unchanged).
- eval/synthetic-code-vuln/corpus-snapshot.json: vulnerable + safe cases for five classes,
  SQLi/XSS/path-traversal/command-injection/SSRF (CWE-89/79/22/78/918); source: synthetic, fail-closed;
  cross-dialect rule.id normalized matching. Documents an out-of-rule (CWE-502) case where recall<1 is intended.

Existing scan-vuln/gate/evaluate/report defaults unchanged. Emission/store/governance untouched.
22 new tests green, existing CLI canary green, full suite 1291 passed with no regressions.
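
The "removal only, default off" property of the two gate flags can be sketched as follows. This is an illustration under assumptions: `Finding`, `GateThresholds`, and `blocking_set` are hypothetical names, and the base rule here uses severity alone, whereas the commit message describes the real base_blocking as severity x precision.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Finding:  # hypothetical, reduced to the fields the gate needs
    finding_id: str
    severity: str
    rule_class: str
    code_flow_count: int


@dataclass(frozen=True)
class GateThresholds:  # hypothetical stand-in for VulnerabilityGateThresholds
    blocking_severities: frozenset = frozenset({"high", "critical"})
    require_trace: bool = False          # default OFF
    suppress_rules: frozenset = frozenset()  # default: suppress nothing


def blocking_set(findings: List[Finding],
                 thresholds: Optional[GateThresholds] = None) -> List[Finding]:
    thresholds = thresholds or GateThresholds()
    # Base rule is computed first and never widened by the new flags.
    base = [f for f in findings if f.severity in thresholds.blocking_severities]
    kept = []
    for f in base:
        if thresholds.require_trace and f.code_flow_count == 0:
            continue  # no code flow recorded, treated as non-blocking
        if f.rule_class in thresholds.suppress_rules:
            continue  # class explicitly suppressed
        kept.append(f)
    return kept
```

Because the flags can only drop entries from `base`, calling `blocking_set` with default thresholds returns exactly the base set, which is the kind of invariant a canary test can pin.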

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TwGs78e6Rb7P5BDe2ezQEh
… done)

End point of the autonomous goal. Enforces the synthetic regression gate and wires
governance/vuln_parity_slo.py as report-only, deterministically reproduced with synthetic fixtures.
A 1:1 carry-over of the secrets governance/parity_slo.py (separate file).

- governance/vuln_parity_slo.py (the only new governance file in allowed_writes): measures macro parity
  against a frozen synthetic code-scanning snapshot. Threshold yml absent/empty
  -> report-only (always exit 0); present -> enforce (after the H-track baseline). Age > limit ->
  stale-degraded (warning in report-only, hard failure in enforce; an unparseable fetched_at counts as stale).
  Per-repo precision/recall reuse the M1 matcher (.detection) (0 lines of new computation); the aggregate
  only averages. load_codescan_snapshot fails closed, blocking real exports. Staleness is always surfaced.
  Public-safe, aggregate-only report. --check/--json/--max-age-days etc. mirror the secrets gate.
- codescan_parity.py: adds MacroCodeScanParityResult + aggregate_codescan_parity
  (parallel to the secrets aggregate_repo_parity: macro average + tier/dismissed-fp/cwe-deficit sums).
- Synthetic regression gate enforcement: test_vulnerability_synthetic_regression_gate.py enforces
  recall>=0.99/precision>=0.90 PASS via ci/pytest using normalized evaluation of the five-class corpus.
  Two red canaries (dropped TP -> recall<0.99 FAIL, spurious FP -> precision<0.90 FAIL) prove the gate is non-vacuous.
- Exercises report surfacing of dismissed_fp_hit (precision penalty) / won't-fix (excluded from recall).

Threshold yml not created (absent = report-only is the autonomous state). Reaching the real parity SLO comes after the H-track.
Zero network access. 19 new tests green, full suite 1310 passed with no regressions.
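
The three gate modes described above (report-only, enforce, stale-degraded) reduce to a small decision function. The sketch below is a guess at the shape of that logic, not the contents of governance/vuln_parity_slo.py: the `decide` function, its parameters, and the threshold keys are hypothetical, and `fetched_at=None` stands for a timestamp that could not be parsed.

```python
import datetime as dt
from typing import Optional, Tuple


def decide(thresholds: Optional[dict],
           fetched_at: Optional[dt.datetime],
           now: dt.datetime,
           max_age_days: int,
           recall: float,
           precision: float) -> Tuple[str, int]:
    """Return (mode, exit_code) for one gate run."""
    # An unparseable timestamp (None here) is treated as stale.
    stale = fetched_at is None or (now - fetched_at) > dt.timedelta(days=max_age_days)
    if not thresholds:
        # Thresholds absent or empty: never fail, but still surface staleness.
        return ("stale-degraded" if stale else "report-only", 0)
    if stale:
        # Enforcing against an outdated snapshot is a hard failure.
        return ("stale-degraded", 1)
    ok = recall >= thresholds["recall"] and precision >= thresholds["precision"]
    return ("enforce", 0 if ok else 1)
```

Keeping the exit code at 0 whenever thresholds are missing matches the stated autonomous state, where the threshold yml has deliberately not been created.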

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TwGs78e6Rb7P5BDe2ezQEh

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements the autonomous phase (M1–M3) of the GHAS code-scanning parity SLO goal (ghas-quality-vuln-parity). It introduces a redacted code-scanning alert domain model, a rule-class normalizer, a fuzzy line-window parity matcher, and an inline cheap FP-suppression tier, alongside synthetic adversarial corpus fixtures and comprehensive test suites. The review feedback focuses on enhancing robustness during JSON parsing and timestamp processing by adding defensive type validation to prevent potential TypeError or AttributeError when encountering unexpected input types.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +173 to +180
    def extract_cwe_ids(values: Iterable[str]) -> tuple[str, ...]:
        """Normalize arbitrary CWE-bearing tokens into ``CWE-NNN`` ids (sorted, unique)."""
        found: set[str] = set()
        for value in values:
            match = _CWE_RE.search(str(value))
            if match:
                found.add(f"CWE-{int(match.group(1))}")
        return tuple(sorted(found))

medium

The extract_cwe_ids function processes values which is parsed from JSON fields (e.g., item.get("cweIds", [])). If cweIds is explicitly null or a non-iterable type in the JSON, this function will raise a TypeError. We should defensively validate that values is a list, tuple, or set before iterating over it, and handle non-string elements gracefully. Additionally, ensure robustness by adding comprehensive unit tests that cover various unexpected types, including None, numbers, booleans, lists, and dictionaries.

Suggested change
    -def extract_cwe_ids(values: Iterable[str]) -> tuple[str, ...]:
    -    """Normalize arbitrary CWE-bearing tokens into ``CWE-NNN`` ids (sorted, unique)."""
    -    found: set[str] = set()
    -    for value in values:
    -        match = _CWE_RE.search(str(value))
    -        if match:
    -            found.add(f"CWE-{int(match.group(1))}")
    -    return tuple(sorted(found))
    +def extract_cwe_ids(values: Iterable[str] | None) -> tuple[str, ...]:
    +    """Normalize arbitrary CWE-bearing tokens into CWE-NNN ids (sorted, unique)."""
    +    if not isinstance(values, (list, tuple, set)):
    +        return ()
    +    found: set[str] = set()
    +    for value in values:
    +        if value is None:
    +            continue
    +        match = _CWE_RE.search(str(value))
    +        if match:
    +            found.add(f"CWE-{int(match.group(1))}")
    +    return tuple(sorted(found))
References
  1. When implementing defensive type validation for parsed JSON fields (e.g., verifying a field is a string), ensure robustness by adding comprehensive unit tests that cover various unexpected types, including None, numbers, booleans, lists, and dictionaries.
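
To make the suggestion concrete, here is a self-contained sketch of the guarded function together with the kind of unexpected-type checks the reference asks for. The `_CWE_RE` pattern is an assumption, since the real one is not visible in this review.

```python
import re
from typing import Any, Tuple

# Assumed pattern; the repository's actual _CWE_RE is not shown in the review.
_CWE_RE = re.compile(r"CWE-(\d+)", re.IGNORECASE)


def extract_cwe_ids(values: Any) -> Tuple[str, ...]:
    """Normalize CWE-bearing tokens into CWE-NNN ids (sorted, unique)."""
    # Anything that is not a plain list/tuple/set yields an empty result.
    if not isinstance(values, (list, tuple, set)):
        return ()
    found = set()
    for value in values:
        if value is None:
            continue
        match = _CWE_RE.search(str(value))
        if match:
            # int() drops leading zeros, so "cwe-089" becomes "CWE-89".
            found.add(f"CWE-{int(match.group(1))}")
    return tuple(sorted(found))


# Inputs that parsed JSON could plausibly hand over instead of a list.
UNEXPECTED = [None, 42, 3.14, True, "CWE-89", {"cweIds": ["CWE-89"]}]
for bad in UNEXPECTED:
    assert extract_cwe_ids(bad) == ()

mixed = [None, 79, "cwe-089", "external/cwe/cwe-022"]
assert extract_cwe_ids(mixed) == ("CWE-22", "CWE-89")
```

One consequence worth weighing: with this `isinstance` check a bare string such as "CWE-89", a frozenset, or a generator also returns an empty tuple, so callers that pass those types would silently lose CWE ids.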

Comment on lines +258 to +259
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    source = str(data.get("source", "")).strip().lower()

medium

If the parsed JSON data is not a dictionary (e.g., a list or primitive), calling data.get will raise an AttributeError. We should defensively validate that data is a dictionary before calling .get(). Additionally, ensure robustness by adding comprehensive unit tests that cover various unexpected types, including None, numbers, booleans, lists, and dictionaries.

    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("Parsed JSON is not a dictionary")
    source = str(data.get("source", "")).strip().lower()
References
  1. When implementing defensive type validation for parsed JSON fields (e.g., verifying a field is a string), ensure robustness by adding comprehensive unit tests that cover various unexpected types, including None, numbers, booleans, lists, and dictionaries.

Comment on lines +277 to +278
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    source = str(data.get("source", "")).strip().lower()

medium

If the parsed JSON data is not a dictionary (e.g., a list or primitive), calling data.get will raise an AttributeError. We should defensively validate that data is a dictionary before calling .get(). Additionally, ensure robustness by adding comprehensive unit tests that cover various unexpected types, including None, numbers, booleans, lists, and dictionaries.

    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("Parsed JSON is not a dictionary")
    source = str(data.get("source", "")).strip().lower()
References
  1. When implementing defensive type validation for parsed JSON fields (e.g., verifying a field is a string), ensure robustness by adding comprehensive unit tests that cover various unexpected types, including None, numbers, booleans, lists, and dictionaries.
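
The dictionary check suggested in the two comments above combines naturally with the fail-closed source check mentioned in the PR description. The following is a sketch only: `load_synthetic_snapshot` is a hypothetical name, and the exact comparison against "synthetic" is an assumption about how the real loaders decide.

```python
import json
from pathlib import Path


def load_synthetic_snapshot(path) -> dict:
    """Load a snapshot, rejecting anything that is not clearly synthetic."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # json.loads may return a list, str, number, bool, or None.
    if not isinstance(data, dict):
        raise ValueError("snapshot root must be a JSON object")
    source = str(data.get("source", "")).strip().lower()
    # Fail closed: a missing or unknown source is treated as not allowed.
    if source != "synthetic":
        raise ValueError("only 'source: synthetic' snapshots are accepted")
    return data
```

Raising ValueError for a non-object root keeps the failure in the same family as the source check, so callers can handle both with a single except clause.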

Comment on lines +158 to +159
    def _parse_timestamp(value: str) -> dt.datetime | None:
        text = value.strip()

medium

If value is not a string (e.g., an integer, boolean, or None from parsed JSON), calling value.strip() will raise an AttributeError. We should defensively validate that value is a string before processing it. Additionally, ensure robustness by adding comprehensive unit tests that cover various unexpected types, including None, numbers, booleans, lists, and dictionaries.

Suggested change
    -def _parse_timestamp(value: str) -> dt.datetime | None:
    -    text = value.strip()
    +def _parse_timestamp(value: Any) -> dt.datetime | None:
    +    if not isinstance(value, str):
    +        return None
    +    text = value.strip()
References
  1. When implementing defensive type validation for parsed JSON fields (e.g., verifying a field is a string), ensure robustness by adding comprehensive unit tests that cover various unexpected types, including None, numbers, booleans, lists, and dictionaries.
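
A complete version of the guarded parser might look like the sketch below. Only the first two lines of the real `_parse_timestamp` appear in the review, so everything after `value.strip()` (ISO 8601 parsing, the trailing "Z" handling, defaulting naive values to UTC) is an assumption, and the function is named `parse_timestamp` to mark it as a stand-in.

```python
import datetime as dt
from typing import Any, Optional


def parse_timestamp(value: Any) -> Optional[dt.datetime]:
    """Parse an ISO 8601 string; return None for anything unusable."""
    if not isinstance(value, str):
        return None
    text = value.strip()
    if not text:
        return None
    # Older Python versions reject a trailing "Z" in fromisoformat.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    try:
        parsed = dt.datetime.fromisoformat(text)
    except ValueError:
        return None
    # Assumption: naive timestamps are interpreted as UTC.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=dt.timezone.utc)
    return parsed


# The unexpected types named in the review all map to None.
for bad in (None, 0, 1.5, True, [], {}, "", "not-a-date"):
    assert parse_timestamp(bad) is None
```

Returning None rather than raising lets a caller treat every unusable value the same way, which fits the stated rule that an unparseable fetched_at counts as stale.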

@pureliture pureliture merged commit 5547c6c into main Jun 21, 2026
9 checks passed
@pureliture pureliture deleted the claude/ghas-quality-vuln branch June 21, 2026 12:53