diff --git a/CURRENT.md b/CURRENT.md
index badec4e..66ded2f 100644
--- a/CURRENT.md
+++ b/CURRENT.md
@@ -4,7 +4,7 @@
 
 - Project: `security-scanner`
 - Merge mode: `guarded-auto-merge`
-- Active goal: `review-assisted-autopilot`
+- Active goal: `phase-2a-sarif-product-complete`
 - Last auto merge: `ledger:20260617T003405Z-autopilot-3236f4`
 - Ledger entries: `4`
 - Ledger index hash: `sha256:e1893a649a1101b74a087b5eaaa275813a85708c5bb46c4ae70c24e10a111050`
diff --git a/docs/views/research-and-technical-decisions.md b/docs/views/research-and-technical-decisions.md
index cbab32c..56b40e1 100644
--- a/docs/views/research-and-technical-decisions.md
+++ b/docs/views/research-and-technical-decisions.md
@@ -14,6 +14,46 @@
 | LLM verifier | Ollama-compatible adapter | To limit it to a triage-assist layer rather than a detector |
 | SAST/SCA expansion | Separate expansion candidate | To prevent scope expansion before the Secret Detection execution path stabilizes |
 
+## Phase 2a planning note
+
+SAST is currently an opt-in extension, not a default-supported feature. As a follow-up to the GHAS-like vulnerability
+scanning research, the [Phase 2a SARIF-native SAST spec](../workbench/specs/phase-2a-sarif-native-sast/requirements.md) was
+recorded in the workbench, and the existing M1 import-first packet was promoted to a
+long-single-goal for M1~M4 product completion.
+
+The key boundaries are as follows.
+
+- The existing Gitleaks-first Secret Detection default path and the `Finding` model are kept.
+- Code vulnerability alerts are considered as a separate `VULN_FINDING` / `VulnerabilityFinding`
+  family.
+- A SARIF-compatible contract is fixed before any analyzer, and all analyzer output passes through the canonical importer.
+- The first execution adapter targets a Semgrep CE-compatible CLI by default, while staying behind a Semgrep-compatible boundary.
+- `report`, `gate`, and `evaluate` are extended only through an explicit opt-in such as `--category code-vuln`.
+- GHAS stays a reference/read-only comparison candidate, with no scan trigger or alert mutation.
+- The LLM is a verifier, explainer, and generic remediation assistant, not a detector.
+- The architecture review gate is a mandatory blocking check at the pre-implementation, post-M2, post-M3, and final stages.
+- SCA/SBOM/dependency vulnerability is a separate future track.
+
+An example of opt-in usage follows.
+
+```bash
+uv run security-scanner import-sarif \
+  --sarif examples/code-vuln-semgrep.example.sarif \
+  --output private/vuln-findings.jsonl
+
+uv run security-scanner report --category code-vuln --findings private/vuln-findings.jsonl
+uv run security-scanner gate --category code-vuln --findings private/vuln-findings.jsonl --max 0
+uv run security-scanner evaluate \
+  --category code-vuln \
+  --expected eval/synthetic-code-vuln/expected-findings.example.json \
+  --findings private/vuln-findings.jsonl
+```
+
+Because `scan-vuln` targets a real local checkout, it defaults to
+`--path-policy redacted` and hashes even relative paths. Only `import-sarif`, which
+takes synthetic SARIF fixtures directly, defaults to `synthetic`; if a private proof
+must also hide paths, use `import-sarif --path-policy redacted`.
+
 ## Role of each tool
 
 - Gitleaks is the baseline tool for finding secret candidates.
@@ -34,7 +74,11 @@
 Public documents describe only tool roles and decision rationale. Private benchmark data,
 sensitive alert data, internal repository context, and private provider endpoints are excluded.
 
-## Noise filter placement decision
+## Secret Detection noise filter placement decision
+
+This decision applies only to the Gitleaks-first Secret Detection path. For the Phase 2a
+`code-vuln` SAST path, normalization, suppression, and triage policy are decided in the separate
+`VULN_FINDING` contract.
 | Filter placement candidate | Pros | Cons | Selected |
 | --- | --- | --- | --- |
diff --git a/docs/workbench/agentic-workflows/2026-06-20-phase-2a-sarif-import-first-goal.md b/docs/workbench/agentic-workflows/2026-06-20-phase-2a-sarif-import-first-goal.md
new file mode 100644
index 0000000..585080f
--- /dev/null
+++ b/docs/workbench/agentic-workflows/2026-06-20-phase-2a-sarif-import-first-goal.md
@@ -0,0 +1,287 @@
+# Agentic Workflow: Phase 2a SARIF-native SAST Product Completion
+
+**Status:** Ready for long single-goal execution
+**Date:** 2026-06-20
+**Goal ID:** `phase-2a-sarif-product-complete`
+**Spec:** `docs/workbench/specs/phase-2a-sarif-native-sast/requirements.md`
+**Design:** `docs/workbench/specs/phase-2a-sarif-native-sast/design.md`
+**Merge flow:** pull request
+
+This document does not discard the existing M1 import-first packet; it is an execution packet that
+promotes the same execution flow to an M1~M4 product-completion long-single-goal. The goal is not all of
+broad Phase 2a but to close out the countable product slice `SARIF import -> VULN_FINDING artifact -> report/gate/evaluate ->
+Semgrep-compatible analyzer -> vulnerability LLM verifier/explainer`
+end to end.
+
+## Goal
+
+Complete the Phase 2a SARIF-native SAST product slice through PR, CI, and merge.
+
+Completion criteria:
+
+- M1: An `import-sarif` flow normalizes synthetic SARIF 2.1.0 fixtures into a deterministic, public-safe `VULN_FINDING`
+  JSONL artifact.
+- M2: `report`, `gate`, and `evaluate` keep the existing secret default while handling
+  `VULN_FINDING` artifacts through `--category code-vuln` or an equivalent opt-in.
+- M3: A Semgrep-compatible analyzer adapter generates SARIF from a local checkout and
+  reuses the existing SARIF importer. The default engine is the Semgrep CE-compatible CLI
+  (`semgrep`).
+- M4: The vulnerability LLM verifier/explainer receives only redacted rule/CWE/OWASP/location-shape
+  metadata and produces strict-JSON, fail-closed review assistance.
+- The existing Gitleaks-first Secret Detection default path does not change.
+- There is no GHAS scan trigger, SARIF upload, alert mutation, or live fetch.
+- The architecture review gate runs before implementation, after M2 and M3, and at the final stage before merge, and
+  must have no blocking findings.
+- PR CI and the local governance gates all pass.
+ +## Execution Contract + +- 단일 장기 goal로 M1~M4를 끝까지 진행한다. +- 중간 milestone 사용자 승인은 요구하지 않는다. +- 사람 개입은 stop condition 발생 시에만 요청한다. +- Subagent를 적극 사용한다. 구현 worker는 `gpt-5.5`, `reasoning_effort: high`; + 보조 coding/review는 repo policy에 맞춘다. +- PR을 만들고 CI를 통과시킨 뒤 merge 가능 상태까지 닫는다. +- 실제 endpoint, host, credential, private path, real SARIF/GHAS export, real finding + output은 커밋하지 않는다. + +## Fixed Decisions + +- Scope: M1~M4 product-complete slice. +- Persistence: JSONL artifact-first. DynamoDB-compatible projection 또는 storage schema + migration은 이번 goal 밖이며, 필요해지는 순간 stop condition이다. +- Category wire value: `code-vuln`. +- Model boundary: `core/vulnerability` 또는 equivalent 별도 module. +- `finding_id`: SARIF `partialFingerprints` 우선, 없으면 + `source_tool + rule_id + normalized synthetic path + start_line + message` stable hash. +- Path handling: committed fixture import는 synthetic-only path를 허용한다. + `scan-vuln` 실제 checkout scan은 기본 `redacted` path policy로 relative path까지 + hash한다. +- Analyzer: Semgrep CE-compatible CLI를 기본 실행 target으로 둔다. Adapter boundary는 + Semgrep-compatible으로 유지해 OpenGrep 교체 가능성을 남긴다. +- Gate policy: severity + precision threshold. 기본 fail 기준은 implementation 중 + synthetic fixture와 SARIF metadata에 맞춰 보수적으로 정하되, secret `gate --max 0` + semantics와 섞지 않는다. +- LLM: detector가 아니라 verifier/explainer only. Strict JSON, confidence threshold, + fail-closed `NEEDS_REVIEW` behavior. Raw code snippet 금지. +- GHAS: out of scope. Live fetch, upload, mutation 금지. +- SCA/SBOM: out of scope. + +## Required Architecture Review Gate + +Architecture review is mandatory and blocking. + +Required checkpoints: + +1. **Pre-implementation architecture review** + - Review spec/design/workflow before code changes. + - Confirm M1~M4 scope, module seams, write surface, stop conditions. + +2. **Post-M2 codebase architecture review** + - After `VULN_FINDING` model, artifact, report/gate/evaluate integration. 
+ - Confirm secret `Finding` remains isolated and no default behavior changed. + +3. **Post-M3 adapter architecture review** + - After Semgrep-compatible adapter wiring. + - Confirm adapter emits/consumes SARIF through the canonical importer and no + analyzer-specific JSON becomes the internal contract. + +4. **Final architecture review** + - Before PR ready/merge. + - Confirm M1~M4 evidence, tests, public-safety, rollback boundaries, and docs. + +Blocking architecture findings stop the run only if they require SoT change, +scope expansion, unsafe data handling, or existing secret workflow behavior change. +Otherwise, the implementing agent fixes them inside the same goal. + +## Multi-agent Execution Model + +Use subagents by disjoint responsibility. Main agent integrates results and owns +final judgment. + +| Role | Responsibility | Write scope | +| --- | --- | --- | +| `system_architecture_manager` | Architecture gate, SoT drift, product/system boundary | read-only | +| `codebase_architecture_manager` | Codebase seam review, module locality, implementation architecture | read-only | +| Worker A | `VulnerabilityFinding`, SARIF importer, JSONL artifact | `src/security_scanner/core/vulnerability/**`, tests | +| Worker B | CLI and artifact report/gate/evaluate integration | `src/security_scanner/cli/**`, `src/security_scanner/core/report/**`, `src/security_scanner/core/policy/**`, tests | +| Worker C | Semgrep-compatible adapter | `src/security_scanner/scanners/**`, tests | +| Worker D | Vulnerability verifier/explainer | `src/security_scanner/llm/**`, tests | +| Reviewer | Public-safety/security review | read-only | +| `code_simplifier` | Final clarity/refactor pass preserving behavior | touched implementation files only | + +Workers are not alone in the codebase. They must not revert other workers' +edits; they adapt to integrated changes. + +## Allowed Write Surface + +실행 agent는 `governance/autopilot_goal.yml`의 `allowed_writes`를 authoritative scope로 +사용한다. 
요약하면 다음 표면만 변경한다. + +- `docs/workbench/specs/phase-2a-sarif-native-sast/**` +- `docs/workbench/agentic-workflows/2026-06-20-phase-2a-sarif-import-first-goal.md` +- `docs/views/research-and-technical-decisions.md` +- `src/security_scanner/**` +- `tests/**` +- `examples/**` +- `eval/**` +- `governance/**` +- `ledger/**` +- `CURRENT.md` + +이 범위를 벗어나는 변경이 필요하면 scope expansion으로 간주하고 멈춘다. + +## Suggested Work Plan + +### Readiness Gate + +1. Read current contracts. + - `AGENTS.md` + - `governance/autopilot_goal.yml` + - this workflow document + - `docs/workbench/specs/phase-2a-sarif-native-sast/requirements.md` + - `docs/workbench/specs/phase-2a-sarif-native-sast/design.md` + - `docs/workbench/specs/phase-2a-sarif-native-sast/review.md` +2. Run pre-implementation architecture review gate. +3. Confirm working tree isolation and allowed write surface. + +### M1 SARIF import and artifact + +1. Add failing tests first: + - `VulnerabilityFinding` model serializes without secret fields. + - SARIF importer handles minimal result, multiple rules, missing optional metadata, + `partialFingerprints`, and code flow count. + - JSONL writer/reader round-trips `VULN_FINDING` records deterministically. + - `import-sarif` writes JSONL artifact from synthetic fixture. + - `scan-vuln` defaults to `--path-policy redacted` for private-proof artifacts. + - `import-sarif --path-policy redacted` hashes relative paths for private SARIF + proof artifacts. + - Existing `scan`, `report`, `gate`, `evaluate` default tests still pass. +2. Implement: + - `core/vulnerability` or equivalent module. + - stdlib JSON SARIF importer. + - JSONL artifact writer/reader. + - opt-in `import-sarif` CLI. + +### M2 Report, gate, evaluate + +1. Add tests for `--category code-vuln` or equivalent opt-in. +2. Implement artifact-based report/gate/evaluate without changing secret defaults. +3. Gate uses severity + precision policy, isolated from secret `gate --max 0`. +4. Run post-M2 codebase architecture review gate. 
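The M1 steps above, together with the `finding_id` and path-policy rules from Fixed Decisions, could be sketched roughly as follows. This is a minimal illustration using only the standard library, not the repository's implementation: the function names (`import_sarif`, `to_jsonl`), the flat record layout, the `redacted:` prefix, and the nested `finding` key are all assumptions made for the example.

```python
import hashlib
import json
from typing import Any


def _sha256(*parts: str) -> str:
    """Stable hash over ordered parts (NUL-separated to avoid ambiguity)."""
    return hashlib.sha256("\x00".join(parts).encode("utf-8")).hexdigest()


def _apply_path_policy(path: str, policy: str) -> str:
    """'synthetic' keeps fixture paths; 'redacted' hashes the relative path."""
    if policy == "synthetic":
        return path
    if policy == "redacted":
        # Hypothetical output shape: a short hash with a marker prefix.
        return "redacted:" + _sha256(path)[:16]
    raise ValueError(f"unknown path policy: {policy}")


def import_sarif(sarif: dict[str, Any], path_policy: str = "synthetic") -> list[dict[str, Any]]:
    """Normalize SARIF results into flat, hypothetical VULN_FINDING-style dicts."""
    findings: list[dict[str, Any]] = []
    for run in sarif.get("runs", []):
        tool = run.get("tool", {}).get("driver", {}).get("name", "unknown")
        for result in run.get("results", []):
            rule_id = result.get("ruleId", "unknown")
            message = result.get("message", {}).get("text", "")
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            uri = physical.get("artifactLocation", {}).get("uri", "")
            start_line = physical.get("region", {}).get("startLine", 0)
            fingerprints = result.get("partialFingerprints") or {}
            if fingerprints:
                # Prefer tool-provided fingerprints as hash material.
                material = json.dumps(fingerprints, sort_keys=True)
                finding_id = _sha256(tool, rule_id, material)
            else:
                # Fallback: tool + rule + path + line + message.
                finding_id = _sha256(tool, rule_id, uri, str(start_line), message)
            findings.append(
                {
                    "finding_id": finding_id,
                    "category": "code-vuln",
                    "source_tool": tool,
                    "rule_id": rule_id,
                    "message": message,
                    "path": _apply_path_policy(uri, path_policy),
                    "start_line": start_line,
                }
            )
    return findings


def to_jsonl(findings: list[dict[str, Any]]) -> str:
    """Serialize deterministically: sorted records, sorted keys, one per line."""
    records = sorted(findings, key=lambda f: f["finding_id"])
    return "".join(
        json.dumps(
            {"entityType": "VULN_FINDING", "schemaVersion": 1, "finding": f},
            sort_keys=True,
        )
        + "\n"
        for f in records
    )
```

Sorting records and keys is one simple way to get the deterministic artifact that the round-trip tests ask for; a real importer would also need the message and metadata sanitization that the architecture gates describe, which this sketch leaves out.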
+ +### M3 Semgrep-compatible adapter + +1. Add tests using fake command runner and synthetic output. +2. Implement Semgrep-compatible CLI adapter that emits SARIF. +3. Route analyzer output through the canonical SARIF importer. +4. Sanitize analyzer failure stdout/stderr before surfacing CLI errors. +5. Do not make analyzer-specific JSON the internal contract. +6. Run post-M3 adapter architecture review gate. + +### M4 Vulnerability verifier/explainer + +1. Add tests for redacted prompt/input: + - no raw source snippet + - no private path + - no real repo name + - strict JSON + - invalid output/timeout/low confidence => review-needed +2. Implement vulnerability-specific prompt/application adapter. +3. Reuse only shared strict JSON / confidence / fail-closed behavior from secret verifier. +4. Do not auto-dismiss, suppress, or patch code from LLM output. + +### Finalization + +1. Update docs with current-support language that stays opt-in/experimental. +2. Run final architecture review gate. +3. Run full checks. +4. Stage files by name, commit, push PR, wait for CI, merge when green. 
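The M3 steps above (fake command runner, SARIF output, sanitized failure detail) might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's adapter: `CommandResult`, `AnalyzerExecutionError`, `sanitize`, and `run_analyzer` are hypothetical names, the two regular expressions are deliberately simple stand-ins for real redaction rules, and the argument list is only one plausible way to ask a Semgrep-compatible CLI for SARIF output.

```python
import re
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class CommandResult:
    returncode: int
    stdout: str
    stderr: str


# A runner takes argv and returns the process result; tests inject a fake one.
Runner = Callable[[Sequence[str]], CommandResult]


class AnalyzerExecutionError(RuntimeError):
    """Raised when the analyzer fails; the message is already sanitized."""


# Illustrative patterns only: multi-segment paths and key=value style secrets.
_PATH_RE = re.compile(r"(?:[A-Za-z]:)?(?:[\\/][\w.\-]+){2,}")
_SECRET_RE = re.compile(r"(?i)\b(token|password|secret|api[_-]?key)\s*[=:]\s*\S+")


def sanitize(detail: str, limit: int = 200) -> str:
    """Mask secret-like and path-like substrings, then truncate."""
    detail = _SECRET_RE.sub(r"\1=<redacted>", detail)
    detail = _PATH_RE.sub("<path>", detail)
    return detail[:limit]


def run_analyzer(
    checkout: str,
    config: str,
    sarif_out: str,
    runner: Runner,
    command: str = "semgrep",
) -> str:
    """Run the analyzer and return the SARIF path; never report partial success."""
    # Assumed invocation shape for a Semgrep-compatible CLI.
    argv = [command, "scan", "--config", config, "--sarif", "--output", sarif_out, checkout]
    result = runner(argv)
    if result.returncode != 0:
        raise AnalyzerExecutionError(sanitize(result.stderr or result.stdout))
    return sarif_out
```

Injecting the runner keeps the tests free of any real Semgrep installation, and the returned SARIF path is meant to be handed to the canonical importer rather than parsed inside the adapter.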
+ +## Required Local Checks + +Run targeted checks by milestone: + +```bash +uv run pytest tests/test_vulnerability_finding.py tests/test_sarif_importer.py tests/test_cli_import_sarif.py +uv run pytest tests/test_code_vuln_report_gate.py tests/test_vulnerability_evaluation.py +uv run pytest tests/test_semgrep_compatible_adapter.py +uv run pytest tests/test_vulnerability_verifier.py +``` + +Run full checks before PR creation and before marking the goal complete: + +```bash +uv run pytest +uv run python -m governance.render --validate +uv run python -m governance.render --check +uv run python -m governance.rebuild_ledger_index --check +uv run python -m governance.render_github_ruleset --output governance/main_ruleset.json --check +uv run python -m governance.public_safety --diff origin/main...HEAD +uv run python -m governance.public_safety --path docs/workbench/specs/phase-2a-sarif-native-sast --path docs/views/research-and-technical-decisions.md +uv run python -m governance.autopilot_gate --base origin/main +``` + +Architecture gate evidence must be captured in the PR summary or a workbench +review note before merge. + +## Stop Conditions + +Stop and ask for human input only when one of these occurs. 
+ +- public-safety hit that cannot be resolved without deleting or redacting committed content +- required implementation path outside `allowed_writes` +- architecture review requires SoT change +- Semgrep-compatible adapter cannot be implemented without committing real scan output +- need to persist real private paths, raw source snippets, real SARIF, or real GHAS export +- GHAS trigger, upload, alert mutation, dismissal, or live fetch becomes necessary +- existing secret scan/report/gate/evaluate default would change +- storage projection or schema migration becomes necessary +- protected branch or PR permission failure +- same blocker repeats three times with no new evidence +- break-glass state becomes active + +## Resume Prompt + +Use this prompt to start or resume the long run. + +```text +Goal: complete `phase-2a-sarif-product-complete` in the security-scanner repo through a PR. + +Read first: +- AGENTS.md +- governance/autopilot_goal.yml +- docs/workbench/agentic-workflows/2026-06-20-phase-2a-sarif-import-first-goal.md +- docs/workbench/specs/phase-2a-sarif-native-sast/requirements.md +- docs/workbench/specs/phase-2a-sarif-native-sast/design.md +- docs/workbench/specs/phase-2a-sarif-native-sast/review.md +- src/security_scanner/core/finding/model.py +- src/security_scanner/storage/jsonl_store.py +- src/security_scanner/cli/app.py +- src/security_scanner/cli/commands/scan.py +- src/security_scanner/llm/common/verifier.py +- src/security_scanner/scanners/gitleaks/runner.py + +Implement M1~M4 product slice: +M1 SARIF import -> VULN_FINDING JSONL artifact. +M2 report/gate/evaluate --category code-vuln from artifact. +M3 Semgrep-compatible adapter emitting SARIF through the canonical importer. +M4 vulnerability verifier/explainer using redacted metadata only. + +Use multi-agent execution aggressively. Mandatory architecture gates: +pre-implementation, post-M2, post-M3, final. Do not ask for approval unless a +listed stop condition occurs. 
Keep existing Gitleaks-first secret defaults +unchanged. Do not call GHAS, upload SARIF, mutate alerts, persist raw snippets, +or commit real scan data. Finish by opening a PR, waiting for CI, and merging +when green. + +Required checks: +- uv run pytest +- uv run python -m governance.render --validate +- uv run python -m governance.render --check +- uv run python -m governance.rebuild_ledger_index --check +- uv run python -m governance.render_github_ruleset --output governance/main_ruleset.json --check +- uv run python -m governance.public_safety --diff origin/main...HEAD +- uv run python -m governance.public_safety --path docs/workbench/specs/phase-2a-sarif-native-sast --path docs/views/research-and-technical-decisions.md +- uv run python -m governance.autopilot_gate --base origin/main +``` diff --git a/docs/workbench/specs/phase-2a-sarif-native-sast/architecture-gates.md b/docs/workbench/specs/phase-2a-sarif-native-sast/architecture-gates.md new file mode 100644 index 0000000..38af7b8 --- /dev/null +++ b/docs/workbench/specs/phase-2a-sarif-native-sast/architecture-gates.md @@ -0,0 +1,65 @@ +# Phase 2a SARIF-native SAST Architecture Gate Evidence + +## Pre-implementation gate + +Status: passed after fix. + +- `system_architecture_manager` initially found a blocking SoT mismatch: workflow + allowed possible DynamoDB-compatible projection, while requirements/design kept + projection outside this goal. +- Fix: workflow persistence language now says JSONL artifact-first only; storage + projection or schema migration is a stop condition. +- `codebase_architecture_manager` found implementation seams non-blocking: + `core/vulnerability`, `storage/vulnerability_jsonl_store.py`, category-aware + CLI dispatch, Semgrep-compatible adapter, and vulnerability verifier module. + +## Post-M2 gate + +Status: passed with caveat addressed. + +- Finding isolation: `core.finding.Finding` remains secret-specific. 
+- Code vulnerability model: `core.vulnerability.VulnerabilityFinding` is separate. +- Persistence: `VULN_FINDING` JSONL artifact is canonical for `code-vuln`. +- Category dispatch: `report`, `gate`, and `evaluate` require explicit + `--category code-vuln`; secret default path remains unchanged. +- Caveat: future private proof needed a deeper location policy seam. +- Fix: SARIF import now supports `--path-policy redacted` to hash relative paths. + +## Post-M3 gate + +Status: passed after fix. + +- Adapter contract: Semgrep-compatible runner emits SARIF and runtime imports it + through the canonical SARIF importer. +- Internal contract: analyzer-specific JSON is not used. +- Dependency contract: no Semgrep Python dependency or installer behavior is added. +- Blocking finding: analyzer failure detail originally passed raw stdout/stderr + through `SemgrepExecutionError`. +- Fix: adapter sanitizes path-like and secret-like failure detail before CLI output. +- Added proof: fake runner writes SARIF, runtime imports it, and `VULN_FINDING` + JSONL artifact is produced. + +## Final gate + +Status: passed after blocking fixes. + +- First blocking finding: SARIF `message`, `help.markdown`, and `helpUri` were + untrusted free text and could persist into JSONL or M4 verifier prompt. +- Fix: central vulnerability redaction helpers sanitize free text, rule links, + path-like content, snippet-like content, and secret-like tokens before + persistence; M4 prompt also re-sanitizes rule help. +- Second blocking finding: other SARIF string metadata such as tool name, rule id, + rule name, tags, and partial fingerprints were also untrusted input surfaces. +- Fix: importer and `VulnerabilityFinding` entity boundary now sanitize + identifier-like metadata; partial fingerprint values are stored as hashes. +- Codebase blocking finding: `scan-vuln` targets real local checkouts but + originally defaulted to synthetic path preservation. 
+- Fix: `scan-vuln` now defaults to `--path-policy redacted`, while
+  `import-sarif` keeps `synthetic` as the committed-fixture default.
+- Additional hardening: `VulnerabilityJsonlStore` validates the nested
+  `category == "code-vuln"` contract.
+- Final `system_architecture_manager` gate passed: no GHAS fetch/upload/mutation,
+  no SCA, no new Python dependency, secret defaults preserved.
+- Final `codebase_architecture_manager` gate passed: SARIF adapter feeds the
+  canonical importer, `code-vuln` remains JSONL-only opt-in, and malicious SARIF
+  metadata regression tests cover JSONL, report, and prompt leakage.
diff --git a/docs/workbench/specs/phase-2a-sarif-native-sast/design.md b/docs/workbench/specs/phase-2a-sarif-native-sast/design.md
new file mode 100644
index 0000000..a971558
--- /dev/null
+++ b/docs/workbench/specs/phase-2a-sarif-native-sast/design.md
@@ -0,0 +1,438 @@
+# Phase 2a SARIF-native SAST Design Spec
+
+## Overview
+
+Phase 2a is a SARIF-first product slice that preserves the current Gitleaks-first Secret Detection pipeline while
+handling code vulnerability alerts under the opt-in `code-vuln` category.
+It keeps the existing M1 import-first packet and, within the same long-single-goal, closes M1~M4
+as the product-completion scope.
+
+M1 is `import-sarif -> VULN_FINDING JSONL artifact`. M2 is report/gate/evaluate
+integration. M3 is the Semgrep-compatible analyzer adapter. M4 is the vulnerability
+verifier/explainer based on redacted metadata.
+
+## Requirements reference
+
+- Source of truth: `requirements.md`
+- Key requirements:
+  - The Secret Detection default path is not changed.
+  - Code vulnerabilities are kept as a separate `VULN_FINDING` / `VulnerabilityFinding` entity.
+  - The SARIF-compatible contract is fixed before any analyzer.
+  - `code-vuln` persistence is JSONL artifact-first.
+  - Report/gate/evaluate allow only explicit category opt-in.
+  - The first engine target is the Semgrep CE-compatible CLI `semgrep`.
+  - GHAS is kept only as a future reference, with no scan trigger/mutation/fetch/upload.
+  - The LLM is a verifier/explainer/generic remediation assistant, not a detector.
+  - The architecture review gate is a mandatory blocking check.
+  - SCA/SBOM/dependency vulnerability is a future track.
+
+## Candidate approaches
+
+### Candidate A: SARIF-first model + analyzer adapter (selected)
+
+First build the contract that turns a SARIF-compatible fixture into a normalized `VULN_FINDING`.
+On top of that, attach the Semgrep-compatible adapter, category-aware report/gate/evaluate, and the LLM
+verifier/explainer.
+
+Pros:
+
+- Low cost of swapping analyzers.
+- Matches the conceptual shape of GitHub code scanning.
+- Regression testing on synthetic fixtures is easy.
+- The public-safe redaction boundary can be fixed before any analyzer runs.
+- M1~M4 share the same canonical importer.
+
+Cons:
+
+- In the first milestone, model/artifact validation is visible before any scanner execution.
+
+### Candidate B: Semgrep/OpenGrep execution-first
+
+Run a Semgrep-compatible engine from the start and map its results onto the internal model.
+
+Pros:
+
+- Runnability is visible early from the user's point of view.
+- Fits local-first and fixture-based testing well.
+
+Cons:
+
+- The internal model risks being pulled toward one tool's output.
+- Analyzer-specific JSON may become the contract.
+
+### Candidate C: CodeQL-first GHAS parity
+
+Take the CodeQL CLI as the first adapter and follow the shape of GHAS code scanning as closely as possible.
+
+Pros:
+
+- Connects most directly to GHAS code scanning.
+- Strong model for code flow, query metadata, and precision/security-severity.
+
+Cons:
+
+- Build mode, language support, runtime environment, and license boundary are too heavy for the current product slice.
+- Analyzer setup grows ahead of the local-first synthetic workflow.
+
+### Candidate D: Broad SAST+SCA scanner
+
+Add SAST and SCA together, widening into a scanner closer to the GHAS feature list.
+
+Pros:
+
+- The feature list looks rich.
+
+Cons:
+
+- Product boundary, data model, gate, and evaluation all blur at once.
+- Conflicts with the current small, reproducible pipeline philosophy.
+
+**Selected: Candidate A.** M3 attaches Semgrep-compatible execution, but the internal canonical
+contract remains the SARIF importer throughout.
+
+## Architecture
+
+```text
+Existing secret path:
+targets.local.yaml -> workspace -> Gitleaks -> Finding -> store -> report/gate/evaluate
+
+Phase 2a code-vuln path:
+synthetic SARIF file
+  -> SarifImporter
+  -> VulnerabilityFinding
+  -> VULN_FINDING JSONL artifact
+  -> report/gate/evaluate --category code-vuln
+  -> VulnerabilityVerifier
+
+Semgrep-compatible path:
+local checkout -> SemgrepCompatibleAdapter -> SARIF file
+  -> SarifImporter -> same code-vuln path
+```
+
+The Phase 2a path does not replace the existing secret path. Runtime commands must be explicit opt-in,
+and default report/gate behavior stays based on the secret category.
+
+## Component details
+
+### `SarifImporter`
+
+- Role: Reads SARIF 2.1.0-compatible input into internal result records.
+- Input: A public-safe synthetic SARIF fixture or an analyzer-produced SARIF file.
+- Output: normalized records for `VulnerabilityFinding`.
+- Responsibilities:
+  - Reads `run.tool.driver.name`, `semanticVersion`, `rules`, `results`, `locations`,
+    `partialFingerprints`, `codeFlows`, and `relatedLocations` into a structure that can preserve them.
+  - Does not store raw snippets or source text by default.
+  - Under the `synthetic` path policy, preserves public fixture relative paths.
+  - Under the `redacted` path policy, hashes even relative paths to produce private proof artifacts.
+  - Isolates unsupported SARIF fields as analyzer-specific metadata.
+  - Represents missing optional metadata as incomplete metadata rather than dropping it.
+- Non-responsibilities:
+  - Analyzer execution.
+  - GHAS upload/fetch.
+  - GitHub alert mutation.
+
+### `VulnerabilityFinding`
+
+- Role: Canonical domain model for code vulnerability alerts.
+- Recommended fields:
+  - `finding_id`
+  - `category = "code-vuln"`
+  - `source_tool`
+  - `source_tool_version`
+  - `rule_id`
+  - `rule_name`
+  - `message`
+  - `severity`
+  - `precision`
+  - `security_severity`
+  - `cwe_ids`
+  - `owasp_tags`
+  - `primary_location`
+  - `related_locations`
+  - `code_flow_count`
+  - `partial_fingerprints`
+  - `help_uri`
+  - `help_markdown`
+  - `triage_state`
+  - `verifier_verdict`
+- Public-safe default:
+  - Committed paths are restricted to synthetic-only.
+  - Private proof paths allow only hash/redaction.
+ - Raw source snippet은 저장하지 않는다. + - Real GHAS URL, alert URL, user object, private path는 저장하지 않는다. +- Module boundary: + - `core/vulnerability` 또는 equivalent 별도 module에 둔다. + - 기존 `core/finding`의 secret-specific `Finding`을 일반화하지 않는다. + - Wire category value는 `code-vuln`으로 고정한다. + +### Finding identity + +`finding_id`는 deterministic이어야 한다. + +우선순위: + +1. SARIF `partialFingerprints` 중 tool-stable fingerprint가 있으면 + `source_tool + rule_id + fingerprint`를 hash material로 쓴다. +2. Fingerprint가 없으면 + `source_tool + rule_id + normalized synthetic path + start_line + message`를 + fallback hash material로 쓴다. + +Committed fixture는 synthetic path만 사용한다. Private proof에서 path가 필요하면 +hash/redacted path만 허용하고 raw private path는 artifact에 쓰지 않는다. + +### `VulnerabilityFindingJsonlStore` + +- 역할: `VULN_FINDING` artifact를 line-oriented JSONL로 쓰고 읽는다. +- 입력: normalized `VulnerabilityFinding`. +- 출력: schema-versioned JSONL records. +- 책임: + - `entityType = "VULN_FINDING"`와 `schemaVersion = 1`을 기록한다. + - 같은 input fixture가 같은 line ordering과 same-value records를 만들게 한다. + - Round-trip read/write tests를 제공한다. +- 비책임: + - DynamoDB-compatible projection. + - Repo-axis/list-axis indexing. + - GHAS comparison. + +### `CodeVulnReporter` + +- 역할: `VULN_FINDING` JSONL artifact를 사람이 읽을 수 있는 report로 변환한다. +- 입력: JSONL artifact, category option. +- 출력: summary, finding rows, severity/precision breakdown. +- 책임: + - `--category code-vuln`이 명시될 때만 실행한다. + - Secret report default와 CLI output contract를 바꾸지 않는다. + - Raw SARIF body나 source snippet을 report에 복사하지 않는다. + +### `CodeVulnGate` + +- 역할: `code-vuln` finding set을 severity + precision threshold로 판정한다. +- 입력: JSONL artifact, threshold options. +- 출력: pass/fail + count summary. +- 책임: + - Secret `gate --max 0` semantics와 분리한다. + - Default policy는 보수적으로 high-severity/high-precision finding fail을 기준으로 한다. + - Missing severity/precision은 fail-open이 아니라 review-required로 세분화한다. 
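The `CodeVulnGate` policy described above (severity + precision threshold, with missing metadata routed to review rather than silently passing) could be sketched as follows. This is a hedged illustration only: the severity labels, the ordering tables, the default thresholds, and the choice to make `passed` false whenever anything needs review are assumptions of this example, not the project's settled policy.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Mapping

# Assumed orderings; real SARIF metadata may use different labels.
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}
PRECISION_ORDER = {"low": 0, "medium": 1, "high": 2, "very-high": 3}


@dataclass(frozen=True)
class GateResult:
    passed: bool
    failing: int
    review_required: int


def evaluate_gate(
    findings: Iterable[Mapping[str, Any]],
    min_severity: str = "high",
    min_precision: str = "high",
) -> GateResult:
    """Count findings at or above both thresholds; unknown metadata needs review."""
    failing = 0
    review_required = 0
    for finding in findings:
        severity = SEVERITY_ORDER.get(finding.get("severity") or "")
        precision = PRECISION_ORDER.get(finding.get("precision") or "")
        if severity is None or precision is None:
            # Not fail-open: missing or unrecognized values are surfaced.
            review_required += 1
        elif (
            severity >= SEVERITY_ORDER[min_severity]
            and precision >= PRECISION_ORDER[min_precision]
        ):
            failing += 1
    return GateResult(
        passed=failing == 0 and review_required == 0,
        failing=failing,
        review_required=review_required,
    )
```

Keeping `failing` and `review_required` as separate counts lets a report distinguish a confirmed threshold breach from a finding that merely lacks metadata, and nothing here touches the secret `gate --max 0` path.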
+ +### `CodeVulnEvaluator` + +- 역할: synthetic vulnerable corpus의 expected findings와 normalized findings를 비교한다. +- 입력: expected finding contract, actual `VULN_FINDING` artifact. +- 출력: TP/FP/FN summary와 deterministic regression signal. +- 책임: + - Analyzer 전후와 verifier 전후 지표를 분리할 수 있게 한다. + - Real source code나 real finding evidence를 fixture로 요구하지 않는다. + +### `SemgrepCompatibleAdapter` + +- 역할: Semgrep CE-compatible CLI를 실행해 SARIF output을 만든다. +- 입력: local checkout, rule pack/config, output path. +- 출력: SARIF file. +- 책임: + - Engine-specific command construction과 output capture를 adapter 내부로 제한한다. + - `semgrep` CLI를 기본 command로 사용하되, command runner를 주입 가능하게 해 tests에서 fake runner를 쓴다. + - Internal model은 engine-specific JSON이 아니라 SARIF importer를 통해 받는다. + - Analyzer failure는 partial success로 포장하지 않는다. + - Analyzer stdout/stderr failure detail은 public-safe하게 sanitize한 뒤 CLI로 전달한다. +- 비책임: + - Semgrep 설치 자동화. + - GHAS upload. + - Dependency/SCA scan. + +### `VulnerabilityVerifier` + +- 역할: analyzer가 만든 `VULN_FINDING`에 review assistance를 붙인다. +- 입력: + - rule id/name + - CWE/OWASP tags + - analyzer severity/precision + - primary location의 redacted shape + - trace cardinality + - rule help summary +- 출력: + - strict JSON verdict + - confidence + - public-safe reason + - optional generic remediation guidance +- 재사용: + - 기존 secret verifier의 strict JSON parsing, confidence threshold, fail-closed behavior는 공유할 수 있다. +- 금지: + - raw code snippet 요구. + - private path 또는 real repo name prompt 포함. + - finding 삭제, terminal dismissal, code patch 생성. + +### GHAS relationship + +현재 `GHAS_ALERT`는 secret scanning comparison metadata다. Code scanning alert와 같은 +entity로 섞지 않는다. 향후 필요하면 `GHAS_CODE_ALERT` 같은 별도 record를 설계한다. +`GHAS_CODE_ALERT`는 repo-level read-only fetch only, no workflow dispatch, +no SARIF upload, no alert dismissal, no raw response persistence를 기본 guardrail로 +둔다. 이번 goal에는 GHAS 호출이 없다. + +## CLI and runtime proposal + +기존 `scan` default는 secret scan으로 유지한다. 
+ +Product-complete CLI surface: + +- `import-sarif`: SARIF file을 `VULN_FINDING` JSONL artifact로 정규화. +- `import-sarif --path-policy redacted`: relative path까지 hash하는 private-proof import. +- `scan-vuln`: Semgrep-compatible adapter를 실행하고 SARIF importer로 연결한다. 실제 local checkout scan이므로 기본 path policy는 `redacted`다. +- `report --category code-vuln`: code vulnerability artifact를 report. +- `gate --category code-vuln`: severity/precision threshold 기반 gate. +- `evaluate --category code-vuln`: expected vulnerable corpus와 비교. +- `verify --category code-vuln` 또는 equivalent verifier entrypoint: redacted metadata 기반 LLM review assistance. + +명령 이름은 repo의 기존 CLI 구조를 따라 implementation 중 결정하되, secret default behavior를 +바꾸는 이름/alias는 금지한다. + +## Data flow + +### M1 Import flow + +1. Synthetic SARIF fixture를 준비한다. +2. `SarifImporter`가 SARIF records를 읽는다. +3. Normalizer가 path policy에 따라 synthetic relative path를 보존하거나 redacted hash로 바꾼다. `scan-vuln`은 기본적으로 redacted hash를 사용하고, `import-sarif` synthetic fixture 경로만 opt-in 없이 보존한다. +4. Normalizer가 `VulnerabilityFinding`을 만든다. +5. `VulnerabilityFindingJsonlStore`가 `VULN_FINDING` JSONL artifact를 만든다. +6. Round-trip tests가 artifact determinism을 검증한다. + +### M2 Report/gate/evaluate flow + +1. User가 `--category code-vuln`을 명시한다. +2. JSONL reader가 `VULN_FINDING` artifact를 읽는다. +3. Reporter는 severity/precision/rule/CWE summary를 만든다. +4. Gate는 severity + precision threshold로 pass/fail을 반환한다. +5. Evaluator는 expected finding contract와 비교한다. + +### M3 Analyzer execution flow + +1. User가 명시 opt-in command를 실행한다. +2. `SemgrepCompatibleAdapter`가 local checkout에서 analyzer를 실행한다. +3. Analyzer output은 SARIF로 저장된다. +4. 이후 flow는 M1 importer와 동일하다. + +### M4 Verifier flow + +1. `VULN_FINDING`에서 public-safe metadata만 추출한다. +2. Prompt/application adapter가 raw snippet/path/repo name을 제외한다. +3. LLM response는 strict JSON으로 parse된다. +4. Invalid JSON, low confidence, timeout, transport failure는 `NEEDS_REVIEW`로 남는다. +5. 
Verifier output은 report 보조 설명이며 terminal dismissal 근거가 아니다. + +## Error handling + +- SARIF parse failure: public-safe parse error로 실패하고 raw SARIF body를 log/report에 복사하지 않는다. +- Missing rule metadata: finding을 drop하지 않고 incomplete metadata 상태로 남긴다. +- Missing location metadata: comparable location 없음으로 표시하고 aggregate에서 누락하지 않는다. +- JSONL read failure: line number와 schema error만 보고하고 raw sensitive field를 echo하지 않는다. +- Analyzer command missing: adapter configuration/runtime failure로 보고하고 importer success처럼 포장하지 않는다. +- Analyzer execution failure: analyzer adapter 단계 실패로 보고하고 partial normalized output을 성공으로 주장하지 않는다. Raw stdout/stderr detail은 sanitize한다. +- LLM timeout/transport/invalid JSON/low confidence: review-needed로 남긴다. +- GHAS access path: 이번 goal에서는 호출하지 않는다. + +## Testing strategy + +M1: + +- `VulnerabilityFinding` serialization excludes secret/raw snippet fields. +- SARIF importer handles minimal valid result, multiple rules, missing optional metadata, + primary location, code flow count, partial fingerprints. +- Deterministic `finding_id`. +- JSONL writer/reader round-trip. +- `import-sarif` CLI writes expected artifact from synthetic fixture. + +M2: + +- `report --category code-vuln` reads artifact and summarizes without changing secret default. +- `gate --category code-vuln` enforces severity + precision policy. +- `evaluate --category code-vuln` compares expected vs actual deterministically. +- Existing secret report/gate/evaluate tests still pass. + +M3: + +- Semgrep-compatible adapter builds expected command with SARIF output. +- Fake command runner tests success and failure. +- Adapter output is consumed through `SarifImporter`. +- Analyzer-specific JSON is not used as internal contract. + +M4: + +- Prompt excludes raw source snippet, private path, real repo name. +- Strict JSON accepted. +- Invalid output, timeout, low confidence become review-needed. +- Remediation text is generic and non-terminal. 
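The M4 expectations above (strict JSON accepted; invalid output, timeout, and low confidence become review-needed) can be illustrated with a small fail-closed parser. This is a sketch under assumptions: only `NEEDS_REVIEW` comes from the spec, while the other verdict labels, the three-key response schema, the `parse_verdict` name, and the `0.8` threshold are hypothetical.

```python
import json
from typing import Any, Optional

# Only NEEDS_REVIEW is named in the spec; the other labels are placeholders.
ALLOWED_VERDICTS = {"LIKELY_TRUE_POSITIVE", "LIKELY_FALSE_POSITIVE", "NEEDS_REVIEW"}
EXPECTED_KEYS = {"verdict", "confidence", "reason"}


def _needs_review(reason: str) -> dict[str, Any]:
    return {"verdict": "NEEDS_REVIEW", "confidence": 0.0, "reason": reason}


def parse_verdict(raw: Optional[str], min_confidence: float = 0.8) -> dict[str, Any]:
    """Accept only a strict, well-typed JSON object; otherwise fail closed."""
    if raw is None:
        # Timeout or transport failure: the caller passes None.
        return _needs_review("no response")
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return _needs_review("invalid JSON")
    if not isinstance(data, dict) or set(data) != EXPECTED_KEYS:
        return _needs_review("unexpected shape")
    verdict = data["verdict"]
    confidence = data["confidence"]
    reason = data["reason"]
    if not isinstance(verdict, str) or verdict not in ALLOWED_VERDICTS:
        return _needs_review("unknown verdict")
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return _needs_review("invalid confidence")
    if not isinstance(reason, str):
        return _needs_review("invalid reason")
    if not 0.0 <= confidence <= 1.0 or confidence < min_confidence:
        return _needs_review("low confidence")
    return {"verdict": verdict, "confidence": float(confidence), "reason": reason}
```

Every rejection path returns the same review-needed shape, so the caller never has to interpret a partially valid response, and nothing in the result can dismiss or patch a finding on its own.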
+
+Cross-cutting:
+
+- Public-safety tests confirm fixtures are synthetic and no real GHAS export/private finding appears.
+- Architecture review gate runs pre-implementation, post-M2, post-M3, final.
+
+## Milestones
+
+- M1a: `VulnerabilityFinding` model + SARIF fixture importer contract.
+- M1b: `VULN_FINDING` JSONL artifact writer/reader + round-trip tests.
+- M1c: `import-sarif` CLI + public-safety tests.
+- M2a: Category-aware report from `VULN_FINDING` artifact.
+- M2b: Category-aware severity/precision gate.
+- M2c: Category-aware evaluation from synthetic expected findings.
+- M2 gate: Codebase architecture review.
+- M3a: Semgrep-compatible command runner abstraction.
+- M3b: Adapter emits SARIF and reuses importer.
+- M3 gate: Adapter architecture review.
+- M4a: Vulnerability verifier prompt/input adapter.
+- M4b: Strict JSON/fail-closed verifier result handling.
+- M4c: Redaction/leakage regression tests.
+- Final gate: Architecture review + full governance/test/public-safety checks.
+
+## Architecture review gate
+
+Architecture review is mandatory and blocking.
+
+Required checkpoints:
+
+1. Pre-implementation: spec/design/workflow, write surface, stop conditions.
+2. Post-M2: model/artifact/report/gate/evaluate seam and secret default preservation.
+3. Post-M3: adapter boundary and canonical SARIF importer use.
+4. Final: M1~M4 evidence, tests, public-safety, rollback boundaries, docs.
+
+Blocking finding categories:
+
+- SoT change required.
+- Scope expansion outside allowed writes.
+- Raw snippet/private path persistence risk.
+- GHAS live fetch/upload/mutation pressure.
+- Existing secret workflow behavior change.
+- Analyzer-specific JSON becoming canonical.
+
+Non-blocking review comments are fixed inside the same long goal without asking for user approval.
+
+## Rollback criteria
+
+- Roll back if Phase 2a ends up promoted as a default feature, as if it were one of the currently supported features in the README.
+- Roll back if the existing secret scan/report/gate/evaluate defaults change.
+- Roll back if `Finding` is broadly generalized in a way that destabilizes secret-specific tests.
+- Roll back if analyzer-specific JSON becomes the internal canonical contract.
+- Roll back if GHAS scan trigger, upload, mutation, or dismissal automation is introduced.
+- Roll back if the LLM receives raw snippets or becomes the detector/auto-dismissal owner.
+- Roll back if SCA/SBOM implementation gets mixed into the M1~M4 acceptance criteria.
+- Roll back if the architecture review gate is skipped.
+
+## Do not build in this goal
+
+- Existing secret `Finding` migration.
+- Gitleaks default scanner replacement.
+- GHAS live fetch or upload.
+- GHAS alert mutation, dismissal, campaign automation.
+- Real GHAS alert/export fixture.
+- Real source snippet, real repository path, real private finding sample.
+- Raw-code LLM detector.
+- Public-safe mode code patch generator.
+- SCA/SBOM/dependency vulnerability implementation.
+- DynamoDB-compatible projection or managed storage migration.
+
+## Follow-up decisions
+
+- GHAS code scanning alert read-only comparison: Phase 2b or a separate GHAS comparison track.
+- SCA/SBOM/dependency vulnerability scanner: at the point a separate product line starts.
diff --git a/docs/workbench/specs/phase-2a-sarif-native-sast/requirements.html b/docs/workbench/specs/phase-2a-sarif-native-sast/requirements.html
new file mode 100644
index 0000000..57061d8
--- /dev/null
+++ b/docs/workbench/specs/phase-2a-sarif-native-sast/requirements.html
@@ -0,0 +1,190 @@
+ + + + + + Phase 2a SARIF-native SAST Requirements + + + +
+Phase 2a SARIF-native SAST Requirements
+
+requirements.md is the source of truth. This HTML is a preview companion for quickly reviewing the M1~M4 product-complete scope.
+
+Key decisions
+
+- M1~M4 product-complete
+- SARIF-first
+- opt-in code-vuln
+- separate VULN_FINDING
+- JSONL artifact-first
+- Semgrep CE-compatible
+- LLM verifier/explainer
+- architecture gate mandatory
+- GHAS trigger prohibited
+
+Keep
+
+- Gitleaks-first Secret Detection default path
+- Existing secret Finding model
+- Local-first, synthetic-first evaluation
+
+Complete
+
+- M1 import-sarif + JSONL artifact
+- M2 report/gate/evaluate --category code-vuln
+- M3 Semgrep-compatible adapter
+- M4 vulnerability verifier/explainer
+
+Exclude
+
+- GHAS scan trigger/upload/mutation/live fetch
+- Raw snippet-based LLM detector
+- SCA/SBOM/dependency scanner
+
+Acceptance criteria summary
+
+| Area | Requirement |
+| --- | --- |
+| Goal | Close M1~M4 as the single long goal phase-2a-sarif-product-complete. |
+| Architecture gate | Run a mandatory/blocking review at the pre-implementation, post-M2, post-M3, and final stages. |
+| Model boundary | Keep the secret Finding and put code vulnerabilities in a separate entity. |
+| Analyzer | The Semgrep CE-compatible CLI produces SARIF and reuses the canonical importer. |
+| Safety boundary | Do not leave real GHAS alerts/exports, raw source snippets, private paths, or secrets in committed fixtures. |
+
+Do not build in this goal
+
+
+
diff --git a/docs/workbench/specs/phase-2a-sarif-native-sast/requirements.md b/docs/workbench/specs/phase-2a-sarif-native-sast/requirements.md
new file mode 100644
index 0000000..83a1373
--- /dev/null
+++ b/docs/workbench/specs/phase-2a-sarif-native-sast/requirements.md
@@ -0,0 +1,160 @@
+# Phase 2a SARIF-native SAST Requirements
+
+## Approval target
+
+- Source of truth: `requirements.md`
+- Preview companion: `requirements.html`
+- Design companion: `design.md`
+- Review companion: `review.md`
+- Execution packet: `docs/workbench/agentic-workflows/2026-06-20-phase-2a-sarif-import-first-goal.md`
+- Scope: the existing M1 import-first packet is not discarded; the same packet is
+  promoted to an M1~M4 product-complete long-single-goal.
+- Status: a product-complete preparation document approved for implementation. No user approval is
+  required at intermediate milestones; execution stops only when a stop condition occurs.
+
+## Background
+
+The product's current public default path is Gitleaks-first Secret Detection against a local
+checkout. This path has been stabilized around `Finding`, the local store, report/gate, synthetic
+evaluation, and the redacted-metadata verifier.
+
+The conclusion of the GHAS-like vulnerability scanning research is to add a SARIF-native SAST
+product slice as a separate opt-in category, without generalizing or replacing the existing secret
+path. This long-single-goal is not a broad security platform; it closes the following four
+milestones in one pass.
+
+- M1: SARIF import -> `VULN_FINDING` JSONL artifact.
+- M2: opt-in `code-vuln` support in `report` / `gate` / `evaluate`.
+- M3: the Semgrep-compatible analyzer adapter produces SARIF and reuses the canonical importer.
+- M4: vulnerability LLM verifier/explainer based on redacted metadata.
+
+GHAS is not an execution actor in this scope. Scan trigger, SARIF upload, alert mutation, and
+live fetch are prohibited. SCA/SBOM/dependency vulnerability is also kept as a separate future track.
+
+## Product boundary
+
+Included.
+
+- Synthetic SARIF 2.1.0 fixture ingestion.
+- Separate `VULN_FINDING` / `VulnerabilityFinding` domain model.
+- Deterministic JSONL artifact writer/reader.
+- Opt-in `import-sarif` CLI.
+- Opt-in `code-vuln` report/gate/evaluate.
+- Default adapter for the Semgrep CE-compatible CLI.
+- Vulnerability verifier/explainer with strict JSON/fail-closed behavior.
+- Architecture review gate: pre-implementation, post-M2, post-M3, final.
+
+Not included.
+
+- Broad generalization or migration of the existing secret `Finding`.
+- Replacing Gitleaks as the primary scanner.
+- Changes to the existing secret scan/report/gate/evaluate default behavior.
+- GHAS live fetch, workflow dispatch, SARIF upload, alert dismissal, mutation.
+- Real SARIF/GHAS export/finding fixture.
+- LLM detector or code patch generator based on raw source snippets.
+- SCA/SBOM/dependency vulnerability scanner.
+- DynamoDB-compatible projection or storage schema migration.
+
+## Functional requirements
+
+- FR-1: Do not change the current Secret Detection default behavior of `scan`, `report`, `gate`, and `evaluate`.
+- FR-2: Phase 2a is exposed only through the explicit opt-in `code-vuln` category.
+- FR-3: The existing secret `Finding` model stays as the Phase 1 entity; code vulnerability alerts get a separate `VulnerabilityFinding` entity.
+- FR-4: The primary interchange contract for Phase 2a is a SARIF-compatible model.
+- FR-5: M1 normalizes a synthetic SARIF fixture or SARIF file into a deterministic `VULN_FINDING` JSONL artifact.
+- FR-6: The `VULN_FINDING` artifact must be line-oriented, schema-versioned JSONL.
+- FR-7: M1 persistence is fixed as JSONL artifact-first. DynamoDB-compatible projection is outside this goal.
+- FR-8: `finding_id` prefers SARIF `partialFingerprints`; when they are absent, it falls back to a stable hash of `source_tool + rule_id + normalized synthetic path + start_line + message`.
+- FR-9: Committed fixture paths stay synthetic-only. When private proof is needed, only hashed/redacted paths are allowed.
+- FR-9a: SARIF import distinguishes the `synthetic` path policy from the `redacted` path policy. `synthetic` preserves public fixture relative paths; `redacted` hashes even relative paths.
+- FR-9b: Because `scan-vuln` targets a real local checkout, its default path policy must be `redacted`. `import-sarif` may keep `synthetic` as its default for committed synthetic fixtures.
+- FR-10: M2 adds `--category code-vuln` or an equivalent opt-in to `report`, `gate`, and `evaluate`.
+- FR-11: The `code-vuln` gate uses severity + precision thresholds and does not mix with the secret `gate --max 0` semantics.
+- FR-12: `code-vuln` evaluation is validated against a synthetic vulnerable corpus and the expected finding contract.
+- FR-13: M3 uses the Semgrep CE-compatible CLI command as the default engine target.
+- FR-14: The M3 adapter boundary stays Semgrep-compatible, leaving room to swap in OpenGrep.
+- FR-15: Analyzer output must pass through the canonical SARIF importer; analyzer-specific JSON cannot become an internal contract.
+- FR-15a: Analyzer subprocess failure detail is sanitized to be public-safe at the adapter seam; raw stdout/stderr is not echoed as-is to the CLI or report.
+- FR-16: The M4 LLM is a verifier/explainer/generic remediation assistant, not a detector.
+- FR-17: M4 LLM input is limited to rule id/name, CWE/OWASP tags, analyzer severity/precision, redacted location shape, trace cardinality, and rule help summary.
+- FR-18: M4 LLM output follows strict JSON, a confidence threshold, and fail-closed `NEEDS_REVIEW` behavior.
+- FR-19: LLM results cannot be the sole basis for finding deletion, suppression, terminal dismissal, or code patches.
+- FR-20: GHAS code scanning comparison is a future track; this goal performs no fetch, trigger, upload, or mutation.
+- FR-21: SCA/SBOM/dependency vulnerability is kept as a future track.
+- FR-22: The architecture review gate is a mandatory blocking check. It runs at the pre-implementation, post-M2, post-M3, and final stages.
+- FR-23: It is a stop condition if architecture review requires an SoT change, scope expansion, unsafe data handling, or a change to existing secret behavior.
+- FR-24: Multi-agent execution is allowed and recommended. Implementation workers follow the subagent model criteria in the repo policy.
+
+## Non-functional requirements
+
+| Item | Requirement |
+| --- | --- |
+| Public repo safety | Do not record real organization names, hosts, repository names, private paths, raw source snippets, real GHAS alerts/exports, secrets, or credentials. |
+| Scope preservation | The README's currently supported scope stays Phase 1 Secret Detection. Phase 2a is described only as an opt-in extension. |
+| Local-first | Every implementation must be reproducible with a local checkout and synthetic fixtures. |
+| Reproducibility | The same SARIF fixture must produce the same normalized vulnerability finding, report, gate, and evaluation results. |
+| Fail-closed LLM | Invalid output, timeout, transport error, and low confidence remain in the review-needed state. |
+| Analyzer replaceability | Analyzer output must pass through the SARIF-compatible boundary; analyzer-specific fields are isolated as optional metadata. |
+| Testability | Keep module seams that allow a failing test to be written first for each of M1~M4. |
+| YAGNI | GHAS lifecycle automation, campaign management, cloud rollout, and managed storage migration are not included. |
+
+## Acceptance criteria
+
+- AC-1: `requirements.md`, `requirements.html`, `design.md`, and `review.md` exist in the same spec directory.
+- AC-2: The single-goal workflow includes the M1~M4 product-complete scope, allowed write surface, validation commands, stop conditions, and resume prompt.
+- AC-3: The active goal in `governance/autopilot_goal.yml`, `governance/current.yml`, and `CURRENT.md` consistently reads `phase-2a-sarif-product-complete`.
+- AC-4: The architecture review gate is specified as a mandatory check at pre-implementation, post-M2, post-M3, and final.
+- AC-5: The docs state that the existing Gitleaks-first Secret Detection default path and the `Finding` model are kept.
+- AC-6: The docs define Phase 2a as SARIF-first, opt-in `code-vuln`, with a separate `VULN_FINDING`/`VulnerabilityFinding` boundary.
+- AC-7: The docs separate the Semgrep CE-compatible first engine target from the Semgrep-compatible adapter boundary.
+- AC-8: The docs prohibit GHAS scan trigger, upload, alert mutation, and live fetch.
+- AC-9: The docs restrict the LLM to verifier/explainer/generic remediation assistant, not detector.
+- AC-10: The docs exclude SCA/SBOM/dependency vulnerability as a future track.
+- AC-11: The public-safety check finds nothing that looks like a real secret, private repo/path, internal endpoint, or real GHAS alert/export.
+- AC-12: Architecture gate evidence is recorded for each of the pre-implementation, post-M2, post-M3, and final stages.
+- AC-13: Governance validation, full tests, public-safety, and the autopilot gate pass.
+
+## Rollback criteria
+
+- RB-1: Revert if the README or public docs describe Phase 2a as a default runtime feature.
+- RB-2: Revert if a change to the existing `scan` default behavior or secret report/gate semantics is required.
+- RB-3: Revert if there is an attempt to immediately generalize the existing `Finding` to cover code vulnerabilities.
+- RB-4: Revert if analyzer-specific JSON becomes the internal canonical contract.
+- RB-5: Revert if GHAS scan execution, SARIF upload, alert state mutation, or security campaign automation is included.
+- RB-6: Revert if the LLM receives raw snippets or becomes the detector/auto-dismissal owner.
+- RB-7: Revert if SCA/SBOM implementation gets mixed into the M1~M4 acceptance criteria.
+- RB-8: Revert if the architecture review gate is skipped or downgraded to a non-blocking advisory.
+
+## What we will not build
+
+- Existing secret `Finding` migration.
+- Replacing the Gitleaks default scanner.
+- GHAS live fetch, code scanning SARIF upload, alert mutation, dismissal, campaign automation.
+- Real GHAS alert/export fixture, real source snippet, real repository path, real private finding sample.
+- Raw-code LLM detector, code patch generator in public-safe mode.
+- SCA/SBOM/dependency vulnerability implementation.
+- DynamoDB-compatible projection or managed storage schema migration.
+- Public docs claiming SAST is supported by default today.
+
+## User scenarios
+
+- A maintainer brings in opt-in `code-vuln` SAST results separately while keeping the existing Secret Detection default.
+- An implementer builds the SARIF-compatible fixture and normalized finding contract first, before any debate over analyzer choice.
+- A reviewer checks that the report/gate/evaluate category extension did not change the meaning of the secret gate.
+- A reviewer checks that the Semgrep-compatible adapter did not bypass the canonical SARIF importer.
+- A security reviewer checks that no raw snippet, private path, or real repo name remains in the LLM prompt/output.
+- An architecture reviewer inspects module seams and the product boundary as a blockable gate at the pre-implementation, post-M2, post-M3, and final stages.
+
+## Fixed decisions
+
+- First engine target: Semgrep CE-compatible CLI `semgrep`.
+- Adapter abstraction: Semgrep-compatible.
+- Persistence: JSONL artifact-first.
+- Gate semantics: severity + precision threshold.
+- Location policy: committed fixtures are synthetic-only; private proof is hashed/redacted only.
+- Human gate: stop-conditions-only.
+
+## Follow-up decisions
+
+- Whether to place GHAS code scanning alert read-only comparison in Phase 2b or in a separate GHAS comparison track.
+- When to start the SCA/SBOM/dependency vulnerability scanner as a separate product line.
diff --git a/docs/workbench/specs/phase-2a-sarif-native-sast/review.md b/docs/workbench/specs/phase-2a-sarif-native-sast/review.md
new file mode 100644
index 0000000..ed4f007
--- /dev/null
+++ b/docs/workbench/specs/phase-2a-sarif-native-sast/review.md
@@ -0,0 +1,178 @@
+# Phase 2a SARIF-native SAST Multi-agent Review
+
+## Scope
+
+Review targets:
+
+- `docs/workbench/specs/phase-2a-sarif-native-sast/requirements.md`
+- `docs/workbench/specs/phase-2a-sarif-native-sast/design.md`
+- `docs/workbench/agentic-workflows/2026-06-20-phase-2a-sarif-import-first-goal.md`
+- `docs/views/research-and-technical-decisions.md`
+- `governance/autopilot_goal.yml`
+
+Review method:
+
+- `system_architecture_manager`: system boundary, GHAS/LLM/rollback risk.
+- `codebase_architecture_manager`: current code seams, implementation order.
+- `autopilot readiness auditor`: governance goal, write scope, checks, stop conditions.
+
+All subagent reviews are performed read-only. Implementation is handled by the long-single-goal execution agent.
+
+## Current verdict
+
+The M1 packet is not discarded. The existing M1 import-first packet is promoted to the
+`phase-2a-sarif-product-complete` long-single-goal.
+
+The M1-only recommendation remains valid as risk analysis, but the user's execution intent is M1~M4 product completion.
+The approval model is therefore adjusted as follows.
+
+- No user approval is required at intermediate milestones.
+- The architecture review gate is a mandatory blocking check.
+- When a blocker comes up in the pre-implementation, post-M2, post-M3, or final review, it is fixed within
+  the same goal; execution stops on a stop condition only when an SoT/scope/safety change is needed.
+
+## Findings
+
+### High: M1 persistence boundary must stay artifact-first
+
+The initial spec recommended `JSONL artifact-first` but did not settle whether `VULN_FINDING` should
+start as JSONL artifact-only or extend to a DynamoDB-compatible projection. The current storage and
+`Finding` path carry many secret-specific invariants, so store-first has a large blast radius.
+
+Applied:
+
+- M1 persistence was fixed as `VULN_FINDING` JSONL artifact-first.
+- DynamoDB-compatible projection was moved outside this M1~M4 goal.
+- The `VulnerabilityFindingJsonlStore` seam was added to the design.
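The artifact-first decision implies a small writer/reader seam. The sketch below shows one way a line-oriented, schema-versioned JSONL round trip might work. It is not the project's `VulnerabilityFindingJsonlStore`: the class name `SketchVulnJsonlStore`, the `schema_version` and `record_type` envelope fields, the version number, and the plain-dict record shape are all hypothetical and chosen only to illustrate determinism and public-safe read errors.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, List

SCHEMA_VERSION = 1  # assumed version number
RECORD_TYPE = "VULN_FINDING"
_ENVELOPE_KEYS = ("schema_version", "record_type")


class SketchVulnJsonlStore:
    """Hypothetical line-oriented store: one JSON object per line."""

    def __init__(self, path: str) -> None:
        self._path = Path(path)

    def write_all(self, findings: Iterable[Dict]) -> int:
        # Sort rows and keys so identical input always yields identical bytes.
        rows = sorted(findings, key=lambda row: row["finding_id"])
        lines = [
            json.dumps(
                {"schema_version": SCHEMA_VERSION, "record_type": RECORD_TYPE, **row},
                sort_keys=True,
                ensure_ascii=False,
            )
            for row in rows
        ]
        self._path.write_text("".join(line + "\n" for line in lines), encoding="utf-8")
        return len(lines)

    def read_all(self) -> List[Dict]:
        findings: List[Dict] = []
        with self._path.open(encoding="utf-8") as handle:
            for line_number, line in enumerate(handle, start=1):
                if not line.strip():
                    continue
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    # Report position only; never echo the raw line.
                    raise ValueError(f"line {line_number}: invalid JSON") from None
                if not isinstance(record, dict):
                    raise ValueError(f"line {line_number}: record is not an object")
                if record.get("schema_version") != SCHEMA_VERSION:
                    raise ValueError(f"line {line_number}: unsupported schema_version")
                if record.get("record_type") != RECORD_TYPE:
                    raise ValueError(f"line {line_number}: unexpected record_type")
                findings.append(
                    {k: v for k, v in record.items() if k not in _ENVELOPE_KEYS}
                )
        return findings
```

Raising `ValueError` with only a line number matches the error-handling rule that JSONL read failures report position and schema problems without echoing sensitive fields.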
+
+### High: Broad Phase 2a single-goal requires architecture review gates
+
+Bundling `report/gate/evaluate --category code-vuln`, the Semgrep-compatible adapter, the verifier, and
+the synthetic corpus into one pass exceeds the current CLI/core seam. The original review
+recommended M1-only.
+
+Applied:
+
+- Per the user's explicit instruction, M1~M4 was promoted to a single long-single-goal.
+- The architecture review gate was locked in as a mandatory blocking check at the
+  pre-implementation, post-M2, post-M3, and final stages.
+- A blocking architecture finding is handled as a same-goal fix or a stop condition, not as a
+  user approval gate.
+- The file name `docs/workbench/agentic-workflows/2026-06-20-phase-2a-sarif-import-first-goal.md`
+  was kept; only the content was promoted to the product-complete packet.
+
+### High: GHAS code scanning boundary needs stronger guardrails
+
+The existing GHAS path centers on secret scanning comparison. Code scanning alert comparison carries
+separate endpoint/permission/raw response risk, so it must not be mixed into the same `GHAS_ALERT`.
+
+Applied:
+
+- GHAS live fetch, workflow dispatch, SARIF upload, alert dismissal, and mutation are all
+  prohibited in this goal.
+- A future `GHAS_CODE_ALERT` is designed only in a separate future track.
+
+### Medium: Finding identity needs canonical rule
+
+The spec required a deterministic `finding_id` but had no SARIF field priority or fallback
+rule.
+
+Applied:
+
+- `finding_id` priority was fixed: SARIF `partialFingerprints` first, otherwise a fallback hash of
+  `source_tool + rule_id + normalized synthetic path + start_line + message`.
+- Committed fixtures allow synthetic-only paths; private proof allows only hashed/redacted paths.
+
+### Medium: Semgrep-compatible adapter must not become the internal contract
+
+The Semgrep CE-compatible CLI was fixed as the M3 default engine target, but if adapter output is
+tied to engine-specific JSON, the SARIF-first design breaks.
+
+Applied:
+
+- The adapter must produce SARIF output and reuse the canonical `SarifImporter`.
+- Fake command runner tests are required so that command construction and failure are verified without installing the engine.
+- The adapter boundary stays Semgrep-compatible to allow swapping in OpenGrep.
+
+### Medium: Vulnerability verifier must not reuse secret prompt directly
+
+The current verifier prompt and adapter are tied to the secret-specific `Finding`. Strict JSON,
+confidence, and fail-closed behavior are reusable, but the prompt/application must be separate.
+
+Applied:
+
+- The design now requires a separate prompt/application adapter for `VulnerabilityVerifier`.
+- Behavior shared with the existing secret verifier was limited to parser/result handling.
+- Leakage tests for raw source snippets, private paths, and real repo names were included in M4 acceptance.
+
+### Low: Research decision note could be misread as global filter policy
+
+The noise filter decision in `research-and-technical-decisions.md` applies only to the Gitleaks parser,
+but after adding Phase 2a it could be read as a policy for all scanners.
+
+Applied:
+
+- The section title was narrowed to `Secret Detection 노이즈 필터 위치 결정` (Secret Detection noise filter
+  placement decision), and the note states that Phase 2a decides this in the separate `VULN_FINDING` contract.
+
+## Fixed decisions for execution
+
+| Decision | Value | Rationale |
+| --- | --- | --- |
+| Goal ID | `phase-2a-sarif-product-complete` | Promotes the M1 packet to a product-complete long goal. |
+| Scope | M1~M4 | Import, report/gate/evaluate, Semgrep-compatible adapter, and verifier/explainer together form a countable product slice. |
+| Persistence | JSONL artifact-first | The current storage carries many secret-specific invariants. |
+| Path handling | Committed fixture import is synthetic-only; a `scan-vuln` scan of a real checkout is hashed/redacted only by default | Preserves public repo safety and reproducibility at the same time. |
+| Gate policy | severity + precision threshold | Does not mix with the secret `gate --max 0` meaning. |
+| First engine target | Semgrep CE-compatible CLI `semgrep` | The most practical default for a local-first SAST adapter. |
+| Adapter boundary | Semgrep-compatible + SARIF importer | Keeps OpenGrep replaceability and the SARIF-first contract. |
+| Human gate | stop-conditions-only | Long-running execution without intermediate approval gates. |
+| Architecture gate | mandatory/blocking | Controls the blast radius of the product-complete scope. |
+
+## Architecture review gate checklist
+
+Pre-implementation:
+
+- The spec/design/workflow agree on the M1~M4 scope and allowed write surface.
+- There is no requirement that changes the existing secret default behavior.
+- The GHAS/SCA/raw snippet boundary is clear.
+
+Post-M2:
+
+- `VulnerabilityFinding` is separated from the secret `Finding`.
+- The JSONL artifact is the canonical code-vuln persistence.
+- The report/gate/evaluate opt-in does not break the secret default.
+
+Post-M3:
+
+- Semgrep-compatible adapter output is SARIF.
+- The canonical `SarifImporter` is not bypassed.
+- Analyzer-specific JSON is not the internal contract.
+
+Final:
+
+- M1~M4 tests, governance checks, and public-safety checks pass.
+- No architecture blocker remains.
+- The PR summary contains architecture gate evidence.
+
+## Stop conditions
+
+Stop and ask a human only in the following cases.
+
+- A public-safety hit cannot be resolved without redacting committed content.
+- A change outside the allowed write surface is needed.
+- Architecture review requires an SoT change or scope expansion.
+- GHAS trigger/fetch/upload/mutation becomes necessary.
+- Persistence of raw source snippets, private paths, or real SARIF/GHAS exports becomes necessary.
+- The existing secret workflow default behavior changes.
+- A new third-party Python dependency is needed.
+- Storage projection/schema migration is needed.
+- A protected branch/PR permission failure occurs.
+- The same blocker repeats three times.
+
+## Autopilot readiness verdict
+
+Ready. The current packet does not discard M1 import-first; it has been promoted to the M1~M4 product-complete scope.
+The next implementation session treats this document and the workflow as the source of truth, starts from the
+pre-implementation architecture review gate, and proceeds to the product-complete PR without intermediate user approval.
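The finding identity rule fixed in this review (SARIF `partialFingerprints` first, otherwise a stable hash of `source_tool + rule_id + normalized synthetic path + start_line + message`) could look roughly like the sketch below. Only the priority order and the fallback fields come from the spec. The function name `derive_finding_id`, the `vuln-` prefix, SHA-256, the unit-separator join, the path normalization, and hashing the tool and rule together with the fingerprint map are assumptions made for illustration.

```python
import hashlib
import json
from typing import Mapping, Optional


def _normalize_path(path: str) -> str:
    # Assumed normalization: forward slashes and no leading "./".
    normalized = path.replace("\\", "/")
    while normalized.startswith("./"):
        normalized = normalized[2:]
    return normalized


def derive_finding_id(
    *,
    source_tool: str,
    rule_id: str,
    path: str,
    start_line: int,
    message: str,
    partial_fingerprints: Optional[Mapping[str, str]] = None,
) -> str:
    """Return a deterministic ID, preferring SARIF partialFingerprints."""
    if partial_fingerprints:
        # Sorted keys keep the ID independent of dict insertion order.
        material = json.dumps(
            {"tool": source_tool, "rule": rule_id, "fp": dict(partial_fingerprints)},
            sort_keys=True,
            separators=(",", ":"),
        )
    else:
        # Fallback: join the five spec fields with a separator that is
        # unlikely to appear in any of them.
        material = "\x1f".join(
            [source_tool, rule_id, _normalize_path(path), str(start_line), message]
        )
    digest = hashlib.sha256(material.encode("utf-8")).hexdigest()
    return f"vuln-{digest[:32]}"
```

With fingerprints present, location and message changes do not alter the ID, which is the point of preferring them; the fallback stays stable only as long as the normalized path, line, and message stay the same.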
diff --git a/eval/synthetic-code-vuln/expected-findings.example.json b/eval/synthetic-code-vuln/expected-findings.example.json
new file mode 100644
index 0000000..b0abe01
--- /dev/null
+++ b/eval/synthetic-code-vuln/expected-findings.example.json
@@ -0,0 +1,11 @@
+{
+  "schemaVersion": 1,
+  "name": "synthetic-code-vuln-corpus",
+  "expectedFindings": [
+    {
+      "filePath": "synthetic_app/handlers.py",
+      "lineStart": 42,
+      "ruleId": "python.lang.security.audit.sql-injection"
+    }
+  ]
+}
diff --git a/governance/autopilot_goal.yml b/governance/autopilot_goal.yml
index 2043f5e..a99ff7d 100644
--- a/governance/autopilot_goal.yml
+++ b/governance/autopilot_goal.yml
@@ -1,5 +1,5 @@
 schema_version: 1
-goal_id: review-assisted-autopilot
+goal_id: phase-2a-sarif-product-complete
 execution_mode:
   style: long-single-goal
   human_gate: stop-conditions-only
@@ -15,25 +15,40 @@ policy_decisions:
   fork_prs: blocked-or-skipped-before-secrets
   public_artifacts: synthetic-or-redacted-only
 allowed_writes:
-  - .codex/specs/2026-06-17-autopilot-review-assisted/**
-  - .github/workflows/**
-  - .pr_agent.toml
+  - docs/workbench/specs/phase-2a-sarif-native-sast/**
+  - docs/workbench/agentic-workflows/2026-06-20-phase-2a-sarif-import-first-goal.md
+  - docs/views/research-and-technical-decisions.md
+  - src/security_scanner/**
+  - tests/**
+  - examples/**
+  - eval/**
   - docs/workbench/**
   - governance/**
   - ledger/**
   - CURRENT.md
-  - tests/**
 acceptance_checks:
+  - architecture-review gate: pre-implementation
+  - architecture-review gate: post-M2
+  - architecture-review gate: post-M3
+  - architecture-review gate: final
   - uv run pytest
   - uv run python -m governance.render --validate
   - uv run python -m governance.render --check
   - uv run python -m governance.rebuild_ledger_index --check
   - uv run python -m governance.render_github_ruleset --output governance/main_ruleset.json --check
   - uv run python -m governance.public_safety --diff origin/main...HEAD
+  - uv run python -m governance.public_safety --path docs/workbench/specs/phase-2a-sarif-native-sast --path docs/views/research-and-technical-decisions.md
   - uv run python -m governance.autopilot_gate --base origin/main
 stop_conditions:
   - public-safety-hit
   - scope-expansion
+  - architecture-review-blocking-finding
+  - architecture-review-sot-change
+  - new-third-party-python-dependency-required
+  - ghas-live-fetch-or-mutation-required
+  - private-path-or-raw-snippet-required
+  - storage-projection-or-schema-migration-required
+  - existing-secret-default-behavior-change
   - direct-protected-main-push-required
   - pr-agent-secret-boundary-ambiguity
   - github-settings-permission-failure
diff --git a/governance/current.yml b/governance/current.yml
index bb0647f..b06ca03 100644
--- a/governance/current.yml
+++ b/governance/current.yml
@@ -37,7 +37,7 @@ gates:
     proof_ref: ''
     proof_hash: ''
 autopilot:
-  active_goal: review-assisted-autopilot
+  active_goal: phase-2a-sarif-product-complete
   merge_mode: guarded-auto-merge
   last_auto_merge: ledger:20260617T003405Z-autopilot-3236f4
 open_decisions: []
diff --git a/src/security_scanner/cli/app.py b/src/security_scanner/cli/app.py
index 24eb15e..5f0687f 100644
--- a/src/security_scanner/cli/app.py
+++ b/src/security_scanner/cli/app.py
@@ -22,10 +22,12 @@
     storage,
     targets,
     verify,
+    vulnerability,
 )
 
 _COMMAND_MODULES = (
     scan,
+    vulnerability,
     scan_health,
     report,
     verify,
diff --git a/src/security_scanner/cli/commands/report.py b/src/security_scanner/cli/commands/report.py
index 79ffb89..36529b6 100644
--- a/src/security_scanner/cli/commands/report.py
+++ b/src/security_scanner/cli/commands/report.py
@@ -26,7 +26,18 @@
 )
 from security_scanner.core.policy.gate import GateThresholds, evaluate_gate
 from security_scanner.core.report.generator import render_report
+from security_scanner.core.vulnerability import (
+    VulnerabilityEvaluationThresholds,
+    VulnerabilityGateThresholds,
+    evaluate_vulnerability_findings,
+    evaluate_vulnerability_gate,
+    evaluate_vulnerability_gate_policy,
+    load_vulnerability_expected_findings,
+    render_vulnerability_evaluation_report,
+    render_vulnerability_report,
+)
 from security_scanner.storage.jsonl_store import JsonlFindingStore
+from security_scanner.storage.vulnerability_jsonl_store import VulnerabilityJsonlStore
 
 
 def register(subparsers) -> None:
@@ -36,6 +47,7 @@ def register(subparsers) -> None:
     )
     add_storage_args(report_parser, include_jsonl_path="findings")
     add_query_args(report_parser)
+    _add_category_arg(report_parser)
     report_parser.set_defaults(func=cmd_report)
 
     gate_parser = subparsers.add_parser(
@@ -44,6 +56,7 @@ def register(subparsers) -> None:
     )
     add_storage_args(gate_parser, include_jsonl_path="findings")
     add_query_args(gate_parser)
+    _add_category_arg(gate_parser)
     gate_parser.add_argument(
         "--max",
         type=int,
@@ -51,6 +64,18 @@ def register(subparsers) -> None:
         metavar="N",
         help="Maximum blocking findings tolerated before failing (default: 0).",
     )
+    gate_parser.add_argument(
+        "--severity-min",
+        default="HIGH",
+        metavar="LEVEL",
+        help="Minimum code-vuln severity that can block (default: HIGH).",
+    )
+    gate_parser.add_argument(
+        "--precision-min",
+        default="HIGH",
+        metavar="LEVEL",
+        help="Minimum code-vuln precision that can block (default: HIGH).",
+    )
     gate_parser.set_defaults(func=cmd_gate)
 
     evaluate_parser = subparsers.add_parser(
@@ -59,6 +84,7 @@ def register(subparsers) -> None:
     )
     add_storage_args(evaluate_parser, include_jsonl_path="findings")
     add_query_args(evaluate_parser)
+    _add_category_arg(evaluate_parser)
     evaluate_parser.add_argument(
         "--expected",
         metavar="FILE",
@@ -117,21 +143,65 @@ def register(subparsers) -> None:
 
 def cmd_report(args: argparse.Namespace) -> int:
     """Render a human-readable report from a findings store."""
-    findings = read_findings_from_args(args)
-    print(render_report(findings), end="")
-    return 0
+    try:
+        if _is_vulnerability_category(args):
+            findings = _read_vulnerability_findings_from_args(args)
+            print(render_vulnerability_report(findings), end="")
+            return 0
+        findings = read_findings_from_args(args)
+        print(render_report(findings), end="")
+        return 0
+    except ValueError as exc:
+        print(f"error: report failed: {exc}", file=sys.stderr)
+        return 2
 
 
 def cmd_gate(args: argparse.Namespace) -> int:
     """Exit non-zero when blocking findings exceed the threshold."""
-    findings = read_findings_from_args(args)
-    result = evaluate_gate(findings, GateThresholds(max_blocking=args.max))
-    print(result.reason)
-    return 0 if result.passed else 1
+    try:
+        if _is_vulnerability_category(args):
+            findings = _read_vulnerability_findings_from_args(args)
+            result = evaluate_vulnerability_gate_policy(
+                findings,
+                VulnerabilityGateThresholds(
+                    max_blocking=args.max,
+                    severity_min=args.severity_min,
+                    precision_min=args.precision_min,
+                ),
+            )
+            print(result.reason)
+            return 0 if result.passed else 1
+        findings = read_findings_from_args(args)
+        result = evaluate_gate(findings, GateThresholds(max_blocking=args.max))
+        print(result.reason)
+        return 0 if result.passed else 1
+    except ValueError as exc:
+        print(f"error: gate failed: {exc}", file=sys.stderr)
+        return 2
 
 
 def cmd_evaluate(args: argparse.Namespace) -> int:
     """Evaluate actual findings against a synthetic expected-results corpus."""
+    try:
+        if _is_vulnerability_category(args):
+            expected = load_vulnerability_expected_findings(args.expected)
+            findings = _read_vulnerability_findings_from_args(args)
+            result = evaluate_vulnerability_findings(expected, findings)
+            gate = evaluate_vulnerability_gate(
+                result,
+                VulnerabilityEvaluationThresholds(
+                    false_negative_max=args.false_negative_max,
+                    precision_min=args.precision_min,
+                    recall_min=args.recall_min,
+                ),
+            )
+            print(render_vulnerability_evaluation_report(result), end="")
+            print(gate.reason)
+            return 0 if gate.passed else 1
+    except ValueError as exc:
+        print(f"error: evaluate failed: {exc}", file=sys.stderr)
+        return 2
+
     corpus = load_evaluation_corpus(args.expected)
     findings = read_findings_from_args(args)
     thresholds = EvaluationThresholds(
@@ -200,3 +270,22 @@ def cmd_compare_ghas(args: argparse.Namespace) -> int:
 
     print(render_ghas_comparison_report(result), end="")
     return 0
+
+
+def _add_category_arg(parser: argparse.ArgumentParser) -> None:
+    parser.add_argument(
+        "--category",
+        choices=["secret", "code-vuln"],
+        default="secret",
+        help="Finding category to consume (default: secret).",
+    )
+
+
+def _is_vulnerability_category(args: argparse.Namespace) -> bool:
+    return getattr(args, "category", "secret") == "code-vuln"
+
+
+def _read_vulnerability_findings_from_args(args: argparse.Namespace):
+    if args.storage_backend != "jsonl":
+        raise ValueError("code-vuln category supports JSONL artifacts only")
+    return VulnerabilityJsonlStore(args.findings).read_all()
diff --git a/src/security_scanner/cli/commands/verify.py b/src/security_scanner/cli/commands/verify.py
index 7da8ceb..443fed7 100644
--- a/src/security_scanner/cli/commands/verify.py
+++ b/src/security_scanner/cli/commands/verify.py
@@ -11,6 +11,10 @@
     resolve_verifier_config,
     run_verify_artifact,
 )
+from security_scanner.runtime.vulnerability_verify_artifact import (
+    VerifyVulnerabilityArtifactRequest,
+    run_verify_vulnerability_artifact,
+)
 
 
 def register(subparsers) -> None:
@@ -24,6 +28,12 @@ def register(subparsers) -> None:
         default="findings.jsonl",
         help="Input JSONL findings file.",
     )
+    verify_parser.add_argument(
+        "--category",
+        choices=["secret", "code-vuln"],
+        default="secret",
+        help="Finding category to verify (default: secret).",
+    )
     verify_parser.add_argument(
         "--output",
         metavar="FILE",
@@ -77,16 +87,37 @@ def cmd_verify(args: argparse.Namespace) -> int:
                 min_confidence=args.min_confidence,
             )
         )
-        result = run_verify_artifact(
-            VerifyArtifactRequest(
+        result = _run_verifier_for_category(args, config)
+    except ValueError as exc:
+        print(f"error: {exc}", file=sys.stderr)
+        return 2
+
+    print(_verification_message(args, result))
+    return 0
+
+
+def _run_verifier_for_category(args: argparse.Namespace, config):
+    if args.category == "code-vuln":
+        return run_verify_vulnerability_artifact(
+            VerifyVulnerabilityArtifactRequest(
                 findings_path=args.findings,
                 output_path=args.output,
                 config=config,
             )
         )
-    except ValueError as exc:
-        print(f"error: {exc}", file=sys.stderr)
-        return 2
+    return run_verify_artifact(
+        VerifyArtifactRequest(
+            findings_path=args.findings,
+            output_path=args.output,
+            config=config,
+        )
+    )
 
-    print(f"Verified {result.finding_count} finding(s) -> {result.output_path}")
-    return 0
+
+def _verification_message(args: argparse.Namespace, result) -> str:
+    if args.category == "code-vuln":
+        return (
+            f"Verified {result.finding_count} code-vuln finding(s) -> "
+            f"{result.output_path}"
+        )
+    return f"Verified {result.finding_count} finding(s) -> {result.output_path}"
diff --git a/src/security_scanner/cli/commands/vulnerability.py b/src/security_scanner/cli/commands/vulnerability.py
new file mode 100644
index 0000000..5088858
--- /dev/null
+++ b/src/security_scanner/cli/commands/vulnerability.py
@@ -0,0 +1,139 @@
+"""SARIF-native code vulnerability subcommands."""
+
+from __future__ import annotations
+
+import argparse
+import sys
+
+from security_scanner.runtime.vulnerability_scan import (
+    ImportSarifRequest,
+    ScanVulnerabilityRequest,
+    run_import_sarif,
+    run_vulnerability_scan,
+)
+from security_scanner.scanners.semgrep_compatible import SemgrepExecutionError
+
+
+def register(subparsers) -> None:
+    import_parser = subparsers.add_parser(
+        "import-sarif",
+        help="Normalize a SARIF file into a VULN_FINDING JSONL artifact.",
+    )
+    import_parser.add_argument(
+        "--sarif",
+        metavar="FILE",
+        required=True,
+        help="Input SARIF 2.1.0 file.",
+    )
+    import_parser.add_argument(
+        "-o",
+        "--output",
+        metavar="FILE",
+        default="vuln-findings.jsonl",
+        help="Output VULN_FINDING JSONL artifact.",
+    )
+    import_parser.add_argument(
+        "--path-policy",
+        choices=["synthetic", "redacted"],
+        default="synthetic",
+        help=(
+            "SARIF path handling: preserve synthetic relative paths or hash all "
+            "paths for private proof (default: synthetic)."
+        ),
+    )
+    import_parser.set_defaults(func=cmd_import_sarif)
+
+    scan_parser = subparsers.add_parser(
+        "scan-vuln",
+        help="Run a Semgrep-compatible SAST scan and normalize SARIF output.",
+    )
+    scan_parser.add_argument(
+        "--root",
+        metavar="DIR",
+        default=".",
+        help="Local checkout root to scan (default: current directory).",
+    )
+    scan_parser.add_argument(
+        "-o",
+        "--output",
+        metavar="FILE",
+        default="vuln-findings.jsonl",
+        help="Output VULN_FINDING JSONL artifact.",
+    )
+    scan_parser.add_argument(
+        "--sarif-output",
+        metavar="FILE",
+        default=None,
+        help="Optional SARIF output path to keep after scan.",
+    )
+    scan_parser.add_argument(
+        "--semgrep-binary",
+        metavar="BIN",
+        default="semgrep",
+        help="Semgrep-compatible binary (default: semgrep).",
+    )
+    scan_parser.add_argument(
+        "--semgrep-config",
+        metavar="CONFIG",
+        default="auto",
+        help="Semgrep config value (default: auto).",
+    )
+    scan_parser.add_argument(
+        "--timeout-seconds",
+        type=int,
+        default=300,
+        metavar="N",
+        help="Semgrep-compatible subprocess timeout in seconds.",
+    )
+    scan_parser.add_argument(
+        "--path-policy",
+        choices=["synthetic", "redacted"],
+        default="redacted",
+        help=(
+            "SARIF path handling: preserve synthetic relative paths or hash all "
+            "paths for private proof (default: redacted)."
+ ), + ) + scan_parser.set_defaults(func=cmd_scan_vuln) + + +def cmd_import_sarif(args: argparse.Namespace) -> int: + try: + result = run_import_sarif( + ImportSarifRequest( + sarif_path=args.sarif, + output_path=args.output, + path_policy=args.path_policy, + ) + ) + except ValueError as exc: + print(f"error: import-sarif failed: {exc}", file=sys.stderr) + return 2 + print( + f"Imported {result.finding_count} code-vuln finding(s) -> " + f"{result.output_path}" + ) + return 0 + + +def cmd_scan_vuln(args: argparse.Namespace) -> int: + try: + result = run_vulnerability_scan( + ScanVulnerabilityRequest( + root=args.root, + output_path=args.output, + sarif_output_path=args.sarif_output, + semgrep_binary=args.semgrep_binary, + semgrep_config=args.semgrep_config, + timeout_seconds=args.timeout_seconds, + path_policy=args.path_policy, + ) + ) + except (SemgrepExecutionError, ValueError) as exc: + print(f"error: scan-vuln failed: {exc}", file=sys.stderr) + return 2 + print( + f"Scanned {args.root}: imported {result.finding_count} " + f"code-vuln finding(s) -> {result.output_path}" + ) + return 0 diff --git a/src/security_scanner/core/vulnerability/__init__.py b/src/security_scanner/core/vulnerability/__init__.py new file mode 100644 index 0000000..d71e2aa --- /dev/null +++ b/src/security_scanner/core/vulnerability/__init__.py @@ -0,0 +1,55 @@ +"""Code vulnerability finding model and SARIF-first helpers.""" + +from security_scanner.core.vulnerability.evaluation import ( + VulnerabilityEvaluationKey, + VulnerabilityEvaluationResult, + VulnerabilityEvaluationThresholds, + VulnerabilityExpectedFinding, + evaluate_vulnerability_findings, + evaluate_vulnerability_gate, + load_vulnerability_expected_findings, + render_vulnerability_evaluation_report, +) +from security_scanner.core.vulnerability.gate import ( + VulnerabilityGateResult, + VulnerabilityGateThresholds, + evaluate_vulnerability_gate_policy, +) +from security_scanner.core.vulnerability.model import ( + VULN_CATEGORY, + 
VULN_ENTITY_TYPE, + VULN_SCHEMA_VERSION, + VulnerabilityFinding, + VulnerabilityLocation, + compute_vulnerability_finding_id, +) +from security_scanner.core.vulnerability.report import render_vulnerability_report +from security_scanner.core.vulnerability.sarif import ( + SarifImportError, + import_sarif_file, + import_sarif_payload, +) + +__all__ = [ + "VULN_CATEGORY", + "VULN_ENTITY_TYPE", + "VULN_SCHEMA_VERSION", + "SarifImportError", + "VulnerabilityEvaluationKey", + "VulnerabilityEvaluationResult", + "VulnerabilityEvaluationThresholds", + "VulnerabilityExpectedFinding", + "VulnerabilityFinding", + "VulnerabilityGateResult", + "VulnerabilityGateThresholds", + "VulnerabilityLocation", + "compute_vulnerability_finding_id", + "evaluate_vulnerability_findings", + "evaluate_vulnerability_gate", + "evaluate_vulnerability_gate_policy", + "import_sarif_file", + "import_sarif_payload", + "load_vulnerability_expected_findings", + "render_vulnerability_evaluation_report", + "render_vulnerability_report", +] diff --git a/src/security_scanner/core/vulnerability/evaluation.py b/src/security_scanner/core/vulnerability/evaluation.py new file mode 100644 index 0000000..e171982 --- /dev/null +++ b/src/security_scanner/core/vulnerability/evaluation.py @@ -0,0 +1,206 @@ +"""Synthetic corpus evaluation for code vulnerability findings.""" + +from __future__ import annotations + +import json +from collections import Counter +from collections.abc import Iterable +from dataclasses import dataclass +from pathlib import Path + +from security_scanner.core.vulnerability.model import VulnerabilityFinding + + +@dataclass(frozen=True, order=True) +class VulnerabilityEvaluationKey: + file_path: str + line_start: int + rule_id: str + + @classmethod + def from_finding( + cls, + finding: VulnerabilityFinding, + ) -> VulnerabilityEvaluationKey: + return cls( + file_path=finding.primary_location.file_path, + line_start=finding.primary_location.line_start or 0, + rule_id=finding.rule_id, + ) + + def 
display(self) -> str: + return f"{self.file_path}:{self.line_start} [{self.rule_id}]" + + +@dataclass(frozen=True) +class VulnerabilityExpectedFinding: + file_path: str + line_start: int + rule_id: str + + @property + def key(self) -> VulnerabilityEvaluationKey: + return VulnerabilityEvaluationKey( + file_path=self.file_path, + line_start=self.line_start, + rule_id=self.rule_id, + ) + + @classmethod + def from_dict(cls, data: dict) -> VulnerabilityExpectedFinding: + return cls( + file_path=str(data["filePath"]), + line_start=int(data["lineStart"]), + rule_id=str(data["ruleId"]), + ) + + +@dataclass(frozen=True) +class VulnerabilityEvaluationResult: + expected_count: int + actual_count: int + true_positives: tuple[VulnerabilityEvaluationKey, ...] + false_positives: tuple[VulnerabilityEvaluationKey, ...] + false_negatives: tuple[VulnerabilityEvaluationKey, ...] + + @property + def true_positive_count(self) -> int: + return len(self.true_positives) + + @property + def false_positive_count(self) -> int: + return len(self.false_positives) + + @property + def false_negative_count(self) -> int: + return len(self.false_negatives) + + @property + def precision(self) -> float: + denominator = self.true_positive_count + self.false_positive_count + return 1.0 if denominator == 0 else self.true_positive_count / denominator + + @property + def recall(self) -> float: + denominator = self.true_positive_count + self.false_negative_count + return 1.0 if denominator == 0 else self.true_positive_count / denominator + + +@dataclass(frozen=True) +class VulnerabilityEvaluationThresholds: + false_negative_max: int = 0 + precision_min: float = 0.90 + recall_min: float = 0.99 + + +@dataclass(frozen=True) +class VulnerabilityEvaluationGateResult: + passed: bool + reason: str + + +def load_vulnerability_expected_findings( + path: str | Path, +) -> list[VulnerabilityExpectedFinding]: + data = json.loads(Path(path).read_text(encoding="utf-8")) + schema_version = int(data.get("schemaVersion", 
1)) + if schema_version != 1: + raise ValueError( + f"unsupported vulnerability corpus schemaVersion: {schema_version}" + ) + raw = data.get("expectedFindings", []) + if not isinstance(raw, list): + raise ValueError("vulnerability corpus expectedFindings must be a list") + return [VulnerabilityExpectedFinding.from_dict(item) for item in raw] + + +def evaluate_vulnerability_findings( + expected_findings: Iterable[VulnerabilityExpectedFinding], + actual_findings: Iterable[VulnerabilityFinding], +) -> VulnerabilityEvaluationResult: + expected_counter = Counter(item.key for item in expected_findings) + actual_counter = Counter( + VulnerabilityEvaluationKey.from_finding(item) for item in actual_findings + ) + true_positive_counter = expected_counter & actual_counter + false_negative_counter = expected_counter - actual_counter + false_positive_counter = actual_counter - expected_counter + return VulnerabilityEvaluationResult( + expected_count=sum(expected_counter.values()), + actual_count=sum(actual_counter.values()), + true_positives=tuple(_expand_counter(true_positive_counter)), + false_positives=tuple(_expand_counter(false_positive_counter)), + false_negatives=tuple(_expand_counter(false_negative_counter)), + ) + + +def evaluate_vulnerability_gate( + result: VulnerabilityEvaluationResult, + thresholds: VulnerabilityEvaluationThresholds | None = None, +) -> VulnerabilityEvaluationGateResult: + policy = thresholds or VulnerabilityEvaluationThresholds() + failures: list[str] = [] + if result.false_negative_count > policy.false_negative_max: + failures.append( + "false negative count " + f"{result.false_negative_count} > threshold {policy.false_negative_max}" + ) + if result.precision < policy.precision_min: + failures.append( + f"precision {result.precision:.4f} < minimum {policy.precision_min:.4f}" + ) + if result.recall < policy.recall_min: + failures.append(f"recall {result.recall:.4f} < minimum {policy.recall_min:.4f}") + if failures: + return 
VulnerabilityEvaluationGateResult( + passed=False, + reason="FAIL: " + "; ".join(failures), + ) + return VulnerabilityEvaluationGateResult( + passed=True, + reason=( + "PASS: " + f"false negative count {result.false_negative_count} " + f"<= threshold {policy.false_negative_max}; " + f"precision {result.precision:.4f} >= minimum {policy.precision_min:.4f}; " + f"recall {result.recall:.4f} >= minimum {policy.recall_min:.4f}" + ), + ) + + +def render_vulnerability_evaluation_report( + result: VulnerabilityEvaluationResult, +) -> str: + lines = [ + "Code Vulnerability Evaluation Report", + "====================================", + f"Expected findings: {result.expected_count}", + f"Actual findings: {result.actual_count}", + f"True positives: {result.true_positive_count}", + f"False positives: {result.false_positive_count}", + f"False negatives: {result.false_negative_count}", + f"Precision: {result.precision:.4f}", + f"Recall: {result.recall:.4f}", + ] + _append_key_section(lines, "False positive keys", result.false_positives) + _append_key_section(lines, "False negative keys", result.false_negatives) + return "\n".join(lines).rstrip() + "\n" + + +def _expand_counter(counter: Counter) -> list[VulnerabilityEvaluationKey]: + items: list[VulnerabilityEvaluationKey] = [] + for key, count in sorted(counter.items()): + items.extend([key] * count) + return items + + +def _append_key_section( + lines: list[str], + title: str, + keys: tuple[VulnerabilityEvaluationKey, ...], +) -> None: + if not keys: + return + lines.append(title + ":") + for key in keys: + lines.append(f" - {key.display()}") diff --git a/src/security_scanner/core/vulnerability/gate.py b/src/security_scanner/core/vulnerability/gate.py new file mode 100644 index 0000000..58d7edd --- /dev/null +++ b/src/security_scanner/core/vulnerability/gate.py @@ -0,0 +1,96 @@ +"""Gate policy for code vulnerability findings.""" + +from __future__ import annotations + +from dataclasses import dataclass + +from 
security_scanner.core.vulnerability.model import VulnerabilityFinding + +_SEVERITY_RANK = { + "INFO": 0, + "LOW": 1, + "MEDIUM": 2, + "HIGH": 3, + "CRITICAL": 4, +} +_PRECISION_RANK = { + "UNKNOWN": 0, + "LOW": 1, + "MEDIUM": 2, + "HIGH": 3, + "VERY_HIGH": 4, +} + + +@dataclass(frozen=True) +class VulnerabilityGateThresholds: + max_blocking: int = 0 + severity_min: str = "HIGH" + precision_min: str = "HIGH" + + +@dataclass(frozen=True) +class VulnerabilityGateResult: + passed: bool + blocking_count: int + threshold: int + reason: str + + +def evaluate_vulnerability_gate_policy( + findings: list[VulnerabilityFinding], + thresholds: VulnerabilityGateThresholds | None = None, +) -> VulnerabilityGateResult: + """Evaluate code-vuln findings using severity + precision thresholds.""" + policy = thresholds or VulnerabilityGateThresholds() + severity_min = _normalize_severity(policy.severity_min) + precision_min = _normalize_precision(policy.precision_min) + blocking = [ + finding + for finding in findings + if _is_blocking( + finding, + severity_min=severity_min, + precision_min=precision_min, + ) + ] + blocking_count = len(blocking) + passed = blocking_count <= policy.max_blocking + comparator = "<=" if passed else ">" + status = "PASS" if passed else "FAIL" + reason = ( + f"{status}: {blocking_count} blocking code-vuln finding(s) " + f"{comparator} threshold {policy.max_blocking} " + f"(severity>={severity_min}, precision>={precision_min})" + ) + return VulnerabilityGateResult( + passed=passed, + blocking_count=blocking_count, + threshold=policy.max_blocking, + reason=reason, + ) + + +def _is_blocking( + finding: VulnerabilityFinding, + *, + severity_min: str, + precision_min: str, +) -> bool: + if finding.triage_state == "FALSE_POSITIVE": + return False + return ( + _SEVERITY_RANK.get(finding.severity, 0) >= _SEVERITY_RANK.get(severity_min, 3) + and _PRECISION_RANK.get(finding.precision, 0) + >= _PRECISION_RANK.get(precision_min, 3) + ) + + +def 
_normalize_severity(value: str) -> str: + normalized = str(value).strip().replace("-", "_").upper() + return normalized if normalized in _SEVERITY_RANK else "HIGH" + + +def _normalize_precision(value: str) -> str: + normalized = str(value).strip().replace("-", "_").upper() + return normalized if normalized in _PRECISION_RANK else "HIGH" diff --git a/src/security_scanner/core/vulnerability/model.py b/src/security_scanner/core/vulnerability/model.py new file mode 100644 index 0000000..29592ce --- /dev/null +++ b/src/security_scanner/core/vulnerability/model.py @@ -0,0 +1,378 @@ +"""SARIF-native code vulnerability finding model. + +This module is intentionally separate from ``core.finding``. Secret findings +carry secret hashes and Gitleaks payloads; code vulnerability findings carry +SARIF rule/location/trace metadata and never raw source snippets. +""" + +from __future__ import annotations + +import hashlib +import json +import re +from dataclasses import dataclass, field +from pathlib import PurePath +from urllib.parse import urlparse + +from security_scanner.core.vulnerability.redaction import ( + sanitize_partial_fingerprints, + sanitize_vulnerability_identifier, + sanitize_vulnerability_text, + sanitize_vulnerability_uri, +) + +VULN_CATEGORY = "code-vuln" +VULN_ENTITY_TYPE = "VULN_FINDING" +VULN_SCHEMA_VERSION = 1 + +_ABSOLUTE_REDACTED_PREFIX = "absolute-redacted:sha256:" +_RELATIVE_REDACTED_PREFIX = "relative-redacted:sha256:" +_WINDOWS_DRIVE_RE = re.compile(r"^[A-Za-z]:[\\/]") + + +@dataclass(frozen=True) +class VulnerabilityLocation: + """Primary or related SARIF location without raw source text.""" + + file_path: str + line_start: int | None = None + line_end: int | None = None + path_kind: str = "relative" + + def __post_init__(self) -> None: + if self.path_kind == "absolute-redacted" and not self.file_path.startswith( + _ABSOLUTE_REDACTED_PREFIX + ): + object.__setattr__( + self, + "file_path", + _ABSOLUTE_REDACTED_PREFIX + _hash_text(self.file_path), + ) 
+ if self.path_kind == "relative-redacted" and not self.file_path.startswith( + _RELATIVE_REDACTED_PREFIX + ): + object.__setattr__( + self, + "file_path", + _RELATIVE_REDACTED_PREFIX + _hash_text(self.file_path), + ) + + def to_dict(self) -> dict: + return { + "filePath": self.file_path, + "lineStart": self.line_start, + "lineEnd": self.line_end, + "pathKind": self.path_kind, + } + + @classmethod + def from_dict(cls, data: dict) -> VulnerabilityLocation: + return cls( + file_path=str(data["filePath"]), + line_start=( + int(data["lineStart"]) if data.get("lineStart") is not None else None + ), + line_end=int(data["lineEnd"]) if data.get("lineEnd") is not None else None, + path_kind=str(data.get("pathKind") or "relative"), + ) + + +@dataclass(frozen=True) +class VulnerabilityFinding: + """Canonical code vulnerability finding normalized from SARIF.""" + + finding_id: str + rule_id: str + message: str + primary_location: VulnerabilityLocation + source_tool: str + source_tool_version: str | None = None + category: str = VULN_CATEGORY + rule_name: str | None = None + severity: str = "MEDIUM" + precision: str = "UNKNOWN" + security_severity: float | None = None + cwe_ids: tuple[str, ...] = field(default_factory=tuple) + owasp_tags: tuple[str, ...] = field(default_factory=tuple) + related_locations: tuple[VulnerabilityLocation, ...] 
= field(default_factory=tuple) + code_flow_count: int = 0 + partial_fingerprints: dict[str, str] = field(default_factory=dict) + help_uri: str | None = None + help_markdown: str | None = None + triage_state: str = "NEEDS_REVIEW" + verifier_verdict: dict | None = None + properties: dict[str, object] = field(default_factory=dict) + + def __post_init__(self) -> None: + object.__setattr__( + self, + "source_tool", + sanitize_vulnerability_identifier(self.source_tool, fallback="sarif"), + ) + if self.source_tool_version is not None: + object.__setattr__( + self, + "source_tool_version", + sanitize_vulnerability_identifier( + self.source_tool_version, + fallback="", + ) + or None, + ) + object.__setattr__( + self, + "rule_id", + sanitize_vulnerability_identifier(self.rule_id, fallback="unknown-rule"), + ) + if self.rule_name is not None: + object.__setattr__( + self, + "rule_name", + sanitize_vulnerability_text(self.rule_name) or None, + ) + object.__setattr__(self, "message", sanitize_vulnerability_text(self.message)) + object.__setattr__( + self, + "cwe_ids", + tuple( + sanitize_vulnerability_identifier(item, fallback="") + for item in self.cwe_ids + if sanitize_vulnerability_identifier(item, fallback="") + ), + ) + object.__setattr__( + self, + "owasp_tags", + tuple( + sanitize_vulnerability_identifier(item, fallback="") + for item in self.owasp_tags + if sanitize_vulnerability_identifier(item, fallback="") + ), + ) + object.__setattr__( + self, + "partial_fingerprints", + sanitize_partial_fingerprints(self.partial_fingerprints), + ) + object.__setattr__( + self, + "help_uri", + sanitize_vulnerability_uri(self.help_uri), + ) + if self.help_markdown is not None: + object.__setattr__( + self, + "help_markdown", + sanitize_vulnerability_text(self.help_markdown) or None, + ) + object.__setattr__(self, "properties", _safe_properties(self.properties)) + + def to_dict(self) -> dict: + return { + "findingId": self.finding_id, + "category": self.category, + "sourceTool": 
self.source_tool, + "sourceToolVersion": self.source_tool_version, + "ruleId": self.rule_id, + "ruleName": self.rule_name, + "message": self.message, + "severity": self.severity, + "precision": self.precision, + "securitySeverity": self.security_severity, + "cweIds": list(self.cwe_ids), + "owaspTags": list(self.owasp_tags), + "primaryLocation": self.primary_location.to_dict(), + "relatedLocations": [ + location.to_dict() for location in self.related_locations + ], + "codeFlowCount": self.code_flow_count, + "partialFingerprints": dict(sorted(self.partial_fingerprints.items())), + "helpUri": self.help_uri, + "helpMarkdown": self.help_markdown, + "triageState": self.triage_state, + "verifierVerdict": self.verifier_verdict, + "properties": _json_safe_mapping(self.properties), + } + + @classmethod + def from_dict(cls, data: dict) -> VulnerabilityFinding: + return cls( + finding_id=str(data["findingId"]), + category=str(data.get("category") or VULN_CATEGORY), + source_tool=str(data["sourceTool"]), + source_tool_version=data.get("sourceToolVersion"), + rule_id=str(data["ruleId"]), + rule_name=data.get("ruleName"), + message=str(data.get("message") or ""), + severity=str(data.get("severity") or "MEDIUM"), + precision=str(data.get("precision") or "UNKNOWN"), + security_severity=( + float(data["securitySeverity"]) + if data.get("securitySeverity") is not None + else None + ), + cwe_ids=tuple(str(item) for item in data.get("cweIds", [])), + owasp_tags=tuple(str(item) for item in data.get("owaspTags", [])), + primary_location=VulnerabilityLocation.from_dict( + data["primaryLocation"] + ), + related_locations=tuple( + VulnerabilityLocation.from_dict(item) + for item in data.get("relatedLocations", []) + ), + code_flow_count=int(data.get("codeFlowCount") or 0), + partial_fingerprints={ + str(key): str(value) + for key, value in data.get("partialFingerprints", {}).items() + }, + help_uri=data.get("helpUri"), + help_markdown=data.get("helpMarkdown"), + 
triage_state=str(data.get("triageState") or "NEEDS_REVIEW"), + verifier_verdict=data.get("verifierVerdict"), + properties=( + data.get("properties") + if isinstance(data.get("properties"), dict) + else {} + ), + ) + + +def compute_vulnerability_finding_id( + *, + source_tool: str, + rule_id: str, + partial_fingerprints: dict[str, str] | None, + file_path: str, + line_start: int | None, + message: str, +) -> str: + """Return a deterministic code vulnerability finding id.""" + fingerprint = _select_fingerprint(partial_fingerprints or {}) + if fingerprint is not None: + material = { + "sourceTool": source_tool, + "ruleId": rule_id, + "fingerprint": fingerprint, + } + else: + material = { + "sourceTool": source_tool, + "ruleId": rule_id, + "filePath": file_path, + "lineStart": line_start, + "message": message, + } + encoded = json.dumps(material, sort_keys=True, separators=(",", ":")) + return "vuln_" + hashlib.sha1(encoded.encode("utf-8")).hexdigest()[:16] # noqa: S324 + + +def normalize_sarif_uri( + uri: object, + *, + redact_relative: bool = False, +) -> VulnerabilityLocation: + """Normalize a SARIF artifact URI without persisting private absolute paths.""" + raw = str(uri or "") + if _is_absolute_or_external_uri(raw): + return VulnerabilityLocation( + file_path=_ABSOLUTE_REDACTED_PREFIX + _hash_text(raw), + path_kind="absolute-redacted", + ) + if _escapes_relative_root(raw): + return VulnerabilityLocation( + file_path=_RELATIVE_REDACTED_PREFIX + _hash_text(raw), + path_kind="relative-redacted", + ) + if redact_relative: + return VulnerabilityLocation( + file_path=_RELATIVE_REDACTED_PREFIX + _hash_text(raw), + path_kind="relative-redacted", + ) + normalized = PurePath(raw).as_posix().lstrip("./") + return VulnerabilityLocation(file_path=normalized, path_kind="relative") + + +def location_with_region( + uri: object, + *, + start_line: object = None, + end_line: object = None, + redact_relative: bool = False, +) -> VulnerabilityLocation: + base = 
normalize_sarif_uri(uri, redact_relative=redact_relative) + return VulnerabilityLocation( + file_path=base.file_path, + path_kind=base.path_kind, + line_start=_optional_int(start_line), + line_end=_optional_int(end_line), + ) + + +def _select_fingerprint(partial_fingerprints: dict[str, str]) -> str | None: + if not partial_fingerprints: + return None + preferred = [ + "primaryLocationLineHash", + "primaryLocationStartColumnFingerprint", + "matchBasedId/v1", + "stableHash", + ] + for key in preferred: + value = partial_fingerprints.get(key) + if value: + return f"{key}:{value}" + key = sorted(partial_fingerprints)[0] + return f"{key}:{partial_fingerprints[key]}" + + +def _is_absolute_or_external_uri(value: str) -> bool: + parsed = urlparse(value) + return ( + value.startswith("/") + or _WINDOWS_DRIVE_RE.match(value) is not None + or parsed.scheme in {"file", "http", "https"} + ) + + +def _escapes_relative_root(value: str) -> bool: + path = PurePath(value) + return any(part == ".." for part in path.parts) + + +def _hash_text(value: str) -> str: + return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16] + + +def _optional_int(value: object) -> int | None: + if value is None: + return None + try: + parsed = int(value) + except (TypeError, ValueError): + return None + return parsed if parsed >= 1 else None + + +def _json_safe_mapping(value: dict[str, object]) -> dict[str, object]: + try: + json.dumps(value) + except (TypeError, ValueError): + return {} + return dict(value) + + +def _safe_properties(properties: dict[str, object]) -> dict[str, object]: + return { + sanitize_vulnerability_identifier(key, fallback="property"): _safe_value(value) + for key, value in properties.items() + } + + +def _safe_value(value: object) -> object: + if isinstance(value, str): + return sanitize_vulnerability_identifier(value, fallback="") + if isinstance(value, list): + return [_safe_value(item) for item in value] + if isinstance(value, dict): + return _safe_properties(value) + 
return value diff --git a/src/security_scanner/core/vulnerability/redaction.py b/src/security_scanner/core/vulnerability/redaction.py new file mode 100644 index 0000000..d530ff3 --- /dev/null +++ b/src/security_scanner/core/vulnerability/redaction.py @@ -0,0 +1,94 @@ +"""Public-safe redaction helpers for vulnerability metadata.""" + +from __future__ import annotations + +import hashlib +import re +from urllib.parse import urlparse + +_CODE_FENCE_RE = re.compile(r"```.*?```", re.DOTALL) +_INLINE_CODE_RE = re.compile(r"`[^`\n]+`") +_PATH_LIKE_RE = re.compile( + r"(?:(?:/[A-Za-z0-9._ -]+)+|[A-Za-z]:[\\/][^\\/\s]+(?:[\\/][^\\/\s]+)*)" +) +_RELATIVE_PATH_RE = re.compile( + r"(? str: + """Return SARIF free text without source snippets, paths, or secret-like text.""" + text = " ".join(str(value or "").split()) + if not text: + return "" + text = _CODE_FENCE_RE.sub("", text) + text = _INLINE_CODE_RE.sub("", text) + text = _SECRET_LIKE_RE.sub("", text) + text = _RELATIVE_PATH_RE.sub("", text) + text = _PATH_LIKE_RE.sub("", text) + text = _CALL_LIKE_RE.sub("", text) + if len(text) > limit: + return text[: limit - 3].rstrip() + "..." + return text + + +def sanitize_vulnerability_identifier( + value: object, + *, + fallback: str = "redacted", + limit: int = 200, +) -> str: + """Return a SARIF identifier-like value without path/snippet/secret text.""" + text = " ".join(str(value or "").split()) + if not text: + return fallback + text = _CODE_FENCE_RE.sub("", text) + text = _INLINE_CODE_RE.sub("", text) + text = _SECRET_LIKE_RE.sub("", text) + text = _IDENTIFIER_RELATIVE_PATH_RE.sub("", text) + text = _PATH_LIKE_RE.sub("", text) + text = _CALL_LIKE_RE.sub("", text) + if len(text) > limit: + return text[: limit - 3].rstrip() + "..." 
+ return text or fallback + + +def sanitize_partial_fingerprints(value: object) -> dict[str, str]: + """Normalize SARIF partial fingerprints without persisting raw values.""" + if not isinstance(value, dict): + return {} + normalized: dict[str, str] = {} + for key, raw_value in value.items(): + safe_key = sanitize_vulnerability_identifier(key, fallback="fingerprint") + normalized[safe_key] = _fingerprint_value(raw_value) + return normalized + + +def sanitize_vulnerability_uri(value: object) -> str | None: + """Return only a non-identifying URI shape marker for SARIF rule links.""" + raw = str(value or "").strip() + if not raw: + return None + parsed = urlparse(raw) + if parsed.scheme in {"http", "https"} and parsed.netloc: + return f"{parsed.scheme}://" + return "" + + +def _fingerprint_value(value: object) -> str: + text = str(value or "") + if text.startswith("sha256:"): + return text + digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:16] + return f"sha256:{digest}" diff --git a/src/security_scanner/core/vulnerability/report.py b/src/security_scanner/core/vulnerability/report.py new file mode 100644 index 0000000..4e90eb3 --- /dev/null +++ b/src/security_scanner/core/vulnerability/report.py @@ -0,0 +1,59 @@ +"""Human-readable reports for code vulnerability findings.""" + +from __future__ import annotations + +from collections import Counter, defaultdict + +from security_scanner.core.vulnerability.model import VulnerabilityFinding + +_SEVERITY_ORDER = ("CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO", "UNKNOWN") +_PRECISION_ORDER = ("VERY_HIGH", "HIGH", "MEDIUM", "LOW", "UNKNOWN") + + +def render_vulnerability_report(findings: list[VulnerabilityFinding]) -> str: + """Render a public-safe plain-text code vulnerability report.""" + lines = [ + "Code Vulnerability Report", + "=========================", + f"Total findings: {len(findings)}", + "", + "By severity:", + ] + severity_counts = Counter(finding.severity for finding in findings) + for severity in 
_SEVERITY_ORDER: + lines.append(f" {severity:<20} {severity_counts.get(severity, 0)}") + lines.extend(["", "By precision:"]) + precision_counts = Counter(finding.precision for finding in findings) + for precision in _PRECISION_ORDER: + lines.append(f" {precision:<20} {precision_counts.get(precision, 0)}") + lines.append("") + if not findings: + lines.append("No findings.") + return "\n".join(lines) + + by_rule: dict[str, list[VulnerabilityFinding]] = defaultdict(list) + for finding in findings: + by_rule[finding.rule_id].append(finding) + + for rule_id in sorted(by_rule): + group = sorted( + by_rule[rule_id], + key=lambda item: ( + item.primary_location.file_path, + item.primary_location.line_start or 0, + item.finding_id, + ), + ) + lines.append(f"[{rule_id}] ({len(group)})") + for finding in group: + location = finding.primary_location + line = location.line_start or 0 + cwes = ",".join(finding.cwe_ids) if finding.cwe_ids else "no-cwe" + lines.append( + f" {location.file_path}:{line} " + f"{finding.severity}/{finding.precision} {cwes} " + f"id:{finding.finding_id}" + ) + lines.append("") + + return "\n".join(lines).rstrip() + "\n" diff --git a/src/security_scanner/core/vulnerability/sarif.py b/src/security_scanner/core/vulnerability/sarif.py new file mode 100644 index 0000000..68422b5 --- /dev/null +++ b/src/security_scanner/core/vulnerability/sarif.py @@ -0,0 +1,298 @@ +"""SARIF 2.1.0 importer for code vulnerability findings.""" + +from __future__ import annotations + +import json +import re +from pathlib import Path +from typing import Any + +from security_scanner.core.vulnerability.model import ( + VulnerabilityFinding, + VulnerabilityLocation, + compute_vulnerability_finding_id, + location_with_region, +) +from security_scanner.core.vulnerability.redaction import ( + sanitize_partial_fingerprints, + sanitize_vulnerability_identifier, + sanitize_vulnerability_text, + sanitize_vulnerability_uri, +) + +_CWE_RE = re.compile(r"(?i)cwe[-_/ ]?(\d{1,5})") + + +class 
SarifImportError(ValueError): + """Raised when a SARIF payload cannot be normalized.""" + + +def import_sarif_file( + path: str | Path, + *, + path_policy: str = "synthetic", +) -> list[VulnerabilityFinding]: + """Load SARIF from *path* and return normalized vulnerability findings.""" + try: + payload = json.loads(Path(path).read_text(encoding="utf-8")) + except json.JSONDecodeError as exc: + raise SarifImportError(f"invalid SARIF JSON: {exc.msg}") from exc + return import_sarif_payload(payload, path_policy=path_policy) + + +def import_sarif_payload( + payload: dict[str, Any], + *, + path_policy: str = "synthetic", +) -> list[VulnerabilityFinding]: + """Normalize a SARIF payload into deterministic vulnerability findings.""" + if path_policy not in {"synthetic", "redacted"}: + raise SarifImportError("path_policy must be synthetic or redacted") + if not isinstance(payload, dict): + raise SarifImportError("SARIF payload must be a JSON object") + runs = payload.get("runs") + if not isinstance(runs, list): + raise SarifImportError("SARIF payload must contain a runs list") + + findings: list[VulnerabilityFinding] = [] + for run in runs: + if not isinstance(run, dict): + continue + tool = run.get("tool") if isinstance(run.get("tool"), dict) else {} + driver = tool.get("driver") if isinstance(tool.get("driver"), dict) else {} + source_tool = sanitize_vulnerability_identifier( + driver.get("name"), + fallback="sarif", + ) + source_tool_version = driver.get("semanticVersion") or driver.get("version") + rules = _rules_by_id(driver.get("rules", [])) + for result in run.get("results", []) or []: + if not isinstance(result, dict): + continue + findings.append( + _finding_from_result( + result, + rules=rules, + source_tool=source_tool, + source_tool_version=( + str(source_tool_version) if source_tool_version else None + ), + redact_relative=path_policy == "redacted", + ) + ) + findings.sort(key=lambda item: item.finding_id) + return findings + + +def _finding_from_result( + 
result: dict[str, Any], + *, + rules: dict[str, dict[str, Any]], + source_tool: str, + source_tool_version: str | None, + redact_relative: bool, +) -> VulnerabilityFinding: + rule_id = str(result.get("ruleId") or "") + rule = rules.get(rule_id, {}) + message = _message_text(result.get("message")) or _message_text(rule.get("message")) + primary = _primary_location(result, redact_relative=redact_relative) + partial_fingerprints = { + str(key): str(value) + for key, value in (result.get("partialFingerprints") or {}).items() + } + finding_id = compute_vulnerability_finding_id( + source_tool=source_tool, + rule_id=rule_id, + partial_fingerprints=partial_fingerprints, + file_path=primary.file_path, + line_start=primary.line_start, + message=message, + ) + properties = _merge_properties(rule.get("properties"), result.get("properties")) + tags = _string_list(properties.get("tags")) + _string_list(rule.get("tags")) + cwe_ids = tuple(sorted(_extract_cwe_ids(tags))) + owasp_tags = tuple(sorted(_extract_owasp_tags(tags))) + return VulnerabilityFinding( + finding_id=finding_id, + source_tool=source_tool, + source_tool_version=source_tool_version, + rule_id=sanitize_vulnerability_identifier(rule_id, fallback="unknown-rule"), + rule_name=_rule_name(rule), + message=message, + severity=_resolve_severity(result, properties), + precision=_normalize_precision(properties.get("precision")), + security_severity=_security_severity(properties), + cwe_ids=cwe_ids, + owasp_tags=owasp_tags, + primary_location=primary, + related_locations=_related_locations(result, redact_relative=redact_relative), + code_flow_count=len(result.get("codeFlows") or []), + partial_fingerprints=sanitize_partial_fingerprints(partial_fingerprints), + help_uri=sanitize_vulnerability_uri(rule.get("helpUri")), + help_markdown=_help_markdown(rule), + properties=_safe_properties(properties), + ) + + +def _rules_by_id(raw_rules: object) -> dict[str, dict[str, Any]]: + rules = {} + if not isinstance(raw_rules, list): + 
return rules + for rule in raw_rules: + if isinstance(rule, dict) and rule.get("id") is not None: + rules[str(rule["id"])] = rule + return rules + + +def _message_text(message: object) -> str: + if isinstance(message, dict): + text = message.get("text") or message.get("markdown") + return sanitize_vulnerability_text(text) + return "" + + +def _primary_location( + result: dict[str, Any], + *, + redact_relative: bool, +) -> VulnerabilityLocation: + locations = result.get("locations") or [] + if not locations: + return VulnerabilityLocation(file_path="", path_kind="missing") + return _physical_location(locations[0], redact_relative=redact_relative) + + +def _related_locations( + result: dict[str, Any], + *, + redact_relative: bool, +) -> tuple[VulnerabilityLocation, ...]: + locations: list[VulnerabilityLocation] = [] + for item in result.get("relatedLocations") or []: + if isinstance(item, dict): + locations.append(_physical_location(item, redact_relative=redact_relative)) + return tuple(locations) + + +def _physical_location( + location: dict[str, Any], + *, + redact_relative: bool, +) -> VulnerabilityLocation: + physical = location.get("physicalLocation") + if not isinstance(physical, dict): + return VulnerabilityLocation(file_path="", path_kind="missing") + artifact = physical.get("artifactLocation") + if not isinstance(artifact, dict): + artifact = {} + region = physical.get("region") + if not isinstance(region, dict): + region = {} + return location_with_region( + artifact.get("uri") or "", + start_line=region.get("startLine"), + end_line=region.get("endLine"), + redact_relative=redact_relative, + ) + + +def _merge_properties(*values: object) -> dict[str, object]: + merged: dict[str, object] = {} + for value in values: + if isinstance(value, dict): + merged.update(value) + return merged + + +def _rule_name(rule: dict[str, Any]) -> str | None: + name = rule.get("name") or _message_text(rule.get("shortDescription")) + sanitized = sanitize_vulnerability_text(name) + 
return sanitized or None + + +def _resolve_severity(result: dict[str, Any], properties: dict[str, object]) -> str: + security_severity = _security_severity(properties) + if security_severity is not None: + if security_severity >= 9.0: + return "CRITICAL" + if security_severity >= 7.0: + return "HIGH" + if security_severity >= 4.0: + return "MEDIUM" + return "LOW" + level = str(result.get("level") or properties.get("problem.severity") or "").lower() + if level in {"error", "critical"}: + return "HIGH" + if level in {"warning", "warn"}: + return "MEDIUM" + if level in {"note", "none"}: + return "LOW" + return "MEDIUM" + + +def _security_severity(properties: dict[str, object]) -> float | None: + raw = properties.get("security-severity") or properties.get("securitySeverity") + try: + return float(raw) + except (TypeError, ValueError): + return None + + +def _normalize_precision(raw: object) -> str: + if raw is None: + return "UNKNOWN" + value = str(raw).strip().replace("-", "_").upper() + if value in {"VERY_HIGH", "HIGH", "MEDIUM", "LOW"}: + return value + return "UNKNOWN" + + +def _help_markdown(rule: dict[str, Any]) -> str | None: + help_value = rule.get("help") + if isinstance(help_value, dict): + value = help_value.get("markdown") or help_value.get("text") + sanitized = sanitize_vulnerability_text(value) + return sanitized or None + return None + + +def _string_list(value: object) -> list[str]: + if isinstance(value, list): + return [str(item) for item in value if isinstance(item, str)] + return [] + + +def _extract_cwe_ids(tags: list[str]) -> set[str]: + cwes = set() + for tag in tags: + match = _CWE_RE.search(tag) + if match: + cwes.add(f"CWE-{int(match.group(1))}") + return cwes + + +def _extract_owasp_tags(tags: list[str]) -> set[str]: + return {tag for tag in tags if "owasp" in tag.lower()} + + +def _safe_properties(properties: dict[str, object]) -> dict[str, object]: + allowed = { + "precision", + "security-severity", + "securitySeverity", + 
"problem.severity", + "tags", + } + return { + key: _safe_property_value(value) + for key, value in properties.items() + if key in allowed + } + + +def _safe_property_value(value: object) -> object: + if isinstance(value, list): + return [sanitize_vulnerability_identifier(item, fallback="") for item in value] + if isinstance(value, str): + return sanitize_vulnerability_identifier(value, fallback="") + return value diff --git a/src/security_scanner/llm/vulnerability/__init__.py b/src/security_scanner/llm/vulnerability/__init__.py new file mode 100644 index 0000000..856fafc --- /dev/null +++ b/src/security_scanner/llm/vulnerability/__init__.py @@ -0,0 +1,17 @@ +"""LLM verifier/explainer for code vulnerability findings.""" + +from security_scanner.llm.vulnerability.prompt import build_vulnerability_prompt +from security_scanner.llm.vulnerability.verifier import ( + VulnerabilityOllamaVerifier, + VulnerabilityVerifierResult, + apply_vulnerability_verifier_result, + parse_vulnerability_verifier_response, +) + +__all__ = [ + "VulnerabilityOllamaVerifier", + "VulnerabilityVerifierResult", + "apply_vulnerability_verifier_result", + "build_vulnerability_prompt", + "parse_vulnerability_verifier_response", +] diff --git a/src/security_scanner/llm/vulnerability/prompt.py b/src/security_scanner/llm/vulnerability/prompt.py new file mode 100644 index 0000000..7241a4f --- /dev/null +++ b/src/security_scanner/llm/vulnerability/prompt.py @@ -0,0 +1,86 @@ +"""Redacted prompt construction for vulnerability verifier adapters.""" + +from __future__ import annotations + +import hashlib +import json +from pathlib import PurePath + +from security_scanner.core.vulnerability.model import VulnerabilityFinding +from security_scanner.core.vulnerability.redaction import ( + sanitize_vulnerability_identifier, + sanitize_vulnerability_text, +) + + +def build_vulnerability_prompt(finding: VulnerabilityFinding) -> str: + """Build a prompt from public-safe vulnerability metadata only.""" + metadata = 
{ + "findingId": finding.finding_id, + "category": finding.category, + "sourceTool": _safe_identifier(finding.source_tool), + "ruleId": _safe_identifier(finding.rule_id), + "ruleName": _safe_text(finding.rule_name), + "severity": finding.severity, + "precision": finding.precision, + "securitySeverity": finding.security_severity, + "cweIds": [_safe_identifier(item) for item in finding.cwe_ids], + "owaspTags": [_safe_identifier(item) for item in finding.owasp_tags], + "location": { + "pathKind": finding.primary_location.path_kind, + "pathFingerprint": _fingerprint(finding.primary_location.file_path), + "fileExtension": _file_extension(finding.primary_location.file_path), + "lineStart": finding.primary_location.line_start, + "lineEnd": finding.primary_location.line_end, + }, + "trace": { + "codeFlowCount": finding.code_flow_count, + "relatedLocationCount": len(finding.related_locations), + }, + "ruleHelp": _safe_help(finding.help_markdown), + } + return "\n".join( + [ + "You are a code vulnerability verifier and explainer.", + "Use only the redacted metadata below.", + "Do not request or reveal raw source snippets, repository names, hosts, " + "or paths.", + "Return strict JSON only with keys: label, confidence, reason, " + "remediation.", + "Allowed label values: true_positive, false_positive, needs_review.", + "confidence must be a JSON number between 0.0 and 1.0.", + "reason must explain risk from rule/CWE/location-shape metadata only.", + "remediation must be generic guidance, not a concrete code patch.", + "Do not choose false_positive only because source code is redacted.", + "Finding metadata:", + json.dumps(metadata, sort_keys=True, separators=(",", ":")), + ] + ) + + +def _fingerprint(value: str) -> str: + digest = hashlib.sha256(value.encode("utf-8")).hexdigest()[:16] + return f"sha256:{digest}" + + +def _file_extension(file_path: str) -> str: + suffix = PurePath(file_path).suffix + if not suffix or len(suffix) > 16: + return "" + return suffix + + +def 
_safe_help(value: str | None) -> str | None: + if not value: + return None + sanitized = _safe_text(value) + return sanitized or None + + +def _safe_identifier(value: str | None) -> str: + return sanitize_vulnerability_identifier(value, fallback="redacted") + + +def _safe_text(value: str | None) -> str | None: + sanitized = sanitize_vulnerability_text(value) + return sanitized or None diff --git a/src/security_scanner/llm/vulnerability/verifier.py b/src/security_scanner/llm/vulnerability/verifier.py new file mode 100644 index 0000000..75b37d2 --- /dev/null +++ b/src/security_scanner/llm/vulnerability/verifier.py @@ -0,0 +1,222 @@ +"""Verifier contract for code vulnerability findings.""" + +from __future__ import annotations + +import json +import os +import re +import urllib.error +import urllib.request +from collections.abc import Callable +from dataclasses import dataclass +from urllib.parse import urlparse + +from security_scanner.core.vulnerability.model import VulnerabilityFinding +from security_scanner.llm.common.verifier import ( + VerifierConfig, + parse_verifier_response, +) +from security_scanner.llm.vulnerability.prompt import build_vulnerability_prompt + +Transport = Callable[[dict, float], str] + +_VERIFIER_RESPONSE_SCHEMA = { + "type": "object", + "properties": { + "label": { + "type": "string", + "enum": ["true_positive", "false_positive", "needs_review"], + }, + "confidence": {"type": "number", "minimum": 0.0, "maximum": 1.0}, + "reason": {"type": "string"}, + "remediation": {"type": "string"}, + }, + "required": ["label", "confidence", "reason", "remediation"], + "additionalProperties": False, +} +_PATH_LIKE_RE = re.compile(r"(?:(?:/[A-Za-z0-9._ -]+)+|[A-Za-z]:\\[^\\\s]+)") + + +@dataclass(frozen=True) +class VulnerabilityVerifierResult: + verdict: str + confidence: float + reason: str + remediation: str | None = None + raw_label: str | None = None + error: str | None = None + + @classmethod + def needs_review( + cls, + reason: str, + *, + 
raw_label: str | None = None, + error: str | None = None, + ) -> VulnerabilityVerifierResult: + return cls( + verdict="NEEDS_REVIEW", + confidence=0.0, + reason=reason, + remediation=None, + raw_label=raw_label, + error=error, + ) + + +def parse_vulnerability_verifier_response( + raw_content: str, + *, + min_confidence: float, +) -> VulnerabilityVerifierResult: + """Validate a strict JSON vulnerability verifier response.""" + common = parse_verifier_response(raw_content, min_confidence=min_confidence) + remediation = _extract_remediation(raw_content) + return VulnerabilityVerifierResult( + verdict=common.verdict, + confidence=common.confidence, + reason=common.reason, + remediation=remediation if common.error is None else None, + raw_label=common.raw_label, + error=common.error, + ) + + +def apply_vulnerability_verifier_result( + finding: VulnerabilityFinding, + result: VulnerabilityVerifierResult, + *, + verifier_name: str = "ollama", +) -> VulnerabilityFinding: + """Return a copy of *finding* with verifier review assistance attached.""" + data = finding.to_dict() + forbidden = [ + finding.primary_location.file_path, + finding.rule_id, + ] + data["triageState"] = result.verdict + data["verifierVerdict"] = { + "verifier": verifier_name, + "verdict": result.verdict, + "confidence": result.confidence, + "reason": _sanitize_text(result.reason, forbidden_values=forbidden), + "remediation": ( + _sanitize_text(result.remediation, forbidden_values=forbidden) + if result.remediation + else None + ), + "error": result.error, + } + return VulnerabilityFinding.from_dict(data) + + +class VulnerabilityOllamaVerifier: + """Ollama-compatible verifier for redacted vulnerability findings.""" + + def __init__( + self, + config: VerifierConfig, + transport: Transport | None = None, + ) -> None: + self.config = config + self._transport = transport or self._default_transport + + def verify(self, finding: VulnerabilityFinding) -> VulnerabilityVerifierResult: + payload = { + "model": 
self.config.model, + "stream": False, + "format": _VERIFIER_RESPONSE_SCHEMA, + "options": {"temperature": 0}, + "messages": [ + { + "role": "system", + "content": ( + "You classify redacted static-analysis findings. " + "Return only strict JSON." + ), + }, + {"role": "user", "content": build_vulnerability_prompt(finding)}, + ], + } + try: + raw_response = self._transport(payload, self.config.timeout_seconds) + except TimeoutError: + return VulnerabilityVerifierResult.needs_review( + "Verifier timed out.", + error="timeout", + ) + except (OSError, urllib.error.URLError, urllib.error.HTTPError) as exc: + return VulnerabilityVerifierResult.needs_review( + "Verifier transport failed.", + error=exc.__class__.__name__, + ) + return parse_vulnerability_verifier_response( + _extract_content(raw_response), + min_confidence=self.config.min_confidence, + ) + + def _default_transport(self, payload: dict, timeout_seconds: float) -> str: + request = urllib.request.Request( + _chat_url(self.config.host), + data=json.dumps(payload).encode("utf-8"), + headers=_headers(self.config.api_key_env), + method="POST", + ) + with urllib.request.urlopen(request, timeout=timeout_seconds) as response: # noqa: S310 + return response.read().decode("utf-8") + + +def _extract_remediation(raw_content: str) -> str | None: + try: + data = json.loads(raw_content) + except json.JSONDecodeError: + return None + if isinstance(data, dict) and data.get("remediation") is not None: + return _sanitize_text(str(data.get("remediation"))) + return None + + +def _sanitize_text( + value: str | None, + *, + forbidden_values: list[str] | None = None, +) -> str: + cleaned = value or "Verifier did not provide a reason." + for forbidden in forbidden_values or []: + if forbidden: + cleaned = cleaned.replace(forbidden, "") + cleaned = _PATH_LIKE_RE.sub("", cleaned) + if len(cleaned) > 500: + cleaned = cleaned[:497].rstrip() + "..." 
+ return cleaned + + +def _chat_url(host: str) -> str: + trimmed = host.rstrip("/") + parsed = urlparse(trimmed) + if parsed.path.endswith("/api/chat"): + return trimmed + return f"{trimmed}/api/chat" + + +def _headers(api_key_env: str | None) -> dict[str, str]: + headers = {"Content-Type": "application/json"} + if api_key_env: + token = os.environ.get(api_key_env) + if token: + headers["Authorization"] = f"Bearer {token}" + return headers + + +def _extract_content(raw_response: str) -> str: + try: + data = json.loads(raw_response) + except json.JSONDecodeError: + return raw_response + if isinstance(data, dict): + message = data.get("message") + if isinstance(message, dict) and isinstance(message.get("content"), str): + return message["content"] + if isinstance(data.get("response"), str): + return data["response"] + return raw_response diff --git a/src/security_scanner/runtime/vulnerability_scan.py b/src/security_scanner/runtime/vulnerability_scan.py new file mode 100644 index 0000000..1f93e6b --- /dev/null +++ b/src/security_scanner/runtime/vulnerability_scan.py @@ -0,0 +1,103 @@ +"""Runtime use cases for SARIF-native code vulnerability scanning.""" + +from __future__ import annotations + +import tempfile +from dataclasses import dataclass +from pathlib import Path + +from security_scanner.core.vulnerability.sarif import import_sarif_file +from security_scanner.scanners.semgrep_compatible import SemgrepCompatibleRunner +from security_scanner.storage.vulnerability_jsonl_store import VulnerabilityJsonlStore + + +@dataclass(frozen=True) +class ImportSarifRequest: + sarif_path: str | Path + output_path: str | Path + path_policy: str = "synthetic" + + +@dataclass(frozen=True) +class ImportSarifResult: + output_path: Path + finding_count: int + + +@dataclass(frozen=True) +class ScanVulnerabilityRequest: + root: str | Path + output_path: str | Path + sarif_output_path: str | Path | None = None + semgrep_binary: str = "semgrep" + semgrep_config: str = "auto" + 
timeout_seconds: int = 300 + path_policy: str = "redacted" + + +@dataclass(frozen=True) +class ScanVulnerabilityResult: + output_path: Path + sarif_path: Path + finding_count: int + + +def run_import_sarif(request: ImportSarifRequest) -> ImportSarifResult: + findings = import_sarif_file(request.sarif_path, path_policy=request.path_policy) + output = Path(request.output_path) + VulnerabilityJsonlStore(output).write_all(findings) + return ImportSarifResult(output_path=output, finding_count=len(findings)) + + +def run_vulnerability_scan( + request: ScanVulnerabilityRequest, + *, + runner: SemgrepCompatibleRunner | None = None, +) -> ScanVulnerabilityResult: + root = Path(request.root) + if request.sarif_output_path is not None: + sarif_path = Path(request.sarif_output_path) + sarif_path.parent.mkdir(parents=True, exist_ok=True) + return _scan_and_import(root, sarif_path, request, runner=runner) + + with tempfile.TemporaryDirectory(prefix="security-scanner-code-vuln-") as tmp: + sarif_path = Path(tmp) / "semgrep.sarif" + return _scan_and_import(root, sarif_path, request, runner=runner) + + +def _scan_and_import( + root: Path, + sarif_path: Path, + request: ScanVulnerabilityRequest, + *, + runner: SemgrepCompatibleRunner | None, +) -> ScanVulnerabilityResult: + active_runner = runner or _runner_from_request(request) + active_runner.run(root, sarif_path=sarif_path) + result = _import_scan_sarif(sarif_path, request) + return ScanVulnerabilityResult( + output_path=result.output_path, + sarif_path=sarif_path, + finding_count=result.finding_count, + ) + + +def _runner_from_request(request: ScanVulnerabilityRequest) -> SemgrepCompatibleRunner: + return SemgrepCompatibleRunner( + binary=request.semgrep_binary, + config=request.semgrep_config, + timeout_seconds=request.timeout_seconds, + ) + + +def _import_scan_sarif( + sarif_path: Path, + request: ScanVulnerabilityRequest, +) -> ImportSarifResult: + return run_import_sarif( + ImportSarifRequest( + sarif_path=sarif_path, + 
output_path=request.output_path, + path_policy=request.path_policy, + ) + ) diff --git a/src/security_scanner/runtime/vulnerability_verify_artifact.py b/src/security_scanner/runtime/vulnerability_verify_artifact.py new file mode 100644 index 0000000..713ae8e --- /dev/null +++ b/src/security_scanner/runtime/vulnerability_verify_artifact.py @@ -0,0 +1,64 @@ +"""Verifier artifact runtime for code vulnerability findings.""" + +from __future__ import annotations + +from collections.abc import Callable +from dataclasses import dataclass +from pathlib import Path +from typing import Protocol + +from security_scanner.core.vulnerability.model import VulnerabilityFinding +from security_scanner.llm.common.verifier import VerifierConfig +from security_scanner.llm.vulnerability.verifier import ( + VulnerabilityOllamaVerifier, + VulnerabilityVerifierResult, + apply_vulnerability_verifier_result, +) +from security_scanner.storage.vulnerability_jsonl_store import VulnerabilityJsonlStore + + +class VulnerabilityVerifier(Protocol): + def verify(self, finding: VulnerabilityFinding) -> VulnerabilityVerifierResult: + """Return review assistance for one vulnerability finding.""" + + +VulnerabilityVerifierFactory = Callable[[VerifierConfig], VulnerabilityVerifier] + + +@dataclass(frozen=True) +class VerifyVulnerabilityArtifactRequest: + findings_path: str | Path + output_path: str | Path + config: VerifierConfig + + +@dataclass(frozen=True) +class VerifyVulnerabilityArtifactResult: + output_path: Path + finding_count: int + + +def run_verify_vulnerability_artifact( + request: VerifyVulnerabilityArtifactRequest, + *, + verifier_factory: VulnerabilityVerifierFactory | None = None, +) -> VerifyVulnerabilityArtifactResult: + input_path = Path(request.findings_path) + output_path = Path(request.output_path) + if input_path.resolve() == output_path.resolve(): + raise ValueError("--output must be different from --findings") + findings = VulnerabilityJsonlStore(input_path).read_all() + verifier 
= (verifier_factory or VulnerabilityOllamaVerifier)(request.config) + verified = [ + apply_vulnerability_verifier_result( + finding, + verifier.verify(finding), + verifier_name=request.config.model, + ) + for finding in findings + ] + VulnerabilityJsonlStore(output_path).write_all(verified) + return VerifyVulnerabilityArtifactResult( + output_path=output_path, + finding_count=len(verified), + ) diff --git a/src/security_scanner/scanners/semgrep_compatible/__init__.py b/src/security_scanner/scanners/semgrep_compatible/__init__.py new file mode 100644 index 0000000..c2d6da3 --- /dev/null +++ b/src/security_scanner/scanners/semgrep_compatible/__init__.py @@ -0,0 +1,8 @@ +"""Semgrep-compatible SARIF adapter.""" + +from security_scanner.scanners.semgrep_compatible.runner import ( + SemgrepCompatibleRunner, + SemgrepExecutionError, +) + +__all__ = ["SemgrepCompatibleRunner", "SemgrepExecutionError"] diff --git a/src/security_scanner/scanners/semgrep_compatible/runner.py b/src/security_scanner/scanners/semgrep_compatible/runner.py new file mode 100644 index 0000000..8927fb6 --- /dev/null +++ b/src/security_scanner/scanners/semgrep_compatible/runner.py @@ -0,0 +1,86 @@ +"""Semgrep-compatible subprocess runner that emits SARIF.""" + +from __future__ import annotations + +import re +import subprocess +from dataclasses import dataclass, field +from pathlib import Path + +_PATH_LIKE_RE = re.compile(r"(?:(?:/[A-Za-z0-9._ -]+)+|[A-Za-z]:\\[^\\\s]+)") +_SECRET_LIKE_RE = re.compile( + r"(?i)(AKIA[0-9A-Z]{12,}|(token|secret|password|api[_-]?key)\s*[:=]\s*\S+)" +) +_DETAIL_LIMIT = 240 + + +class SemgrepExecutionError(RuntimeError): + """Raised when a Semgrep-compatible process cannot produce SARIF.""" + + +@dataclass(frozen=True) +class SemgrepCompatibleRunner: + """Build and execute a Semgrep CE-compatible SARIF scan command.""" + + binary: str = "semgrep" + config: str = "auto" + timeout_seconds: int = 300 + extra_args: tuple[str, ...] 
= field(default_factory=tuple) + + def build_command(self, root: Path, *, sarif_path: str | Path) -> list[str]: + return [ + self.binary, + "scan", + "--config", + self.config, + "--metrics=off", + "--sarif", + f"--sarif-output={sarif_path}", + *self.extra_args, + str(root), + ] + + def run(self, root: Path, *, sarif_path: str | Path) -> str: + root = Path(root) + output = Path(sarif_path) + cmd = self.build_command(root, sarif_path=output) + try: + proc = subprocess.run( + cmd, + capture_output=True, + text=True, + check=False, + timeout=self.timeout_seconds, + cwd=str(root), + ) + except FileNotFoundError as exc: + raise SemgrepExecutionError( + f"semgrep-compatible binary not found: {self.binary}" + ) from exc + except subprocess.TimeoutExpired as exc: + raise SemgrepExecutionError( + f"semgrep-compatible scan timed out after {self.timeout_seconds}s" + ) from exc + if proc.returncode != 0: + detail = _sanitize_process_detail( + proc.stderr or proc.stdout or "no process output" + ) + raise SemgrepExecutionError( + "semgrep-compatible scan failed with " + f"exit code {proc.returncode}: {detail}" + ) + try: + return output.read_text(encoding="utf-8") + except FileNotFoundError as exc: + raise SemgrepExecutionError( + "semgrep-compatible scan did not write SARIF output" + ) from exc + + +def _sanitize_process_detail(value: str) -> str: + collapsed = " ".join(str(value or "no process output").split()) + collapsed = _PATH_LIKE_RE.sub("", collapsed) + collapsed = _SECRET_LIKE_RE.sub("", collapsed) + if len(collapsed) > _DETAIL_LIMIT: + return collapsed[: _DETAIL_LIMIT - 3].rstrip() + "..." 
+ return collapsed diff --git a/src/security_scanner/storage/vulnerability_jsonl_store.py b/src/security_scanner/storage/vulnerability_jsonl_store.py new file mode 100644 index 0000000..978dc05 --- /dev/null +++ b/src/security_scanner/storage/vulnerability_jsonl_store.py @@ -0,0 +1,71 @@ +"""JSONL artifact persistence for code vulnerability findings.""" + +from __future__ import annotations + +import json +from collections.abc import Iterable +from pathlib import Path + +from security_scanner.core.vulnerability.model import ( + VULN_CATEGORY, + VULN_ENTITY_TYPE, + VULN_SCHEMA_VERSION, + VulnerabilityFinding, +) + + +class VulnerabilityJsonlStore: + """Read/write schema-versioned ``VULN_FINDING`` JSONL artifacts.""" + + def __init__(self, path: str | Path) -> None: + self._path = Path(path) + + def write_all(self, findings: Iterable[VulnerabilityFinding]) -> None: + self._ensure_parent() + ordered = sorted(findings, key=lambda item: item.finding_id) + with self._path.open("w", encoding="utf-8") as fh: + for finding in ordered: + fh.write(json.dumps(_record(finding), sort_keys=True) + "\n") + + def read_all(self) -> list[VulnerabilityFinding]: + if not self._path.exists(): + return [] + findings: list[VulnerabilityFinding] = [] + with self._path.open("r", encoding="utf-8") as fh: + for line_number, line in enumerate(fh, start=1): + line = line.strip() + if not line: + continue + findings.append(_finding_from_record(json.loads(line), line_number)) + return findings + + def clear(self) -> None: + self._ensure_parent() + self._path.write_text("", encoding="utf-8") + + def _ensure_parent(self) -> None: + self._path.parent.mkdir(parents=True, exist_ok=True) + + +def _record(finding: VulnerabilityFinding) -> dict: + return { + "entityType": VULN_ENTITY_TYPE, + "schemaVersion": VULN_SCHEMA_VERSION, + "finding": finding.to_dict(), + } + + +def _finding_from_record(record: dict, line_number: int) -> VulnerabilityFinding: + if record.get("entityType") != VULN_ENTITY_TYPE: + 
raise ValueError(f"line {line_number}: expected entityType {VULN_ENTITY_TYPE}") + if int(record.get("schemaVersion", 0)) != VULN_SCHEMA_VERSION: + raise ValueError( + f"line {line_number}: unsupported schemaVersion " + f"{record.get('schemaVersion')}" + ) + finding = record.get("finding") + if not isinstance(finding, dict): + raise ValueError(f"line {line_number}: finding must be an object") + if finding.get("category", VULN_CATEGORY) != VULN_CATEGORY: + raise ValueError(f"line {line_number}: expected category {VULN_CATEGORY}") + return VulnerabilityFinding.from_dict(finding) diff --git a/tests/test_cli.py b/tests/test_cli.py index 1e4086b..4db5694 100644 --- a/tests/test_cli.py +++ b/tests/test_cli.py @@ -194,7 +194,10 @@ def fake_run_local_scan(request): ], ) - monkeypatch.setattr("security_scanner.cli.commands.scan.run_local_scan", fake_run_local_scan) + monkeypatch.setattr( + "security_scanner.cli.commands.scan.run_local_scan", + fake_run_local_scan, + ) assert main([ "scan", @@ -414,7 +417,9 @@ def test_scan_skips_unavailable_target(tmp_path): manifest.write_text( "version: 1\n" "targets:\n" - " - name: ghost-org/ghost-repo\n path: /nonexistent/demo-path\n enabled: true\n", + " - name: ghost-org/ghost-repo\n" + " path: /nonexistent/demo-path\n" + " enabled: true\n", encoding="utf-8", ) out = tmp_path / "findings.jsonl" @@ -708,6 +713,8 @@ def test_subcommand_registration_order_is_stable(): "residual", "residual-diff", "scan-all", + "import-sarif", + "scan-vuln", "scan-health", "report", "gate", diff --git a/tests/test_cli_code_vuln.py b/tests/test_cli_code_vuln.py new file mode 100644 index 0000000..513b364 --- /dev/null +++ b/tests/test_cli_code_vuln.py @@ -0,0 +1,229 @@ +from __future__ import annotations + +import json + +from security_scanner.cli import main +from security_scanner.core.vulnerability.model import ( + VulnerabilityFinding, + VulnerabilityLocation, +) +from security_scanner.storage.vulnerability_jsonl_store import VulnerabilityJsonlStore + + 
+def _sarif_payload() -> dict: + return { + "version": "2.1.0", + "runs": [ + { + "tool": { + "driver": { + "name": "Semgrep OSS", + "semanticVersion": "1.2.3", + "rules": [ + { + "id": "python.lang.security.audit.sql-injection", + "name": "SQL injection", + "properties": { + "precision": "high", + "security-severity": "8.2", + "tags": ["external/cwe/cwe-089"], + }, + } + ], + } + }, + "results": [ + { + "ruleId": "python.lang.security.audit.sql-injection", + "message": {"text": "Potential SQL injection."}, + "locations": [ + { + "physicalLocation": { + "artifactLocation": {"uri": "src/app.py"}, + "region": {"startLine": 12}, + } + } + ], + "partialFingerprints": { + "primaryLocationLineHash": "abc123" + }, + } + ], + } + ], + } + + +def _finding(**overrides) -> VulnerabilityFinding: + defaults = dict( + finding_id="vuln_1234567890abcdef", + source_tool="semgrep", + rule_id="python.lang.security.audit.sql-injection", + rule_name="SQL injection", + message="Potential SQL injection.", + severity="HIGH", + precision="HIGH", + cwe_ids=("CWE-89",), + primary_location=VulnerabilityLocation( + file_path="src/app.py", + line_start=12, + ), + ) + defaults.update(overrides) + return VulnerabilityFinding(**defaults) + + +def test_import_sarif_cli_writes_vuln_finding_artifact(tmp_path, capsys): + sarif = tmp_path / "result.sarif" + output = tmp_path / "vuln-findings.jsonl" + sarif.write_text(json.dumps(_sarif_payload()), encoding="utf-8") + + assert main(["import-sarif", "--sarif", str(sarif), "-o", str(output)]) == 0 + + assert len(VulnerabilityJsonlStore(output).read_all()) == 1 + captured = capsys.readouterr().out + assert "Imported 1 code-vuln finding(s)" in captured + + +def test_import_sarif_cli_can_redact_relative_paths(tmp_path): + sarif = tmp_path / "result.sarif" + output = tmp_path / "vuln-findings.jsonl" + sarif.write_text(json.dumps(_sarif_payload()), encoding="utf-8") + + assert ( + main( + [ + "import-sarif", + "--sarif", + str(sarif), + "-o", + str(output), + 
"--path-policy", + "redacted", + ] + ) + == 0 + ) + + finding = VulnerabilityJsonlStore(output).read_all()[0] + assert finding.primary_location.path_kind == "relative-redacted" + assert "src/app.py" not in finding.primary_location.file_path + + +def test_code_vuln_report_gate_and_evaluate(tmp_path, capsys): + findings = tmp_path / "vuln-findings.jsonl" + expected = tmp_path / "expected.json" + VulnerabilityJsonlStore(findings).write_all([_finding()]) + expected.write_text( + json.dumps( + { + "schemaVersion": 1, + "name": "synthetic-code-vuln", + "expectedFindings": [ + { + "filePath": "src/app.py", + "lineStart": 12, + "ruleId": "python.lang.security.audit.sql-injection", + } + ], + } + ), + encoding="utf-8", + ) + + assert main(["report", "--category", "code-vuln", "--findings", str(findings)]) == 0 + assert "Code Vulnerability Report" in capsys.readouterr().out + + assert main(["gate", "--category", "code-vuln", "--findings", str(findings)]) == 1 + assert "FAIL: 1 blocking code-vuln finding(s)" in capsys.readouterr().out + + assert ( + main( + [ + "gate", + "--category", + "code-vuln", + "--findings", + str(findings), + "--max", + "1", + ] + ) + == 0 + ) + + assert ( + main( + [ + "evaluate", + "--category", + "code-vuln", + "--findings", + str(findings), + "--expected", + str(expected), + ] + ) + == 0 + ) + captured = capsys.readouterr().out + assert "Code Vulnerability Evaluation Report" in captured + assert "True positives: 1" in captured + assert "Precision: 1.0000" in captured + + +def test_scan_vuln_cli_delegates_to_runtime(monkeypatch, tmp_path, capsys): + calls = [] + output = tmp_path / "vuln-findings.jsonl" + + class Result: + output_path = output + sarif_path = tmp_path / "scan.sarif" + finding_count = 2 + + def fake_run(request): + calls.append(request) + return Result() + + monkeypatch.setattr( + "security_scanner.cli.commands.vulnerability.run_vulnerability_scan", + fake_run, + ) + + assert ( + main( + [ + "scan-vuln", + "--root", + 
"synthetic-repo", + "--semgrep-config", + "auto", + "-o", + str(output), + ] + ) + == 0 + ) + + assert calls[0].root == "synthetic-repo" + assert calls[0].output_path == str(output) + assert calls[0].path_policy == "redacted" + assert "imported 2 code-vuln finding(s)" in capsys.readouterr().out + + +def test_code_vuln_category_rejects_dynamodb_storage(capsys): + assert ( + main( + [ + "report", + "--category", + "code-vuln", + "--storage-backend", + "dynamodb", + "--dynamodb-table", + "SecurityScannerLocal", + ] + ) + == 2 + ) + assert "JSONL artifacts only" in capsys.readouterr().err diff --git a/tests/test_sarif_importer.py b/tests/test_sarif_importer.py new file mode 100644 index 0000000..e530168 --- /dev/null +++ b/tests/test_sarif_importer.py @@ -0,0 +1,210 @@ +from __future__ import annotations + +import json + +import pytest + +from security_scanner.core.vulnerability.report import render_vulnerability_report +from security_scanner.core.vulnerability.sarif import ( + SarifImportError, + import_sarif_payload, +) +from security_scanner.llm.vulnerability import build_vulnerability_prompt + + +def _sarif_payload() -> dict: + return { + "version": "2.1.0", + "runs": [ + { + "tool": { + "driver": { + "name": "Semgrep OSS", + "semanticVersion": "1.2.3", + "rules": [ + { + "id": "python.lang.security.audit.sql-injection", + "name": "SQL injection", + "shortDescription": {"text": "SQL injection"}, + "helpUri": "https://example.invalid/rules/sql", + "help": { + "markdown": "Use parameterized queries." 
+ }, + "properties": { + "precision": "high", + "security-severity": "8.2", + "tags": [ + "security", + "external/cwe/cwe-089", + "OWASP:A03", + ], + }, + } + ], + } + }, + "results": [ + { + "ruleId": "python.lang.security.audit.sql-injection", + "level": "error", + "message": {"text": "Potential SQL injection."}, + "locations": [ + { + "physicalLocation": { + "artifactLocation": {"uri": "src/app.py"}, + "region": { + "startLine": 12, + "endLine": 12, + "snippet": { + "text": "cursor.execute(user_input)" + }, + }, + } + } + ], + "relatedLocations": [ + { + "physicalLocation": { + "artifactLocation": {"uri": "src/db.py"}, + "region": {"startLine": 4}, + } + } + ], + "codeFlows": [{"threadFlows": []}], + "partialFingerprints": { + "primaryLocationLineHash": "abc123" + }, + } + ], + } + ], + } + + +def test_import_sarif_payload_normalizes_rule_location_and_metadata(): + findings = import_sarif_payload(_sarif_payload()) + + assert len(findings) == 1 + finding = findings[0] + assert finding.source_tool == "Semgrep OSS" + assert finding.source_tool_version == "1.2.3" + assert finding.rule_id == "python.lang.security.audit.sql-injection" + assert finding.rule_name == "SQL injection" + assert finding.message == "Potential SQL injection." + assert finding.severity == "HIGH" + assert finding.precision == "HIGH" + assert finding.security_severity == 8.2 + assert finding.cwe_ids == ("CWE-89",) + assert finding.owasp_tags == ("OWASP:A03",) + assert finding.primary_location.file_path == "src/app.py" + assert finding.primary_location.line_start == 12 + assert finding.related_locations[0].file_path == "src/db.py" + assert finding.code_flow_count == 1 + assert finding.help_uri == "https://" + assert finding.help_markdown == "Use parameterized queries." 
+ + +def test_import_sarif_payload_does_not_persist_snippet_text(): + rendered = json.dumps( + [finding.to_dict() for finding in import_sarif_payload(_sarif_payload())] + ) + + assert "cursor.execute" not in rendered + assert "snippet" not in rendered.lower() + + +def test_import_sarif_payload_sanitizes_untrusted_free_text(): + payload = _sarif_payload() + rule = payload["runs"][0]["tool"]["driver"]["rules"][0] + result = payload["runs"][0]["results"][0] + rule["helpUri"] = "https://internal.example.local/security/src/app.py" + rule["help"]["markdown"] = ( + "Avoid ```cursor.execute(user_input)``` and see " + "/synthetic/source-root/synthetic-repo/src/app.py" + ) + result["message"]["text"] = ( + "Observed `cursor.execute(user_input)` in " + "synthetic-org/synthetic-repo/src/app.py" + ) + + finding = import_sarif_payload(payload)[0] + rendered = json.dumps(finding.to_dict()) + + assert "cursor.execute" not in rendered + assert "synthetic-org" not in rendered + assert "synthetic-repo" not in rendered + assert "internal.example.local" not in rendered + assert "" in rendered + assert "" in rendered + assert finding.help_uri == "https://" + + +def test_import_sarif_payload_sanitizes_untrusted_metadata_surfaces(): + payload = _sarif_payload() + malicious_rule_id = ( + "synthetic.rules/synthetic-repo/src/app.py " + "`cursor.execute(user_input)`" + ) + payload["runs"][0]["tool"]["driver"]["name"] = ( + "Tool /synthetic/source-root/runner api_key=not-a-real-value" + ) + payload["runs"][0]["tool"]["driver"]["semanticVersion"] = ( + "1.2.3 /synthetic/source-root/version.txt" + ) + rule = payload["runs"][0]["tool"]["driver"]["rules"][0] + rule["id"] = malicious_rule_id + rule["name"] = "Rule cursor.execute(user_input) /synthetic/source-root/rule.py" + rule["properties"]["tags"] = [ + "security", + "external/cwe/cwe-089", + "OWASP:/synthetic/source-root/top10.yml", + ] + result = payload["runs"][0]["results"][0] + result["ruleId"] = malicious_rule_id + 
result["partialFingerprints"] = { + "synthetic/source-root/key.txt": "/synthetic/source-root/value.txt" + } + + finding = import_sarif_payload(payload)[0] + rendered = json.dumps(finding.to_dict()) + report = render_vulnerability_report([finding]) + prompt = build_vulnerability_prompt(finding) + combined = "\n".join([rendered, report, prompt]) + + assert "cursor.execute" not in combined + assert "source-root" not in combined + assert "not-a-real-value" not in combined + assert "synthetic-repo" not in combined + assert "" in combined + assert "" in combined + assert finding.partial_fingerprints == { + "": finding.partial_fingerprints[""] + } + assert finding.partial_fingerprints[""].startswith("sha256:") + + +def test_import_sarif_payload_redacts_absolute_paths(): + payload = _sarif_payload() + payload["runs"][0]["results"][0]["locations"][0]["physicalLocation"][ + "artifactLocation" + ]["uri"] = "/synthetic/source-root/service/src/app.py" + + finding = import_sarif_payload(payload)[0] + + assert finding.primary_location.path_kind == "absolute-redacted" + assert "/synthetic/source-root" not in finding.primary_location.file_path + assert "service" not in finding.primary_location.file_path + + +def test_import_sarif_payload_can_hash_all_paths_for_private_proof(): + finding = import_sarif_payload(_sarif_payload(), path_policy="redacted")[0] + + assert finding.primary_location.path_kind == "relative-redacted" + assert "src/app.py" not in finding.primary_location.file_path + assert finding.related_locations[0].path_kind == "relative-redacted" + assert "src/db.py" not in finding.related_locations[0].file_path + + +def test_import_sarif_payload_requires_runs_list(): + with pytest.raises(SarifImportError, match="runs list"): + import_sarif_payload({"version": "2.1.0"}) diff --git a/tests/test_semgrep_compatible_adapter.py b/tests/test_semgrep_compatible_adapter.py new file mode 100644 index 0000000..5c4ac70 --- /dev/null +++ b/tests/test_semgrep_compatible_adapter.py @@ 
-0,0 +1,75 @@ +from __future__ import annotations + +import subprocess + +import pytest + +from security_scanner.scanners.semgrep_compatible import ( + SemgrepCompatibleRunner, + SemgrepExecutionError, +) + + +def test_semgrep_runner_builds_sarif_command(tmp_path): + root = tmp_path / "repo" + sarif = tmp_path / "out.sarif" + + cmd = SemgrepCompatibleRunner().build_command(root, sarif_path=sarif) + + assert cmd == [ + "semgrep", + "scan", + "--config", + "auto", + "--metrics=off", + "--sarif", + f"--sarif-output={sarif}", + str(root), + ] + + +def test_semgrep_runner_raises_when_binary_missing(monkeypatch, tmp_path): + def fake_run(*args, **kwargs): + raise FileNotFoundError("missing") + + monkeypatch.setattr(subprocess, "run", fake_run) + + with pytest.raises(SemgrepExecutionError, match="binary not found"): + SemgrepCompatibleRunner().run(tmp_path, sarif_path=tmp_path / "out.sarif") + + +def test_semgrep_runner_returns_written_sarif(monkeypatch, tmp_path): + sarif = tmp_path / "out.sarif" + + def fake_run(*args, **kwargs): + sarif.write_text('{"version":"2.1.0","runs":[]}', encoding="utf-8") + return subprocess.CompletedProcess(args=args[0], returncode=0) + + monkeypatch.setattr(subprocess, "run", fake_run) + + assert SemgrepCompatibleRunner().run(tmp_path, sarif_path=sarif) == ( + '{"version":"2.1.0","runs":[]}' + ) + + +def test_semgrep_runner_sanitizes_failure_output(monkeypatch, tmp_path): + def fake_run(*args, **kwargs): + return subprocess.CompletedProcess( + args=args[0], + returncode=2, + stderr=( + "failed /synthetic/source-root/service/src/app.py " + "token=AKIAFAKEEXAMPLE000001" + ), + ) + + monkeypatch.setattr(subprocess, "run", fake_run) + + with pytest.raises(SemgrepExecutionError) as exc_info: + SemgrepCompatibleRunner().run(tmp_path, sarif_path=tmp_path / "out.sarif") + + message = str(exc_info.value) + assert "/synthetic/source-root" not in message + assert "service/src/app.py" not in message + assert "AKIAFAKEEXAMPLE000001" not in message + 
assert "" in message diff --git a/tests/test_vulnerability_finding.py b/tests/test_vulnerability_finding.py new file mode 100644 index 0000000..e4e2ef6 --- /dev/null +++ b/tests/test_vulnerability_finding.py @@ -0,0 +1,147 @@ +from __future__ import annotations + +import json + +import pytest + +from security_scanner.core.vulnerability.model import ( + VULN_CATEGORY, + VulnerabilityFinding, + VulnerabilityLocation, + compute_vulnerability_finding_id, + location_with_region, +) +from security_scanner.storage.vulnerability_jsonl_store import VulnerabilityJsonlStore + + +def _finding(**overrides) -> VulnerabilityFinding: + defaults = dict( + finding_id="vuln_1234567890abcdef", + source_tool="semgrep", + source_tool_version="1.2.3", + rule_id="python.lang.security.audit.sql-injection", + rule_name="SQL injection", + message="Potential SQL injection.", + severity="HIGH", + precision="HIGH", + security_severity=8.1, + cwe_ids=("CWE-89",), + owasp_tags=("OWASP:A03",), + primary_location=VulnerabilityLocation( + file_path="src/app.py", + line_start=12, + line_end=12, + ), + partial_fingerprints={"primaryLocationLineHash": "abc123"}, + help_uri="https://example.invalid/rules/sql-injection", + help_markdown="Use parameterized queries.", + ) + defaults.update(overrides) + return VulnerabilityFinding(**defaults) + + +def test_vulnerability_finding_round_trips_without_secret_fields(): + finding = _finding() + + data = finding.to_dict() + rendered = json.dumps(data) + + assert data["category"] == VULN_CATEGORY + assert "secretHash" not in rendered + assert "rawSecret" not in rendered + assert "snippet" not in rendered.lower() + assert VulnerabilityFinding.from_dict(data) == finding + + +def test_vulnerability_finding_id_prefers_partial_fingerprint(): + with_fp = compute_vulnerability_finding_id( + source_tool="semgrep", + rule_id="rule.one", + partial_fingerprints={"primaryLocationLineHash": "same"}, + file_path="src/a.py", + line_start=1, + message="first", + ) + moved = 
compute_vulnerability_finding_id( + source_tool="semgrep", + rule_id="rule.one", + partial_fingerprints={"primaryLocationLineHash": "same"}, + file_path="src/b.py", + line_start=200, + message="second", + ) + + assert with_fp == moved + assert with_fp.startswith("vuln_") + + +def test_vulnerability_finding_id_falls_back_to_location_message(): + first = compute_vulnerability_finding_id( + source_tool="semgrep", + rule_id="rule.one", + partial_fingerprints={}, + file_path="src/a.py", + line_start=1, + message="first", + ) + second = compute_vulnerability_finding_id( + source_tool="semgrep", + rule_id="rule.one", + partial_fingerprints={}, + file_path="src/a.py", + line_start=2, + message="first", + ) + + assert first != second + + +def test_absolute_and_escaping_paths_are_redacted(): + absolute = location_with_region( + "/synthetic/source-root/service/src/app.py", + start_line=2, + ) + escaping = location_with_region("../synthetic-repo/src/app.py", start_line=2) + + assert absolute.path_kind == "absolute-redacted" + assert "/synthetic/source-root" not in absolute.file_path + assert "service/src/app.py" not in absolute.file_path + assert escaping.path_kind == "relative-redacted" + assert ".." 
not in escaping.file_path + + +def test_vulnerability_jsonl_store_round_trips_schema_records(tmp_path): + path = tmp_path / "vuln-findings.jsonl" + findings = [_finding(finding_id="vuln_b"), _finding(finding_id="vuln_a")] + + store = VulnerabilityJsonlStore(path) + store.write_all(findings) + + lines = path.read_text(encoding="utf-8").splitlines() + assert json.loads(lines[0])["entityType"] == "VULN_FINDING" + assert json.loads(lines[0])["schemaVersion"] == 1 + assert [item.finding_id for item in store.read_all()] == ["vuln_a", "vuln_b"] + + +def test_vulnerability_jsonl_store_rejects_wrong_entity_type(tmp_path): + path = tmp_path / "vuln-findings.jsonl" + path.write_text( + json.dumps({"entityType": "FINDING", "schemaVersion": 1, "finding": {}}), + encoding="utf-8", + ) + + with pytest.raises(ValueError, match="entityType VULN_FINDING"): + VulnerabilityJsonlStore(path).read_all() + + +def test_vulnerability_jsonl_store_rejects_wrong_nested_category(tmp_path): + path = tmp_path / "vuln-findings.jsonl" + record = { + "entityType": "VULN_FINDING", + "schemaVersion": 1, + "finding": {**_finding().to_dict(), "category": "secret"}, + } + path.write_text(json.dumps(record), encoding="utf-8") + + with pytest.raises(ValueError, match="expected category code-vuln"): + VulnerabilityJsonlStore(path).read_all() diff --git a/tests/test_vulnerability_scan_runtime.py b/tests/test_vulnerability_scan_runtime.py new file mode 100644 index 0000000..b2ab51d --- /dev/null +++ b/tests/test_vulnerability_scan_runtime.py @@ -0,0 +1,78 @@ +from __future__ import annotations + +import json + +from security_scanner.runtime.vulnerability_scan import ( + ScanVulnerabilityRequest, + run_vulnerability_scan, +) +from security_scanner.storage.vulnerability_jsonl_store import VulnerabilityJsonlStore + + +def _sarif_payload() -> dict: + return { + "version": "2.1.0", + "runs": [ + { + "tool": { + "driver": { + "name": "Semgrep OSS", + "rules": [ + { + "id": 
"python.lang.security.audit.sql-injection", + "properties": { + "precision": "high", + "security-severity": "8.2", + }, + } + ], + } + }, + "results": [ + { + "ruleId": "python.lang.security.audit.sql-injection", + "message": {"text": "Potential SQL injection."}, + "locations": [ + { + "physicalLocation": { + "artifactLocation": {"uri": "src/app.py"}, + "region": {"startLine": 12}, + } + } + ], + } + ], + } + ], + } + + +def test_vulnerability_scan_runtime_routes_fake_runner_sarif_through_importer(tmp_path): + root = tmp_path / "repo" + root.mkdir() + output = tmp_path / "vuln-findings.jsonl" + sarif = tmp_path / "scan.sarif" + + class FakeRunner: + def run(self, root_path, *, sarif_path): + assert root_path == root + assert sarif_path == sarif + sarif_path.write_text(json.dumps(_sarif_payload()), encoding="utf-8") + return sarif_path.read_text(encoding="utf-8") + + result = run_vulnerability_scan( + ScanVulnerabilityRequest( + root=root, + output_path=output, + sarif_output_path=sarif, + ), + runner=FakeRunner(), + ) + + findings = VulnerabilityJsonlStore(output).read_all() + assert result.finding_count == 1 + assert result.sarif_path == sarif + assert findings[0].source_tool == "Semgrep OSS" + assert findings[0].rule_id == "python.lang.security.audit.sql-injection" + assert findings[0].primary_location.path_kind == "relative-redacted" + assert "src/app.py" not in findings[0].primary_location.file_path diff --git a/tests/test_vulnerability_verifier.py b/tests/test_vulnerability_verifier.py new file mode 100644 index 0000000..30e38e1 --- /dev/null +++ b/tests/test_vulnerability_verifier.py @@ -0,0 +1,196 @@ +from __future__ import annotations + +import json + +from security_scanner.core.vulnerability.model import ( + VulnerabilityFinding, + VulnerabilityLocation, +) +from security_scanner.llm.common.verifier import VerifierConfig +from security_scanner.llm.vulnerability import ( + VulnerabilityOllamaVerifier, + VulnerabilityVerifierResult, + 
apply_vulnerability_verifier_result, + build_vulnerability_prompt, + parse_vulnerability_verifier_response, +) +from security_scanner.runtime.vulnerability_verify_artifact import ( + VerifyVulnerabilityArtifactRequest, + run_verify_vulnerability_artifact, +) +from security_scanner.storage.vulnerability_jsonl_store import VulnerabilityJsonlStore + +SYNTHETIC_ABSOLUTE_PATH = "/synthetic/source-root/service/src/app.py" + + +def _finding(**overrides) -> VulnerabilityFinding: + defaults = dict( + finding_id="vuln_1234567890abcdef", + source_tool="semgrep", + rule_id="python.lang.security.audit.sql-injection", + rule_name="SQL injection", + message="Potential SQL injection.", + severity="HIGH", + precision="HIGH", + cwe_ids=("CWE-89",), + primary_location=VulnerabilityLocation( + file_path=SYNTHETIC_ABSOLUTE_PATH, + line_start=12, + path_kind="absolute-redacted", + ), + code_flow_count=1, + help_markdown="Use parameterized queries.", + ) + defaults.update(overrides) + return VulnerabilityFinding(**defaults) + + +def test_vulnerability_prompt_excludes_private_path_and_snippets(): + prompt = build_vulnerability_prompt(_finding()) + + assert SYNTHETIC_ABSOLUTE_PATH not in prompt + assert "source-root" not in prompt + assert "service/src/app.py" not in prompt + assert "raw source snippets" in prompt + assert "CWE-89" in prompt + assert "pathFingerprint" in prompt + + +def test_vulnerability_prompt_resanitizes_rule_help(): + prompt = build_vulnerability_prompt( + _finding( + help_markdown=( + "Avoid `cursor.execute(user_input)` in " + "/synthetic/source-root/synthetic-repo/src/app.py" + ) + ) + ) + + assert "cursor.execute" not in prompt + assert "source-root" not in prompt + assert "synthetic-repo" not in prompt + assert "" in prompt + assert "" in prompt + + +def test_vulnerability_response_parsing_fail_closed(): + valid = parse_vulnerability_verifier_response( + json.dumps( + { + "label": "true_positive", + "confidence": 0.91, + "reason": "Rule metadata indicates 
exploitable input flow.", + "remediation": "Use parameterized queries.", + } + ), + min_confidence=0.60, + ) + low = parse_vulnerability_verifier_response( + json.dumps( + { + "label": "false_positive", + "confidence": 0.10, + "reason": "weak", + "remediation": "none", + } + ), + min_confidence=0.60, + ) + invalid = parse_vulnerability_verifier_response("not json", min_confidence=0.60) + + assert valid.verdict == "TRUE_POSITIVE" + assert valid.remediation == "Use parameterized queries." + assert low.verdict == "NEEDS_REVIEW" + assert invalid.verdict == "NEEDS_REVIEW" + + +def test_apply_vulnerability_verifier_result_sanitizes_reason_and_remediation(): + finding = _finding() + verified = apply_vulnerability_verifier_result( + finding, + VulnerabilityVerifierResult( + verdict="TRUE_POSITIVE", + confidence=0.88, + reason=f"Observed risky flow at {SYNTHETIC_ABSOLUTE_PATH}", + remediation=( + f"Patch {SYNTHETIC_ABSOLUTE_PATH} with parameterized queries." + ), + raw_label="true_positive", + ), + ) + rendered = json.dumps(verified.to_dict()) + + assert verified.finding_id == finding.finding_id + assert verified.triage_state == "TRUE_POSITIVE" + assert SYNTHETIC_ABSOLUTE_PATH not in rendered + assert "" in rendered + + +def test_vulnerability_ollama_verifier_uses_redacted_payload(): + captured = {} + + def fake_transport(payload, timeout): + captured["payload"] = payload + return json.dumps( + { + "message": { + "content": json.dumps( + { + "label": "needs_review", + "confidence": 0.71, + "reason": "Metadata lacks enough exploitability context.", + "remediation": "Review input validation.", + } + ) + } + } + ) + + verifier = VulnerabilityOllamaVerifier( + VerifierConfig(host="http://127.0.0.1:11434", model="lfm2.5-thinking"), + transport=fake_transport, + ) + + result = verifier.verify(_finding()) + + rendered = json.dumps(captured["payload"]) + assert SYNTHETIC_ABSOLUTE_PATH not in rendered + assert "source-root" not in rendered + remediation_schema = 
captured["payload"]["format"]["properties"]["remediation"] + assert remediation_schema["type"] == "string" + assert result.verdict == "NEEDS_REVIEW" + + +def test_vulnerability_verify_artifact_runtime_writes_verified_jsonl(tmp_path): + input_path = tmp_path / "vuln-findings.jsonl" + output_path = tmp_path / "verified-vuln-findings.jsonl" + VulnerabilityJsonlStore(input_path).write_all([_finding()]) + + class FakeVerifier: + def __init__(self, config): + self.config = config + + def verify(self, finding): + return VulnerabilityVerifierResult( + verdict="FALSE_POSITIVE", + confidence=0.93, + reason="Synthetic test fixture.", + remediation="No code change required.", + raw_label="false_positive", + ) + + result = run_verify_vulnerability_artifact( + VerifyVulnerabilityArtifactRequest( + findings_path=input_path, + output_path=output_path, + config=VerifierConfig( + host="http://127.0.0.1:11434", + model="lfm2.5-thinking", + ), + ), + verifier_factory=FakeVerifier, + ) + + verified = VulnerabilityJsonlStore(output_path).read_all() + assert result.finding_count == 1 + assert verified[0].triage_state == "FALSE_POSITIVE"
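Reviewer note: the fail-closed behavior exercised by `test_vulnerability_response_parsing_fail_closed` can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the project's `parse_vulnerability_verifier_response`: the names `parse_fail_closed` and `SketchResult` are hypothetical, and the label-to-verdict mapping is inferred only from what the tests assert (malformed JSON, unknown labels, and low-confidence answers all collapse to `NEEDS_REVIEW`).

```python
from __future__ import annotations

import json
from dataclasses import dataclass

# Assumed mapping, inferred from the test expectations in this diff.
_VERDICTS = {
    "true_positive": "TRUE_POSITIVE",
    "false_positive": "FALSE_POSITIVE",
    "needs_review": "NEEDS_REVIEW",
}


@dataclass(frozen=True)
class SketchResult:
    """Hypothetical stand-in for the verifier result type."""

    verdict: str
    confidence: float
    reason: str = ""
    remediation: str = ""


def parse_fail_closed(raw: str, *, min_confidence: float) -> SketchResult:
    """Parse a verifier reply; anything doubtful becomes NEEDS_REVIEW."""
    fallback = SketchResult(verdict="NEEDS_REVIEW", confidence=0.0)
    try:
        data = json.loads(raw)
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return fallback
    if not isinstance(data, dict):
        return fallback

    verdict = _VERDICTS.get(str(data.get("label", "")).lower())
    confidence = data.get("confidence")
    # bool is an int subclass, so reject it explicitly.
    if verdict is None or isinstance(confidence, bool):
        return fallback
    if not isinstance(confidence, (int, float)):
        return fallback
    if not 0.0 <= confidence <= 1.0:  # also rejects NaN
        return fallback

    reason = str(data.get("reason", ""))
    remediation = str(data.get("remediation", ""))
    if confidence < min_confidence:
        # A confident-sounding label below the threshold is not trusted.
        return SketchResult("NEEDS_REVIEW", float(confidence), reason, remediation)
    return SketchResult(verdict, float(confidence), reason, remediation)
```

The design choice this illustrates is that the parser never raises and never upgrades an uncertain reply: the only route to `TRUE_POSITIVE` or `FALSE_POSITIVE` is a well-formed object with a known label and a confidence at or above `min_confidence`.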