feat(sast): add SARIF-native code vulnerability path#47
Conversation
SARIF 기반 VULN_FINDING 모델과 JSONL 저장소, import-sarif/scan-vuln 명령, code-vuln report/gate/evaluate/verify 흐름을 추가한다. Semgrep-compatible adapter는 SARIF만 canonical importer로 넘기고, vulnerability verifier는 redacted metadata만 사용한다. Architecture review gate에서 발견된 SARIF metadata 누출 위험과 scan-vuln path policy 기본값을 수정하고 회귀 테스트를 추가했다. Co-Authored-By: Codex GPT-5 <noreply@openai.com>
| "Do not request or reveal raw source snippets, repository names, hosts, " | ||
| "or paths.", |
| "Return strict JSON only with keys: label, confidence, reason, " | ||
| "remediation.", |
There was a problem hiding this comment.
Code Review
This pull request implements Phase 2a SARIF-native SAST support as an opt-in extension (--category code-vuln), introducing a new VulnerabilityFinding domain model, a SARIF 2.1.0 importer with path redaction, JSONL artifact persistence, a Semgrep-compatible analyzer adapter, and a redacted-metadata-based LLM verifier. The reviewer's feedback focuses on enhancing robustness and error handling across these new modules. Key recommendations include wrapping JSON parsing, file reading, and model instantiation in defensive try...except blocks to prevent raw tracebacks, verifying the existence of the scan root directory, handling malformed UTF-8 characters during LLM response decoding, raising explicit errors for missing input files, and adding comprehensive unit tests for unexpected types in parsed JSON fields.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| if int(record.get("schemaVersion", 0)) != VULN_SCHEMA_VERSION: | ||
| raise ValueError( | ||
| f"line {line_number}: unsupported schemaVersion " | ||
| f"{record.get('schemaVersion')}" | ||
| ) | ||
| finding = record.get("finding") | ||
| if not isinstance(finding, dict): | ||
| raise ValueError(f"line {line_number}: finding must be an object") | ||
| if finding.get("category", VULN_CATEGORY) != VULN_CATEGORY: | ||
| raise ValueError(f"line {line_number}: expected category {VULN_CATEGORY}") | ||
| return VulnerabilityFinding.from_dict(finding) |
There was a problem hiding this comment.
Wrap the parsing of schemaVersion and the instantiation of VulnerabilityFinding in a try...except block to catch KeyError, TypeError, and ValueError. This prevents unhandled exceptions from bubbling up to the CLI and causing raw tracebacks when encountering malformed JSONL records.
| if int(record.get("schemaVersion", 0)) != VULN_SCHEMA_VERSION: | |
| raise ValueError( | |
| f"line {line_number}: unsupported schemaVersion " | |
| f"{record.get('schemaVersion')}" | |
| ) | |
| finding = record.get("finding") | |
| if not isinstance(finding, dict): | |
| raise ValueError(f"line {line_number}: finding must be an object") | |
| if finding.get("category", VULN_CATEGORY) != VULN_CATEGORY: | |
| raise ValueError(f"line {line_number}: expected category {VULN_CATEGORY}") | |
| return VulnerabilityFinding.from_dict(finding) | |
| try: | |
| schema_version = int(record.get("schemaVersion") or 0) | |
| except (ValueError, TypeError): | |
| raise ValueError(f"line {line_number}: invalid schemaVersion") | |
| if schema_version != VULN_SCHEMA_VERSION: | |
| raise ValueError( | |
| f"line {line_number}: unsupported schemaVersion {schema_version}" | |
| ) | |
| finding = record.get("finding") | |
| if not isinstance(finding, dict): | |
| raise ValueError(f"line {line_number}: finding must be an object") | |
| if finding.get("category", VULN_CATEGORY) != VULN_CATEGORY: | |
| raise ValueError(f"line {line_number}: expected category {VULN_CATEGORY}") | |
| try: | |
| return VulnerabilityFinding.from_dict(finding) | |
| except (KeyError, TypeError, ValueError) as exc: | |
| raise ValueError(f"line {line_number}: invalid finding data: {exc}") from exc |
| data = json.loads(Path(path).read_text(encoding="utf-8")) | ||
| schema_version = int(data.get("schemaVersion", 1)) | ||
| if schema_version != 1: | ||
| raise ValueError( | ||
| f"unsupported vulnerability corpus schemaVersion: {schema_version}" | ||
| ) | ||
| raw = data.get("expectedFindings", []) | ||
| if not isinstance(raw, list): | ||
| raise ValueError("vulnerability corpus expectedFindings must be a list") | ||
| return [VulnerabilityExpectedFinding.from_dict(item) for item in raw] |
There was a problem hiding this comment.
Ensure robustness when loading expected findings by wrapping the file reading, JSON parsing, and item instantiation in defensive try...except blocks. This prevents raw TypeError, KeyError, or json.JSONDecodeError tracebacks from being shown to the user.
try:
content = Path(path).read_text(encoding="utf-8")
except OSError as exc:
raise ValueError(f"failed to read expected findings file: {exc}") from exc
try:
data = json.loads(content)
except json.JSONDecodeError as exc:
raise ValueError(f"invalid expected findings JSON: {exc}") from exc
if not isinstance(data, dict):
raise ValueError("expected findings must be a JSON object")
try:
schema_version = int(data.get("schemaVersion") or 1)
except (ValueError, TypeError):
raise ValueError("invalid schemaVersion in expected findings")
if schema_version != 1:
raise ValueError(
f"unsupported vulnerability corpus schemaVersion: {schema_version}"
)
raw = data.get("expectedFindings", [])
if not isinstance(raw, list):
raise ValueError("vulnerability corpus expectedFindings must be a list")
findings = []
for idx, item in enumerate(raw, start=1):
if not isinstance(item, dict):
raise ValueError(f"expectedFindings item {idx} must be an object")
try:
findings.append(VulnerabilityExpectedFinding.from_dict(item))
except (KeyError, TypeError, ValueError) as exc:
raise ValueError(f"invalid expectedFindings item {idx}: {exc}") from exc
return findings| root = Path(root) | ||
| output = Path(sarif_path) |
There was a problem hiding this comment.
Verify that the root path exists and is a directory before executing the subprocess. If the directory does not exist, subprocess.run will raise a FileNotFoundError which would misleadingly be reported as the binary not being found.
root = Path(root)
if not root.is_dir():
raise SemgrepExecutionError(f"scan root directory does not exist: {root}")
output = Path(sarif_path)| method="POST", | ||
| ) | ||
| with urllib.request.urlopen(request, timeout=timeout_seconds) as response: # noqa: S310 | ||
| return response.read().decode("utf-8") |
There was a problem hiding this comment.
Use errors="replace" or errors="ignore" when decoding the HTTP response from the Ollama/LLM server. This prevents the verifier from crashing with a UnicodeDecodeError if the response contains malformed UTF-8 characters.
| return response.read().decode("utf-8") | |
| return response.read().decode("utf-8", errors="replace") |
| if input_path.resolve() == output_path.resolve(): | ||
| raise ValueError("--output must be different from --findings") | ||
| findings = VulnerabilityJsonlStore(input_path).read_all() |
There was a problem hiding this comment.
Raise a FileNotFoundError if the input findings file does not exist. Silently returning an empty list and creating an empty verified file can mask user errors such as typos in the input path.
if input_path.resolve() == output_path.resolve():
raise ValueError("--output must be different from --findings")
if not input_path.is_file():
raise FileNotFoundError(f"findings file not found: {input_path}")
findings = VulnerabilityJsonlStore(input_path).read_all()| try: | ||
| payload = json.loads(Path(path).read_text(encoding="utf-8")) | ||
| except json.JSONDecodeError as exc: | ||
| raise SarifImportError(f"invalid SARIF JSON: {exc.msg}") from exc |
There was a problem hiding this comment.
Wrap the file reading in a try...except OSError block to raise a SarifImportError (which inherits from ValueError) instead of letting FileNotFoundError or PermissionError bubble up. This ensures that the CLI commands can cleanly catch the error and print a user-friendly message instead of a raw traceback.
try:
content = Path(path).read_text(encoding="utf-8")
except OSError as exc:
raise SarifImportError(f"failed to read SARIF file: {exc}") from exc
try:
payload = json.loads(content)
except json.JSONDecodeError as exc:
raise SarifImportError(f"invalid SARIF JSON: {exc.msg}") from exc| def _safe_properties(properties: dict[str, object]) -> dict[str, object]: | ||
| return { | ||
| sanitize_vulnerability_identifier(key, fallback="property"): _safe_value(value) | ||
| for key, value in properties.items() | ||
| } |
There was a problem hiding this comment.
Defensively handle cases where properties is None or not a dictionary to prevent runtime AttributeError when calling .items().
| def _safe_properties(properties: dict[str, object]) -> dict[str, object]: | |
| return { | |
| sanitize_vulnerability_identifier(key, fallback="property"): _safe_value(value) | |
| for key, value in properties.items() | |
| } | |
| def _safe_properties(properties: dict[str, object] | None) -> dict[str, object]: | |
| if not isinstance(properties, dict): | |
| return {} | |
| return { | |
| sanitize_vulnerability_identifier(key, fallback="property"): _safe_value(value) | |
| for key, value in properties.items() | |
| } |
| def test_vulnerability_jsonl_store_rejects_wrong_nested_category(tmp_path): | ||
| path = tmp_path / "vuln-findings.jsonl" | ||
| record = { | ||
| "entityType": "VULN_FINDING", | ||
| "schemaVersion": 1, | ||
| "finding": {**_finding().to_dict(), "category": "secret"}, | ||
| } | ||
| path.write_text(json.dumps(record), encoding="utf-8") | ||
|
|
||
| with pytest.raises(ValueError, match="expected category code-vuln"): | ||
| VulnerabilityJsonlStore(path).read_all() |
There was a problem hiding this comment.
Add a comprehensive unit test covering unexpected types (such as None, numbers, booleans, lists, and dictionaries) for parsed JSON fields to ensure robust defensive type validation.
def test_vulnerability_jsonl_store_rejects_wrong_nested_category(tmp_path):
path = tmp_path / "vuln-findings.jsonl"
record = {
"entityType": "VULN_FINDING",
"schemaVersion": 1,
"finding": {**_finding().to_dict(), "category": "secret"},
}
path.write_text(json.dumps(record), encoding="utf-8")
with pytest.raises(ValueError, match="expected category code-vuln"):
VulnerabilityJsonlStore(path).read_all()
@pytest.mark.parametrize(
"invalid_value",
[None, 123, True, [], {}],
)
def test_vulnerability_finding_defensive_type_validation(tmp_path, invalid_value):
path = tmp_path / "vuln-findings.jsonl"
record = {
"entityType": "VULN_FINDING",
"schemaVersion": 1,
"finding": {**_finding().to_dict(), "findingId": invalid_value},
}
path.write_text(json.dumps(record), encoding="utf-8")
with pytest.raises(ValueError):
VulnerabilityJsonlStore(path).read_all()References
- When implementing defensive type validation for parsed JSON fields (e.g., verifying a field is a string), ensure robustness by adding comprehensive unit tests that cover various unexpected types, including None, numbers, booleans, lists, and dictionaries.
…-hardening fix(sast): harden vulnerability redaction to deny-by-default (PR #47 review)
Purpose & Motivation
Add the opt-in Phase 2a SARIF-native SAST product path so the scanner can normalize code vulnerability findings without changing the existing Gitleaks-first secret detection default.
No linked issue. This PR implements the approved long-goal scope from the in-thread Phase 2a research and architecture review packet.
Context
This keeps secret findings and
VULN_FINDINGseparate. SARIF is the canonical interchange contract, and analyzer execution is limited to a Semgrep-compatible subprocess adapter that emits SARIF and feeds the importer.scan-vulndefaults to redacted path policy for real local checkouts, whileimport-sarifkeeps synthetic paths by default for committed fixtures.LLM support is verifier/explainer only. The vulnerability verifier uses redacted metadata and fail-closed parsing; it does not detect vulnerabilities or generate concrete code patches.
Note
Architecture review gates were mandatory and completed at pre-implementation, post-M2, post-M3, and final. Final gates initially blocked on SARIF metadata leakage risk and
scan-vulnpath policy; both were fixed before this PR.Dependency
No new Python dependency is added. A Semgrep-compatible binary is only required when a user runs
scan-vuln; SARIF import/report/gate/evaluate tests do not require it.Verification
uv run pytest(783 passed)uv run ruff check ...on changed code/test pathsuv run python -m governance.render --validateuv run python -m governance.render --checkuv run python -m governance.rebuild_ledger_index --checkuv run python -m governance.render_github_ruleset --output governance/main_ruleset.json --checkuv run python -m governance.public_safety --diff origin/main...HEADuv run python -m governance.public_safety --path ...on changed public artifactsuv run python -m governance.autopilot_gate --base origin/mainimport-sarif,report --category code-vuln,evaluate --category code-vuln,gate --category code-vuln --max 1Checklist