2 changes: 1 addition & 1 deletion CURRENT.md
@@ -4,7 +4,7 @@

- Project: `security-scanner`
- Merge mode: `guarded-auto-merge`
- Active goal: `review-assisted-autopilot`
- Active goal: `phase-2a-sarif-product-complete`
- Last auto merge: `ledger:20260617T003405Z-autopilot-3236f4`
- Ledger entries: `4`
- Ledger index hash: `sha256:e1893a649a1101b74a087b5eaaa275813a85708c5bb46c4ae70c24e10a111050`
46 changes: 45 additions & 1 deletion docs/views/research-and-technical-decisions.md
@@ -14,6 +14,46 @@
| LLM verifier | Ollama-compatible adapter | To limit it to a triage-assist layer rather than a detector |
| SAST/SCA expansion | Separate expansion candidate | To prevent scope expansion before the Secret Detection execution path is stable |

## Phase 2a planning note

SAST is currently an opt-in extension, not a default supported feature. As a
follow-up to the GHAS-like vulnerability scanning research, the
[Phase 2a SARIF-native SAST spec](../workbench/specs/phase-2a-sarif-native-sast/requirements.md)
was recorded in the workbench, and the existing M1 import-first packet was
promoted to a long-single-goal for M1~M4 product completion.

The key boundaries are as follows.

- The existing Gitleaks-first Secret Detection default path and the `Finding` model stay unchanged.
- Code vulnerability alerts are considered under a separate `VULN_FINDING` / `VulnerabilityFinding`
  family.
- The SARIF-compatible contract is fixed before any analyzer, and all analyzer output passes through the canonical importer.
- The first execution adapter targets the Semgrep CE-compatible CLI by default but is kept as a Semgrep-compatible boundary.
- `report`, `gate`, and `evaluate` are extended only through an explicit opt-in such as `--category code-vuln`.
- GHAS is kept only as a reference/read-only comparison candidate, with no scan trigger or alert mutation.
- The LLM is not a detector; it is a verifier, explainer, and generic remediation assistant.
- The architecture review gate is a mandatory blocking check at the pre-implementation, post-M2, post-M3, and final stages.
- SCA/SBOM/dependency vulnerability scanning is a separate future track.

An opt-in usage example follows.

```bash
uv run security-scanner import-sarif \
  --sarif examples/code-vuln-semgrep.example.sarif \
  --output private/vuln-findings.jsonl

uv run security-scanner report --category code-vuln --findings private/vuln-findings.jsonl
uv run security-scanner gate --category code-vuln --findings private/vuln-findings.jsonl --max 0
uv run security-scanner evaluate \
  --category code-vuln \
  --expected eval/synthetic-code-vuln/expected-findings.example.json \
  --findings private/vuln-findings.jsonl
```

Because `scan-vuln` targets a real local checkout, it uses
`--path-policy redacted` by default and hashes even relative paths. Only
`import-sarif`, which ingests synthetic SARIF fixtures directly, defaults to
`synthetic`; when a private proof must also hide paths, use
`import-sarif --path-policy redacted`.
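
The difference between the two path policies can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name `apply_path_policy`, the use of SHA-256, the 16-character truncation, and the `path:sha256:` prefix are all assumptions made for the example.

```python
import hashlib
from pathlib import PurePosixPath


def apply_path_policy(relative_path: str, policy: str) -> str:
    """Return the path value to store in an artifact (hypothetical helper)."""
    # Normalize separators so the same file hashes identically on every platform.
    normalized = PurePosixPath(relative_path.replace("\\", "/")).as_posix()
    if policy == "synthetic":
        # Synthetic fixture paths carry no private information, so keep them readable.
        return normalized
    if policy == "redacted":
        # Hash the relative path so directory and file names do not leak.
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        return f"path:sha256:{digest[:16]}"  # prefix and length are assumptions
    raise ValueError(f"unknown path policy: {policy}")
```

Under `redacted`, two findings in the same file still share one stable token, so they can be grouped without revealing the path itself.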

## Roles by tool

- Gitleaks is the baseline tool for finding secret candidates.
@@ -34,7 +74,11 @@

Public documentation describes only tool roles and decision rationale. It excludes private benchmark data, sensitive alert data, internal repository context, and private provider endpoints.

## Noise filter placement decision
## Secret Detection noise filter placement decision

This decision applies only to the Gitleaks-first Secret Detection path.
Normalization, suppression, and triage policy for the Phase 2a `code-vuln`
SAST path are decided in the separate `VULN_FINDING` contract.

| Filter placement candidate | Pros | Cons | Selected |
| --- | --- | --- | --- |
@@ -0,0 +1,287 @@
# Agentic Workflow: Phase 2a SARIF-native SAST Product Completion

**Status:** Ready for long single-goal execution
**Date:** 2026-06-20
**Goal ID:** `phase-2a-sarif-product-complete`
**Spec:** `docs/workbench/specs/phase-2a-sarif-native-sast/requirements.md`
**Design:** `docs/workbench/specs/phase-2a-sarif-native-sast/design.md`
**Merge flow:** pull request

This document does not discard the existing M1 import-first packet. It is an
execution packet that promotes the same execution flow to a long-single-goal for
M1~M4 product completion. The goal is not all of broad Phase 2a but closing out
the countable product slice
`SARIF import -> VULN_FINDING artifact -> report/gate/evaluate -> Semgrep-compatible analyzer -> vulnerability LLM verifier/explainer`.

## Goal

Complete the Phase 2a SARIF-native SAST product slice through PR, CI, and merge.

Completion criteria:

- M1: An `import-sarif` flow normalizes synthetic SARIF 2.1.0 fixtures into a
  deterministic, public-safe `VULN_FINDING` JSONL artifact.
- M2: `report`, `gate`, and `evaluate` handle `VULN_FINDING` artifacts through
  `--category code-vuln` or an equivalent opt-in while keeping the existing secret default.
- M3: A Semgrep-compatible analyzer adapter generates SARIF from a local checkout and
  reuses the existing SARIF importer. The default engine is the Semgrep CE-compatible CLI
  (`semgrep`).
- M4: The vulnerability LLM verifier/explainer receives only redacted
  rule/CWE/OWASP/location-shape metadata and produces strict-JSON, fail-closed review assistance.
- The existing Gitleaks-first Secret Detection default path does not change.
- There is no GHAS scan trigger, SARIF upload, alert mutation, or live fetch.
- The architecture review gate runs before implementation, after M2 and M3, and at the
  final stage before merge, and it must report no blocking findings.
- PR CI and the local governance gates all pass.
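
The M1 criterion asks for a deterministic JSONL artifact. One way to get byte-identical output regardless of input order is to sort records and serialize with sorted keys, as in the sketch below. The function names and the `finding_id` sort key are assumptions for illustration; the actual writer in the repo may differ.

```python
import json
from typing import Iterable, List


def dumps_jsonl(records: Iterable[dict]) -> str:
    """Serialize records to JSONL with a stable record order and key order."""
    ordered = sorted(records, key=lambda record: record["finding_id"])
    lines = [
        # sort_keys and fixed separators keep the bytes independent of dict order.
        json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
        for record in ordered
    ]
    return "".join(line + "\n" for line in lines)


def loads_jsonl(text: str) -> List[dict]:
    """Parse JSONL text back into records, skipping blank lines."""
    # Split on "\n" only: json.dumps escapes newlines inside strings, while
    # str.splitlines() would also split on characters such as U+2028.
    return [json.loads(line) for line in text.split("\n") if line.strip()]
```

With this shape, re-running an import on the same fixture yields an identical file, which makes diffs and hashes of the artifact meaningful.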

## Execution Contract

- Proceed through M1~M4 to completion as a single long-running goal.
- No user approval is required at intermediate milestones.
- Request human intervention only when a stop condition occurs.
- Use subagents actively. Implementation workers use `gpt-5.5` with `reasoning_effort: high`;
  auxiliary coding/review follows repo policy.
- Create the PR, get CI passing, and close out to a mergeable state.
- Do not commit real endpoints, hosts, credentials, private paths, real SARIF/GHAS exports, or
  real finding output.

## Fixed Decisions

- Scope: M1~M4 product-complete slice.
- Persistence: JSONL artifact-first. A DynamoDB-compatible projection or storage schema
  migration is outside this goal; the moment one becomes necessary is a stop condition.
- Category wire value: `code-vuln`.
- Model boundary: `core/vulnerability` or an equivalent separate module.
- `finding_id`: prefer SARIF `partialFingerprints`; if absent, use a stable hash of
  `source_tool + rule_id + normalized synthetic path + start_line + message`.
- Path handling: committed fixture imports allow synthetic-only paths.
  A `scan-vuln` scan of a real checkout uses the `redacted` path policy by default and
  hashes even relative paths.
- Analyzer: the Semgrep CE-compatible CLI is the default execution target. The adapter
  boundary stays Semgrep-compatible to leave room for swapping in OpenGrep.
- Gate policy: severity + precision threshold. The default fail criteria are set
  conservatively during implementation to fit the synthetic fixtures and SARIF metadata,
  and are not mixed with the secret `gate --max 0` semantics.
- LLM: not a detector; verifier/explainer only. Strict JSON, confidence threshold,
  fail-closed `NEEDS_REVIEW` behavior. Raw code snippets are prohibited.
- GHAS: out of scope. Live fetch, upload, and mutation are prohibited.
- SCA/SBOM: out of scope.
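
The `finding_id` rule above can be sketched as follows. This is a hedged illustration only: the function name, the `vuln:` prefix, the 32-character truncation, and the choice to mix `source_tool` into the fingerprint branch are assumptions, not decisions recorded in the spec.

```python
import hashlib
import json
from typing import Mapping, Optional


def compute_finding_id(
    source_tool: str,
    rule_id: str,
    normalized_path: str,
    start_line: int,
    message: str,
    partial_fingerprints: Optional[Mapping[str, str]] = None,
) -> str:
    """Derive a stable ID, preferring SARIF partialFingerprints when present."""
    if partial_fingerprints:
        # Sort the items so the ID does not depend on key order in the SARIF file.
        material = ["fingerprint", source_tool, sorted(partial_fingerprints.items())]
    else:
        # Fallback fields listed in the fixed decision above.
        material = ["fallback", source_tool, rule_id, normalized_path, start_line, message]
    # Encode as JSON rather than joining with "+" so field boundaries stay unambiguous.
    payload = json.dumps(material, separators=(",", ":"), ensure_ascii=False)
    return "vuln:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()[:32]
```

Serializing the fields as a JSON list avoids collisions such as `("ab", "c")` versus `("a", "bc")`, which plain string concatenation would hash to the same value.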

## Required Architecture Review Gate

Architecture review is mandatory and blocking.

Required checkpoints:

1. **Pre-implementation architecture review**
   - Review spec/design/workflow before code changes.
   - Confirm M1~M4 scope, module seams, write surface, stop conditions.

2. **Post-M2 codebase architecture review**
   - After `VULN_FINDING` model, artifact, report/gate/evaluate integration.
   - Confirm secret `Finding` remains isolated and no default behavior changed.

3. **Post-M3 adapter architecture review**
   - After Semgrep-compatible adapter wiring.
   - Confirm adapter emits/consumes SARIF through the canonical importer and no
     analyzer-specific JSON becomes the internal contract.

4. **Final architecture review**
   - Before PR ready/merge.
   - Confirm M1~M4 evidence, tests, public-safety, rollback boundaries, and docs.

Blocking architecture findings stop the run only if they require SoT change,
scope expansion, unsafe data handling, or existing secret workflow behavior change.
Otherwise, the implementing agent fixes them inside the same goal.

## Multi-agent Execution Model

Use subagents by disjoint responsibility. Main agent integrates results and owns
final judgment.

| Role | Responsibility | Write scope |
| --- | --- | --- |
| `system_architecture_manager` | Architecture gate, SoT drift, product/system boundary | read-only |
| `codebase_architecture_manager` | Codebase seam review, module locality, implementation architecture | read-only |
| Worker A | `VulnerabilityFinding`, SARIF importer, JSONL artifact | `src/security_scanner/core/vulnerability/**`, tests |
| Worker B | CLI and artifact report/gate/evaluate integration | `src/security_scanner/cli/**`, `src/security_scanner/core/report/**`, `src/security_scanner/core/policy/**`, tests |
| Worker C | Semgrep-compatible adapter | `src/security_scanner/scanners/**`, tests |
| Worker D | Vulnerability verifier/explainer | `src/security_scanner/llm/**`, tests |
| Reviewer | Public-safety/security review | read-only |
| `code_simplifier` | Final clarity/refactor pass preserving behavior | touched implementation files only |

Workers are not alone in the codebase. They must not revert other workers'
edits; they adapt to integrated changes.

## Allowed Write Surface

The executing agent uses `allowed_writes` in `governance/autopilot_goal.yml` as the
authoritative scope. In summary, only the following surfaces may change.

- `docs/workbench/specs/phase-2a-sarif-native-sast/**`
- `docs/workbench/agentic-workflows/2026-06-20-phase-2a-sarif-import-first-goal.md`
- `docs/views/research-and-technical-decisions.md`
- `src/security_scanner/**`
- `tests/**`
- `examples/**`
- `eval/**`
- `governance/**`
- `ledger/**`
- `CURRENT.md`

If a change outside this scope becomes necessary, treat it as scope expansion and stop.
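
As a rough illustration of how a changed-file list could be checked against this surface, the sketch below matches paths against the patterns listed above. It is not the project's gate: the real check lives in `governance.autopilot_gate` and reads `governance/autopilot_goal.yml`, whose matching rules are not shown here. The use of `fnmatch`-style matching, where `*` also crosses `/`, is an assumption.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence

# Patterns copied from the summary list above.
ALLOWED_WRITES = [
    "docs/workbench/specs/phase-2a-sarif-native-sast/**",
    "docs/workbench/agentic-workflows/2026-06-20-phase-2a-sarif-import-first-goal.md",
    "docs/views/research-and-technical-decisions.md",
    "src/security_scanner/**",
    "tests/**",
    "examples/**",
    "eval/**",
    "governance/**",
    "ledger/**",
    "CURRENT.md",
]


def out_of_scope(
    changed_paths: Iterable[str], allowed: Sequence[str] = ALLOWED_WRITES
) -> List[str]:
    """Return the changed paths that match none of the allowed patterns."""
    return [
        path
        for path in changed_paths
        if not any(fnmatchcase(path, pattern) for pattern in allowed)
    ]
```

A non-empty result corresponds to the scope-expansion stop condition described above.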

## Suggested Work Plan

### Readiness Gate

1. Read current contracts.
   - `AGENTS.md`
   - `governance/autopilot_goal.yml`
   - this workflow document
   - `docs/workbench/specs/phase-2a-sarif-native-sast/requirements.md`
   - `docs/workbench/specs/phase-2a-sarif-native-sast/design.md`
   - `docs/workbench/specs/phase-2a-sarif-native-sast/review.md`
2. Run pre-implementation architecture review gate.
3. Confirm working tree isolation and allowed write surface.

### M1 SARIF import and artifact

1. Add failing tests first:
   - `VulnerabilityFinding` model serializes without secret fields.
   - SARIF importer handles minimal result, multiple rules, missing optional metadata,
     `partialFingerprints`, and code flow count.
   - JSONL writer/reader round-trips `VULN_FINDING` records deterministically.
   - `import-sarif` writes JSONL artifact from synthetic fixture.
   - `scan-vuln` defaults to `--path-policy redacted` for private-proof artifacts.
   - `import-sarif --path-policy redacted` hashes relative paths for private SARIF
     proof artifacts.
   - Existing `scan`, `report`, `gate`, `evaluate` default tests still pass.
2. Implement:
   - `core/vulnerability` or equivalent module.
   - stdlib JSON SARIF importer.
   - JSONL artifact writer/reader.
   - opt-in `import-sarif` CLI.
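
A stdlib-only importer covering the cases in the test list (minimal result, missing optional metadata, `partialFingerprints`, code flow count) might look like the sketch below. The SARIF property names (`runs`, `results`, `ruleId`, `locations`, `physicalLocation`, `partialFingerprints`, `codeFlows`) come from SARIF 2.1.0; the output field names and the `import_sarif` function are hypothetical and do not describe the repo's actual `VulnerabilityFinding` model.

```python
import json
from typing import Any, Dict, List


def import_sarif(sarif_text: str) -> List[Dict[str, Any]]:
    """Flatten SARIF 2.1.0 results into plain dict records (illustrative shape)."""
    document = json.loads(sarif_text)
    if document.get("version") != "2.1.0":
        raise ValueError("only SARIF 2.1.0 is supported in this sketch")
    records: List[Dict[str, Any]] = []
    for run in document.get("runs", []):
        tool_name = run.get("tool", {}).get("driver", {}).get("name", "unknown")
        for result in run.get("results", []):
            # Optional metadata may be missing, so every lookup has a default.
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            records.append(
                {
                    "record_type": "VULN_FINDING",
                    "source_tool": tool_name,
                    "rule_id": result.get("ruleId", "unknown"),
                    # "warning" as the fallback level is an assumption here.
                    "level": result.get("level", "warning"),
                    "message": result.get("message", {}).get("text", ""),
                    "path": physical.get("artifactLocation", {}).get("uri"),
                    "start_line": physical.get("region", {}).get("startLine"),
                    "partial_fingerprints": result.get("partialFingerprints", {}),
                    "code_flow_count": len(result.get("codeFlows", [])),
                }
            )
    return records
```

Only the count of code flows is kept, not their contents, which fits the goal of not persisting raw source snippets.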

### M2 Report, gate, evaluate

1. Add tests for `--category code-vuln` or equivalent opt-in.
2. Implement artifact-based report/gate/evaluate without changing secret defaults.
3. Gate uses severity + precision policy, isolated from secret `gate --max 0`.
4. Run post-M2 codebase architecture review gate.
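
The workflow leaves the concrete gate thresholds to implementation, so the following is only one possible reading of "severity + precision": a finding blocks when both its SARIF `level` and a per-rule `precision` label meet a floor. The rank tables, the default floors, the `precision` field, and the choice to treat missing metadata as blocking are all assumptions made for this sketch.

```python
from typing import Dict, Iterable, List

# Rank tables are illustrative; the project may order or name these differently.
SEVERITY_RANK = {"note": 0, "warning": 1, "error": 2}
PRECISION_RANK = {"low": 0, "medium": 1, "high": 2, "very-high": 3}


def blocking_findings(
    findings: Iterable[Dict[str, str]],
    min_severity: str = "error",
    min_precision: str = "high",
) -> List[Dict[str, str]]:
    """Return findings that meet both the severity and the precision floor."""
    severity_floor = SEVERITY_RANK[min_severity]
    precision_floor = PRECISION_RANK[min_precision]
    blocked = []
    for finding in findings:
        # Conservative default: unknown or missing metadata counts as the top rank.
        severity = SEVERITY_RANK.get(finding.get("level", ""), SEVERITY_RANK["error"])
        precision = PRECISION_RANK.get(
            finding.get("precision", ""), PRECISION_RANK["very-high"]
        )
        if severity >= severity_floor and precision >= precision_floor:
            blocked.append(finding)
    return blocked
```

Keeping this in its own function, separate from the secret gate's count-based check, mirrors the requirement that the two policies stay isolated.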

### M3 Semgrep-compatible adapter

1. Add tests using fake command runner and synthetic output.
2. Implement Semgrep-compatible CLI adapter that emits SARIF.
3. Route analyzer output through the canonical SARIF importer.
4. Sanitize analyzer failure stdout/stderr before surfacing CLI errors.
5. Do not make analyzer-specific JSON the internal contract.
6. Run post-M3 adapter architecture review gate.
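
The M3 steps (fake command runner, SARIF output, sanitized failures) suggest an adapter whose process execution is injectable. The sketch below shows that shape under stated assumptions: the names `run_analyzer`, `Runner`, and `AnalyzerError` are hypothetical, the argv `semgrep scan --sarif <target>` should be verified against the installed Semgrep-compatible CLI, and the path-scrubbing regex only covers POSIX-style absolute paths.

```python
import re
import subprocess
from typing import Callable, Sequence, Tuple

# A runner takes an argv and returns (returncode, stdout, stderr).
Runner = Callable[[Sequence[str]], Tuple[int, str, str]]


class AnalyzerError(RuntimeError):
    """Raised when the analyzer process fails; the message is already sanitized."""


def subprocess_runner(argv: Sequence[str]) -> Tuple[int, str, str]:
    """Default runner that executes the real CLI."""
    completed = subprocess.run(list(argv), capture_output=True, text=True, check=False)
    return completed.returncode, completed.stdout, completed.stderr


def sanitize(text: str, limit: int = 500) -> str:
    """Replace absolute POSIX paths with a placeholder and cap the length."""
    return re.sub(r"(?<![\w.])/[^\s:'\"]+", "<path>", text)[:limit]


def run_analyzer(
    target: str, runner: Runner = subprocess_runner, engine: str = "semgrep"
) -> str:
    """Run the analyzer and return SARIF text for the canonical importer."""
    argv = [engine, "scan", "--sarif", target]  # flags are an assumption
    returncode, stdout, stderr = runner(argv)
    if returncode != 0:
        raise AnalyzerError(f"{engine} exited with {returncode}: {sanitize(stderr)}")
    return stdout
```

In tests, a fake runner returning canned synthetic SARIF replaces `subprocess_runner`, so no real scan output is needed. The `engine` parameter leaves room for a compatible binary.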

### M4 Vulnerability verifier/explainer

1. Add tests for redacted prompt/input:
   - no raw source snippet
   - no private path
   - no real repo name
   - strict JSON
   - invalid output/timeout/low confidence => review-needed
2. Implement vulnerability-specific prompt/application adapter.
3. Reuse only shared strict JSON / confidence / fail-closed behavior from secret verifier.
4. Do not auto-dismiss, suppress, or patch code from LLM output.
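
The fail-closed behavior tested in step 1 can be sketched as a parser that returns `NEEDS_REVIEW` for anything other than a well-formed, sufficiently confident answer. Only `NEEDS_REVIEW` comes from this document; the other verdict labels, the two-key schema, the `0.8` threshold, and the function name are assumptions for illustration.

```python
import json
from typing import Any, Dict, Optional

# Only NEEDS_REVIEW is named in the workflow; the other labels are hypothetical.
ALLOWED_VERDICTS = {"LIKELY_TRUE_POSITIVE", "LIKELY_FALSE_POSITIVE", "NEEDS_REVIEW"}


def parse_verdict(raw_output: Optional[str], min_confidence: float = 0.8) -> Dict[str, Any]:
    """Parse model output strictly; any doubt collapses to NEEDS_REVIEW."""
    needs_review = {"verdict": "NEEDS_REVIEW", "confidence": 0.0}
    if raw_output is None:  # timeout or transport failure
        return needs_review
    try:
        data = json.loads(raw_output)
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return needs_review
    # Strict JSON: exactly the expected keys, nothing extra.
    if not isinstance(data, dict) or set(data) != {"verdict", "confidence"}:
        return needs_review
    verdict, confidence = data["verdict"], data["confidence"]
    if not isinstance(verdict, str) or verdict not in ALLOWED_VERDICTS:
        return needs_review
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return needs_review
    if not 0.0 <= confidence <= 1.0 or confidence < min_confidence:
        return needs_review
    return {"verdict": verdict, "confidence": float(confidence)}
```

The result is advisory only: consistent with step 4, nothing in this path dismisses, suppresses, or patches a finding.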

### Finalization

1. Update docs with current-support language that stays opt-in/experimental.
2. Run final architecture review gate.
3. Run full checks.
4. Stage files by name, commit, push PR, wait for CI, merge when green.

## Required Local Checks

Run targeted checks by milestone:

```bash
uv run pytest tests/test_vulnerability_finding.py tests/test_sarif_importer.py tests/test_cli_import_sarif.py
uv run pytest tests/test_code_vuln_report_gate.py tests/test_vulnerability_evaluation.py
uv run pytest tests/test_semgrep_compatible_adapter.py
uv run pytest tests/test_vulnerability_verifier.py
```

Run full checks before PR creation and before marking the goal complete:

```bash
uv run pytest
uv run python -m governance.render --validate
uv run python -m governance.render --check
uv run python -m governance.rebuild_ledger_index --check
uv run python -m governance.render_github_ruleset --output governance/main_ruleset.json --check
uv run python -m governance.public_safety --diff origin/main...HEAD
uv run python -m governance.public_safety --path docs/workbench/specs/phase-2a-sarif-native-sast --path docs/views/research-and-technical-decisions.md
uv run python -m governance.autopilot_gate --base origin/main
```

Architecture gate evidence must be captured in the PR summary or a workbench
review note before merge.

## Stop Conditions

Stop and ask for human input only when one of these occurs.

- public-safety hit that cannot be resolved without deleting or redacting committed content
- required implementation path outside `allowed_writes`
- architecture review requires SoT change
- Semgrep-compatible adapter cannot be implemented without committing real scan output
- need to persist real private paths, raw source snippets, real SARIF, or real GHAS export
- GHAS trigger, upload, alert mutation, dismissal, or live fetch becomes necessary
- existing secret scan/report/gate/evaluate default would change
- storage projection or schema migration becomes necessary
- protected branch or PR permission failure
- same blocker repeats three times with no new evidence
- break-glass state becomes active

## Resume Prompt

Use this prompt to start or resume the long run.

```text
Goal: complete `phase-2a-sarif-product-complete` in the security-scanner repo through a PR.

Read first:
- AGENTS.md
- governance/autopilot_goal.yml
- docs/workbench/agentic-workflows/2026-06-20-phase-2a-sarif-import-first-goal.md
- docs/workbench/specs/phase-2a-sarif-native-sast/requirements.md
- docs/workbench/specs/phase-2a-sarif-native-sast/design.md
- docs/workbench/specs/phase-2a-sarif-native-sast/review.md
- src/security_scanner/core/finding/model.py
- src/security_scanner/storage/jsonl_store.py
- src/security_scanner/cli/app.py
- src/security_scanner/cli/commands/scan.py
- src/security_scanner/llm/common/verifier.py
- src/security_scanner/scanners/gitleaks/runner.py

Implement M1~M4 product slice:
M1 SARIF import -> VULN_FINDING JSONL artifact.
M2 report/gate/evaluate --category code-vuln from artifact.
M3 Semgrep-compatible adapter emitting SARIF through the canonical importer.
M4 vulnerability verifier/explainer using redacted metadata only.

Use multi-agent execution aggressively. Mandatory architecture gates:
pre-implementation, post-M2, post-M3, final. Do not ask for approval unless a
listed stop condition occurs. Keep existing Gitleaks-first secret defaults
unchanged. Do not call GHAS, upload SARIF, mutate alerts, persist raw snippets,
or commit real scan data. Finish by opening a PR, waiting for CI, and merging
when green.

Required checks:
- uv run pytest
- uv run python -m governance.render --validate
- uv run python -m governance.render --check
- uv run python -m governance.rebuild_ledger_index --check
- uv run python -m governance.render_github_ruleset --output governance/main_ruleset.json --check
- uv run python -m governance.public_safety --diff origin/main...HEAD
- uv run python -m governance.public_safety --path docs/workbench/specs/phase-2a-sarif-native-sast --path docs/views/research-and-technical-decisions.md
- uv run python -m governance.autopilot_gate --base origin/main
```