[codex] Implement Issue #6 CORE NoSQL schema split by pureliture · Pull Request #11 · source-security-dev/security-scanner

pureliture · 2026-06-12T01:21:44Z

Summary

Reworked the DynamoDB-compatible storage structure from Issue #6, limited to the CORE scope.

The key change is splitting the existing run-scoped FINDING row into three roles.

FINDING: stable finding identity
FINDING_OBSERVATION: finding snapshot observed in a scan run
FINDING_STATE: lifecycle / triage state

The REPO_META and SCAN_RUN flows are unchanged, and read_for_scan_run() / read_all() still restore runtime Finding objects with the same semantics.

Schema changes

flowchart LR
    R["REPO_META<br/>PK=REPO#repoKey<br/>SK=META"]
    S["SCAN_RUN<br/>PK=REPO#repoKey<br/>SK=SCAN_RUN#scanAtIso#scanRunId"]
    I["FINDING<br/>PK=FINDING#findingId<br/>SK=META"]
    O["FINDING_OBSERVATION<br/>PK=RUN#scanRunId<br/>SK=OBS#findingId#occurrenceKey"]
    T["FINDING_STATE<br/>PK=FINDING#findingId<br/>SK=STATE#GLOBAL"]

    R --> S
    S --> O
    O --> I
    I --> T
    O -. "state overlay on read path" .-> T
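
The key layout in the diagram can be sketched as a small item builder. This is only an illustration of the PK/SK patterns shown above: the function name `split_finding_items`, the `entityType` attribute, and the value of `STATE_SCOPE_GLOBAL` are assumptions, not the PR's actual `finding_to_items()` implementation.

```python
from typing import Any

# The constant name appears in the PR; its value here is assumed.
STATE_SCOPE_GLOBAL = "GLOBAL"


def split_finding_items(
    finding_id: str,
    scan_run_id: str,
    occurrence_key: str,
    snapshot: dict[str, Any],
) -> list[dict[str, Any]]:
    """Build identity, observation, and state rows for one finding (hypothetical helper)."""
    identity = {
        "PK": f"FINDING#{finding_id}",
        "SK": "META",
        "entityType": "FINDING",
    }
    observation = {
        **snapshot,  # run-scoped snapshot attributes; keys below take precedence
        "PK": f"RUN#{scan_run_id}",
        "SK": f"OBS#{finding_id}#{occurrence_key}",
        "entityType": "FINDING_OBSERVATION",
    }
    state = {
        "PK": f"FINDING#{finding_id}",
        "SK": f"STATE#{STATE_SCOPE_GLOBAL}",
        "entityType": "FINDING_STATE",
    }
    return [identity, observation, state]
```

Identity and state share the same partition (`FINDING#findingId`), while observations live under the run partition, so a single query on `RUN#scanRunId` returns every snapshot for that run.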

Read path

sequenceDiagram
    participant Caller
    participant Store as DynamoDbCompatibleFindingStore
    participant DB as DynamoDB-compatible table

    Caller->>Store: read_for_scan_run(scanRunId)
    Store->>DB: Query PK=RUN#scanRunId, SK begins_with OBS#
    DB-->>Store: FINDING_OBSERVATION rows
    Store->>DB: BatchGetItem PK=FINDING#findingId, SK=STATE#GLOBAL
    DB-->>Store: FINDING_STATE rows
    Store-->>Caller: Finding snapshot + lifecycle state overlay
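
The sequence above can be approximated with an in-memory table keyed by `(PK, SK)`. This is a sketch under assumptions: the `findingId` and `verdict` attribute names and the dict-based table stand in for the real store and are not taken from the PR's code.

```python
from typing import Any

Item = dict[str, Any]
Table = dict[tuple[str, str], Item]


def read_for_scan_run(table: Table, scan_run_id: str) -> list[Item]:
    """Return observation snapshots with lifecycle state overlaid (in-memory sketch)."""
    run_pk = f"RUN#{scan_run_id}"

    # Step 1: stand-in for Query PK=RUN#scanRunId, SK begins_with OBS#
    observations = [
        item
        for (pk, sk), item in sorted(table.items(), key=lambda kv: kv[0])
        if pk == run_pk and sk.startswith("OBS#")
    ]

    # Step 2: stand-in for BatchGetItem on the distinct STATE#GLOBAL keys
    finding_ids = {obs["findingId"] for obs in observations}
    states = {
        fid: table.get((f"FINDING#{fid}", "STATE#GLOBAL"), {})
        for fid in finding_ids
    }

    # Step 3: overlay state attributes on each snapshot, keeping the observation's keys
    merged: list[Item] = []
    for obs in observations:
        state = states[obs["findingId"]]
        overlay = {k: v for k, v in state.items() if k not in ("PK", "SK")}
        merged.append({**obs, **overlay})
    return merged
```

Fetching state once per distinct finding, rather than once per observation, is what removes the per-finding round trip from the read path.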

Manual triage protection

Scan writes only conditionally create FINDING_STATE.
If a state row that already carries manual triage exists, the scan write does not overwrite verdict / verifier / reason.

flowchart TD
    A["scan write"] --> B["FINDING identity upsert"]
    A --> C["FINDING_OBSERVATION write"]
    A --> D{"FINDING_STATE exists?"}
    D -- "no" --> E["default STATE#GLOBAL create"]
    D -- "yes" --> F["preserve manual triage"]
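
The create-only rule in the flowchart can be sketched against an in-memory table. The helper name `create_state_if_absent` and the attribute names are hypothetical. Against a real DynamoDB table the same guard is commonly expressed as a `PutItem` with the condition expression `attribute_not_exists(PK)`; whether this PR uses exactly that expression is not shown here.

```python
from typing import Any

Item = dict[str, Any]
Table = dict[tuple[str, str], Item]


def create_state_if_absent(table: Table, finding_id: str, default_state: Item) -> bool:
    """Create the STATE#GLOBAL row only when it is missing; return True if created."""
    key = (f"FINDING#{finding_id}", "STATE#GLOBAL")
    if key in table:
        # An existing row may hold manual triage, so the scan write leaves it alone.
        return False
    table[key] = {**default_state, "PK": key[0], "SK": key[1]}
    return True
```

A second scan of the same finding then becomes a no-op for state, while the identity upsert and observation write still proceed.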

Review feedback addressed

Changed the per-finding state query in read_for_scan_run() to a batch_get_item-based batch read in chunks of 100.
Applied without_none() to the items returned by finding_to_items() to prevent writing top-level None attributes.
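
Both review fixes are small enough to sketch. DynamoDB's BatchGetItem accepts at most 100 keys per request, which is where the chunk size comes from. The `chunked` helper is hypothetical, and this `without_none` body is an assumed implementation of the helper named in the PR, not a copy of it.

```python
from collections.abc import Iterator, Sequence
from typing import Any

BATCH_GET_LIMIT = 100  # BatchGetItem accepts at most 100 keys per request

Key = dict[str, str]


def chunked(keys: Sequence[Key], size: int = BATCH_GET_LIMIT) -> Iterator[Sequence[Key]]:
    """Yield consecutive slices of at most `size` keys (hypothetical helper)."""
    for start in range(0, len(keys), size):
        yield keys[start:start + size]


def without_none(item: dict[str, Any]) -> dict[str, Any]:
    """Drop top-level attributes whose value is None; nested values are left untouched."""
    return {name: value for name, value in item.items() if value is not None}
```

A real batch read would also need to retry any keys the service reports back as unprocessed, which this sketch leaves out.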

Explicit non-goals

This PR implements only the CORE-only boundary from the Issue #6 comment.

No FindingFingerprintMap added
No ScanRunQueryRows added
No PatternQueryRows added
No standalone Artifacts table/item added
No TTL / streams / Lambda / production DynamoDB behavior added

Conflict resolution

The scan target catalog helper from origin/main and this PR's Issue #6 occurrence/state helper landed at the same location in items.py and conflicted.
Both changes are needed, so they were merged as follows.

Kept scan_target_to_item() / scan_target_from_item()
Kept STATE_SCOPE_GLOBAL / occurrence_key_for_finding()
Kept the FINDING, FINDING_OBSERVATION, FINDING_STATE split in finding_to_items()

Verification

Re-verified after addressing the review feedback.

PYTHONDONTWRITEBYTECODE=1 uv run pytest -p no:cacheprovider tests/test_dynamodb_compatible_store.py tests/test_nosql_db_adapter.py tests/test_scan_target_storage.py → 43 passed
PYTHONDONTWRITEBYTECODE=1 uv run pytest -p no:cacheprovider → 350 passed
git diff --check origin/main...HEAD → passed
temporary Dynalite live CRUD smoke → passed
- table bootstrap
- write
- query
- scan
- batch state overlay
- conditional state protection
- cleanup
- manual_triage_protected=true

Closes #6

Co-Authored-By: Codex GPT-5 <noreply@openai.com>

gemini-code-assist

Code Review

This pull request refactors the NoSQL database schema for security scan results by splitting findings into three distinct entities: FINDING (identity), FINDING_OBSERVATION (run-scoped snapshot), and FINDING_STATE (lifecycle state). It introduces deterministic occurrence keys for observations and ensures that manual triage states are not overwritten during subsequent scans. Feedback on these changes suggests optimizing the read_for_scan_run method to avoid an N+1 query bottleneck by batching state retrieval with DynamoDB's batch_get_item API, and wrapping the returned items in without_none to prevent writing None values as NULL attributes.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.


Implement Issue #6 CORE NoSQL schema split

34980b0

Co-Authored-By: Codex GPT-5 <noreply@openai.com>

gemini-code-assist Bot reviewed Jun 12, 2026


Comment thread src/security_scanner/storage/adapters/nosql_db/store.py

Comment thread src/security_scanner/storage/adapters/nosql_db/items.py Outdated

Merge origin/main into Issue #6 schema branch

9cb60f3

Co-Authored-By: Codex GPT-5 <noreply@openai.com>

pureliture marked this pull request as ready for review June 12, 2026 01:32

Address PR review feedback for Issue #6 storage

aeb9e6c

Co-Authored-By: Codex GPT-5 <noreply@openai.com>

pureliture merged commit b57f135 into main Jun 12, 2026
2 checks passed

pureliture deleted the codex/issue-6-core-schema branch June 12, 2026 05:45

pureliture merged 3 commits into main from codex/issue-6-core-schema