[codex] Implement Issue #6 CORE NoSQL schema split #11
Co-Authored-By: Codex GPT-5 <noreply@openai.com>
Code Review
This pull request refactors the NoSQL database schema for security scan results by splitting findings into three distinct entities: FINDING (identity), FINDING_OBSERVATION (run-scoped snapshot), and FINDING_STATE (lifecycle state). It introduces deterministic occurrence keys for observations and ensures that manual triage states are not overwritten during subsequent scans. Feedback on these changes suggests optimizing the read_for_scan_run method to avoid an N+1 query bottleneck by batching state retrieval with DynamoDB's batch_get_item API, and wrapping the returned items in without_none to prevent writing None values as NULL attributes.
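The two suggestions above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `state_key`, `state_key_batches`, and `drop_top_level_none` are hypothetical names, and the key layout is taken from the PR description. The only external fact relied on is that DynamoDB's BatchGetItem accepts at most 100 keys per request and rejects duplicate keys within one request.

```python
from collections.abc import Iterator
from typing import Any

# BatchGetItem accepts at most 100 keys per request.
BATCH_GET_LIMIT = 100


def state_key(finding_id: str) -> dict[str, str]:
    """Build the FINDING_STATE key (layout assumed from the PR description)."""
    return {"PK": f"FINDING#{finding_id}", "SK": "STATE#GLOBAL"}


def state_key_batches(finding_ids: list[str]) -> Iterator[list[dict[str, str]]]:
    """Yield key lists sized for one BatchGetItem call each."""
    # Deduplicate while keeping order: one request must not repeat a key.
    unique_ids = list(dict.fromkeys(finding_ids))
    for start in range(0, len(unique_ids), BATCH_GET_LIMIT):
        chunk = unique_ids[start:start + BATCH_GET_LIMIT]
        yield [state_key(finding_id) for finding_id in chunk]


def drop_top_level_none(item: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for without_none(): drop top-level None values only."""
    return {name: value for name, value in item.items() if value is not None}
```

Each yielded list would be passed as the `Keys` of one batch read, so a run with N distinct findings needs roughly N/100 requests instead of N. A real caller would also need to retry any `UnprocessedKeys` returned by the service, which this sketch leaves out.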
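Summary
This PR restructures the DynamoDB-compatible storage layout from Issue #6, limited to the CORE scope.
The core change is splitting the existing run-scoped FINDING row into three roles:
- FINDING: stable finding identity
- FINDING_OBSERVATION: snapshot of a finding as observed in a scan run
- FINDING_STATE: lifecycle / triage state
The REPO_META and SCAN_RUN flows are unchanged, and read_for_scan_run() / read_all() still restore the runtime Finding with the same semantics.

Structure changes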
flowchart LR
    R["REPO_META<br/>PK=REPO#repoKey<br/>SK=META"]
    S["SCAN_RUN<br/>PK=REPO#repoKey<br/>SK=SCAN_RUN#scanAtIso#scanRunId"]
    I["FINDING<br/>PK=FINDING#findingId<br/>SK=META"]
    O["FINDING_OBSERVATION<br/>PK=RUN#scanRunId<br/>SK=OBS#findingId#occurrenceKey"]
    T["FINDING_STATE<br/>PK=FINDING#findingId<br/>SK=STATE#GLOBAL"]
    R --> S
    S --> O
    O --> I
    I --> T
    O -. "state overlay on the read path" .-> T

Read path
sequenceDiagram
    participant Caller
    participant Store as DynamoDbCompatibleFindingStore
    participant DB as DynamoDB-compatible table
    Caller->>Store: read_for_scan_run(scanRunId)
    Store->>DB: Query PK=RUN#scanRunId, SK begins_with OBS#
    DB-->>Store: FINDING_OBSERVATION rows
    Store->>DB: BatchGetItem PK=FINDING#findingId, SK=STATE#GLOBAL
    DB-->>Store: FINDING_STATE rows
    Store-->>Caller: Finding snapshot + lifecycle state overlay

Manual triage protection
A scan write only conditionally creates FINDING_STATE. If a state row that already carries manual triage exists, the scan write does not overwrite its verdict / verifier / reason.
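The create-only rule can be sketched with an in-memory stand-in for the table. This is an illustration under assumptions, not the PR's implementation: `FakeTable`, `default_state_item`, and `ensure_state` are hypothetical names, and the default verdict value is invented for the example. Against a real table the same effect is commonly achieved with a put that carries a condition expression such as `attribute_not_exists(PK)`.

```python
from typing import Any


class ConditionalCheckFailed(Exception):
    """Stand-in for the service's conditional-check failure."""


class FakeTable:
    """In-memory stand-in for a DynamoDB-compatible table (illustration only)."""

    def __init__(self) -> None:
        self.items: dict[tuple[str, str], dict[str, Any]] = {}

    def put_if_absent(self, item: dict[str, Any]) -> None:
        # Mirrors a put guarded by attribute_not_exists(PK): fail if the key exists.
        key = (item["PK"], item["SK"])
        if key in self.items:
            raise ConditionalCheckFailed(key)
        self.items[key] = dict(item)


def default_state_item(finding_id: str) -> dict[str, Any]:
    # Key layout follows the PR description; the "unreviewed" value is hypothetical.
    return {"PK": f"FINDING#{finding_id}", "SK": "STATE#GLOBAL", "verdict": "unreviewed"}


def ensure_state(table: FakeTable, finding_id: str) -> bool:
    """Create the default state row; return False if a row already exists."""
    try:
        table.put_if_absent(default_state_item(finding_id))
        return True
    except ConditionalCheckFailed:
        # An existing row, possibly manually triaged, is left untouched.
        return False
```

Because the existence check and the write happen in a single conditional operation, a scan that races with a manual triage update cannot clobber it, which a separate read followed by an unconditional write could not guarantee.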
flowchart TD
    A["scan write"] --> B["FINDING identity upsert"]
    A --> C["FINDING_OBSERVATION write"]
    A --> D{"FINDING_STATE exists?"}
    D -- "no" --> E["default STATE#GLOBAL create"]
    D -- "yes" --> F["preserve manual triage"]

Review feedback addressed
- Changed the per-finding state query in read_for_scan_run() to a batch_get_item-based batch read, in batches of 100.
- Applied without_none() to the items returned by finding_to_items() to prevent writing top-level None attributes.

Explicit non-goals
This PR implements only the CORE-only boundary described in the Issue #6 comment.
- No FindingFingerprintMap added
- No ScanRunQueryRows added
- No PatternQueryRows added
- No Artifacts table/item added

Conflict resolution
The scan target catalog helper from origin/main and this PR's Issue #6 occurrence/state helper landed at the same location in items.py and conflicted. Both changes are needed, so they were merged as follows:
- Kept scan_target_to_item() / scan_target_from_item()
- Kept STATE_SCOPE_GLOBAL / occurrence_key_for_finding()
- Kept the FINDING, FINDING_OBSERVATION, FINDING_STATE split in finding_to_items()

Verification
Re-verified after addressing the review feedback.
- PYTHONDONTWRITEBYTECODE=1 uv run pytest -p no:cacheprovider tests/test_dynamodb_compatible_store.py tests/test_nosql_db_adapter.py tests/test_scan_target_storage.py → 43 passed
- PYTHONDONTWRITEBYTECODE=1 uv run pytest -p no:cacheprovider → 350 passed
- git diff --check origin/main...HEAD → passed
- manual_triage_protected=true

Closes #6