Follow-up split from #12 (see PR #22).
Problem
All per-repo entities share gsi1pk = REPO#<repo> on GSI1: FINDING, FINDING_OBSERVATION, FINDING_STATE, STATE_EVENT. At 500+ repos with many findings/commits, a single large repo's partition becomes a write/read hot partition. Pre-existing TODO at src/security_scanner/storage/adapters/nosql_db/store.py:101 (TARGET_LIST) flags the same class of issue.
residual_for_repo (added in PR #22) also reads the whole REPO#<repo> partition (now narrowed to begins_with(gsi1sk,"RUN#"), but still per-partition).
Constraint
Sharding the partition key as REPO#<repo>#<shard> breaks cross-repo / cross-shard time-ordered queries. A fix must add an alternate time-axis GSI (or scatter-gather across shards) for the queries that currently rely on a single partition.
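To make the trade-off concrete, here is a minimal sketch of the scatter-gather option, not a proposed implementation. It assumes a fixed shard count per repo, a hash of a per-item id to pick the shard, and gsi1sk values that sort lexicographically in time order. NUM_SHARDS, shard_for, sharded_gsi1pk, all_shard_keys and scatter_gather are hypothetical names, not existing code in the store.

```python
from __future__ import annotations

import hashlib
import heapq
from typing import Iterable, Iterator

NUM_SHARDS = 8  # assumption: fixed shard count per repo


def shard_for(item_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an item id to a stable shard number via a content hash."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def sharded_gsi1pk(repo: str, item_id: str, num_shards: int = NUM_SHARDS) -> str:
    """Build the sharded GSI1 partition key REPO#<repo>#<shard>."""
    return f"REPO#{repo}#{shard_for(item_id, num_shards)}"


def all_shard_keys(repo: str, num_shards: int = NUM_SHARDS) -> list[str]:
    """List every shard partition key a per-repo read has to visit."""
    return [f"REPO#{repo}#{s}" for s in range(num_shards)]


def scatter_gather(per_shard_results: Iterable[Iterable[dict]]) -> Iterator[dict]:
    """Merge per-shard result streams into one stream ordered by gsi1sk.

    Each input stream must already be sorted by gsi1sk, which is what a
    single-partition query in sort-key order would return.
    """
    return heapq.merge(*per_shard_results, key=lambda item: item["gsi1sk"])
```

A per-repo read then issues one query per key from all_shard_keys and feeds the results to scatter_gather. The cost is NUM_SHARDS queries per read, which is why a dedicated time-axis GSI may be preferable for the hot read paths.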
Scope
- Shard key design for REPO# partition entities.
- Preserve: per-repo residual derivation, observation/state reads, REF_STATE listing.
- Migration/back-compat for existing single-partition rows.
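One way the back-compat item could work is a dual read during migration: query the legacy REPO#<repo> partition alongside the new shard partitions and drop duplicates. The sketch below is illustrative only. The query callable, the read_repo_items name, and the pk/sk attribute names used for de-duplication are all assumptions, not the store's actual API or schema.

```python
from typing import Callable, Iterable, List

# Hypothetical query hook: takes a GSI1 partition key, returns matching items.
Query = Callable[[str], Iterable[dict]]


def read_repo_items(query: Query, repo: str, num_shards: int) -> List[dict]:
    """Read a repo's items from the legacy partition and all shard partitions."""
    keys = [f"REPO#{repo}"] + [f"REPO#{repo}#{s}" for s in range(num_shards)]
    seen = set()
    items = []
    for key in keys:
        for item in query(key):
            # Assumed base-table key attributes; guards against a row that is
            # briefly visible under both the legacy and the sharded key.
            ident = (item["pk"], item["sk"])
            if ident in seen:
                continue
            seen.add(ident)
            items.append(item)
    return items
```

Once a backfill has rewritten every legacy row, the first key can be dropped from the list and the read becomes shard-only.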
Out of scope
Severity: not blocking current local/MVP scale (review judged "acceptable for current scale"); needed before cloud-scale rollout.