
Shard REPO# GSI1 partition for hot-partition safety at cloud scale #23

@pureliture

Follow-up split from #12 (see PR #22).

Problem

Every per-repo entity type on GSI1 (FINDING, FINDING_OBSERVATION, FINDING_STATE, STATE_EVENT) shares the same partition key, gsi1pk = REPO#<repo>. At 500+ repos with many findings/commits, the partition for a single large repo becomes a hot partition for both writes and reads. A pre-existing TODO at src/security_scanner/storage/adapters/nosql_db/store.py:101 (TARGET_LIST) flags the same class of issue.

residual_for_repo (added in PR #22) also reads from the single REPO#<repo> partition. The read is now narrowed to begins_with(gsi1sk,"RUN#"), but it is still served by one partition.
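
For reference, the narrowed read described above might look roughly like the following DynamoDB-style Query request. This is only a sketch and not the actual residual_for_repo code: it assumes the store speaks the DynamoDB Query API, and the table name `findings`, the index name string `GSI1`, and the helper `build_residual_query` are hypothetical names chosen for illustration.

```python
def build_residual_query(repo: str, table_name: str = "findings") -> dict:
    """Build keyword arguments for a low-level DynamoDB Query call.

    Hypothetical helper: the table and index names are assumptions.
    """
    return {
        "TableName": table_name,
        "IndexName": "GSI1",
        # One partition key value, so every call for this repo lands on
        # the same GSI1 partition regardless of the sort-key prefix.
        "KeyConditionExpression": "gsi1pk = :pk AND begins_with(gsi1sk, :sk)",
        "ExpressionAttributeValues": {
            ":pk": {"S": f"REPO#{repo}"},
            ":sk": {"S": "RUN#"},
        },
    }
```

The sort-key prefix reduces how many items are read, but the request is still keyed on a single gsi1pk value, which is why the hot-partition concern remains.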

Constraint

Sharding the key as REPO#<repo>#<shard> breaks the cross-repo / cross-shard time-ordered queries that currently rely on a single partition. A fix must either add an alternate time-axis GSI or scatter-gather across shards for those queries.
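
To make the scatter-gather option concrete, here is a minimal sketch of one possible approach, not a decided design. It assumes a fixed shard count, a hash of the item id as the shard selector, and that gsi1sk sorts in the desired time order within each shard. `NUM_SHARDS`, `shard_for`, `sharded_pk`, `scatter_gather`, and the `query_partition` callback are all hypothetical names.

```python
import hashlib
import heapq
from typing import Callable, Iterable, Iterator

NUM_SHARDS = 8  # assumption: fixed shard count per repo


def shard_for(item_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an item id to a shard deterministically via a stable hash."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def sharded_pk(repo: str, item_id: str, num_shards: int = NUM_SHARDS) -> str:
    """Build the sharded GSI1 partition key REPO#<repo>#<shard>."""
    return f"REPO#{repo}#{shard_for(item_id, num_shards)}"


def scatter_gather(
    repo: str,
    query_partition: Callable[[str], Iterable[dict]],
    num_shards: int = NUM_SHARDS,
) -> Iterator[dict]:
    """Query every shard and merge the results into one gsi1sk-ordered stream.

    query_partition(pk) must yield items already sorted by gsi1sk,
    as a sort-key-ordered index query would.
    """
    streams = (query_partition(f"REPO#{repo}#{s}") for s in range(num_shards))
    return heapq.merge(*streams, key=lambda item: item["gsi1sk"])
```

The trade-off is that each logical read becomes `num_shards` queries plus a merge, which is the cost the alternate time-axis GSI option would avoid.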

Scope

  • Shard key design for REPO# partition entities.
  • Preserve: per-repo residual derivation, observation/state reads, REF_STATE listing.
  • Migration/back-compat for existing single-partition rows.
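
For the migration/back-compat item, one possible read-side sketch is to have readers consult the legacy unsharded key alongside the sharded keys until existing rows are rewritten. This is an assumption about how a migration could work, not a proposal from the issue; `read_partition_keys` and its parameters are hypothetical.

```python
from typing import List


def read_partition_keys(
    repo: str, num_shards: int, include_legacy: bool = True
) -> List[str]:
    """List the GSI1 partition keys a reader should query for one repo.

    While legacy rows still exist, the unsharded REPO#<repo> key is
    queried in addition to every REPO#<repo>#<shard> key.
    """
    keys = [f"REPO#{repo}#{shard}" for shard in range(num_shards)]
    if include_legacy:
        keys.insert(0, f"REPO#{repo}")
    return keys
```

Once a backfill has moved all rows to sharded keys, callers could pass `include_legacy=False` to drop the extra query.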

Out of scope

Severity: not blocking current local/MVP scale (review judged "acceptable for current scale"); needed before cloud-scale rollout.
