perf: batch finding-state lookups in scan-all suppression gate #44

@pureliture

Summary

PR #43 (#6 line-stable suppression) added a per-finding suppression gate in _run_verifier_disposition_writes (src/security_scanner/runtime/scan_all.py). For each finding it calls resolve_existing_disposition, which does a synchronous read_finding_state (one get_item) and, on a miss, a find_disposition_by_match_key (a second get_item). With N findings that is between N and 2N sequential round-trips, i.e. an N+1-style access pattern.

Why it was deferred (not fixed in #43)

The gate guards an Ollama LLM verification call that costs seconds per finding; one extra get_item (~ms) is negligible, and on a hit the gate saves the LLM call entirely. So at current scale the N+1 is not a bottleneck.

Proposed optimization (future)

  • Pre-collect all finding_ids and batch-read FINDING_STATE via the existing _batch_read_finding_states (or a sibling) before the loop, matching in memory.
  • The match_key pointers use distinct PKs (MATCHKEY#<mk>), so they'd need a separate BatchGetItem over the computed keys — design how to fold both reads cleanly.
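
One way the two reads could be folded, as a rough sketch rather than a proposed implementation: batch-read all states first, then batch-read match_key pointers only for the findings that missed, which keeps the current "state first, pointer on miss" order. Everything named here is hypothetical (`prefetch`, `resolve_dispositions`, and the `read_states` / `read_match_pointers` callables standing in for `_batch_read_finding_states` and its sibling); the finding dict shape is also assumed. The chunk size of 100 is the DynamoDB BatchGetItem per-request key limit.

```python
from typing import Callable, Dict, Iterator, List, Optional

BATCH_LIMIT = 100  # DynamoDB BatchGetItem accepts at most 100 keys per request

Item = Dict[str, str]
# Hypothetical reader: takes a chunk of keys, returns {key: item} for keys found.
BatchReader = Callable[[List[str]], Dict[str, Item]]


def chunked(keys: List[str], size: int = BATCH_LIMIT) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` keys."""
    for start in range(0, len(keys), size):
        yield keys[start:start + size]


def prefetch(keys: List[str], reader: BatchReader) -> Dict[str, Item]:
    """Read all keys in O(batches) round-trips; missing keys are simply absent."""
    unique = list(dict.fromkeys(keys))  # de-duplicate, keep order
    found: Dict[str, Item] = {}
    for chunk in chunked(unique):
        found.update(reader(chunk))
    return found


def resolve_dispositions(
    findings: List[Dict[str, str]],
    read_states: BatchReader,
    read_match_pointers: BatchReader,
) -> Dict[str, Optional[Item]]:
    """Map finding_id -> existing disposition item, or None if nothing matched."""
    states = prefetch([f["finding_id"] for f in findings], read_states)
    # Only findings without a state fall through to the match_key lookup.
    missed = [f for f in findings if f["finding_id"] not in states]
    pointers = prefetch([f["match_key"] for f in missed], read_match_pointers)

    resolved: Dict[str, Optional[Item]] = {}
    for f in findings:
        fid = f["finding_id"]
        if fid in states:
            resolved[fid] = states[fid]
        else:
            resolved[fid] = pointers.get(f["match_key"])
    return resolved
```

The per-finding loop would then consult `resolved` in memory instead of issuing get_item calls. A real reader would also need to retry any UnprocessedKeys that BatchGetItem returns, which this sketch leaves out.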

Acceptance

  • Suppression gate performs at most O(batches) round-trips instead of O(findings) for the state lookups.
  • Behavior (over-suppression guard, line-move inheritance, fail-safe on error) unchanged; existing suppression tests still pass.
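
The round-trip criterion could be checked in a test with a counting fake, sketched below. `CountingReader` and `round_trip_budget` are hypothetical helpers, not existing code in the repo; the budget assumes at most two lookups per finding (state, then match_key pointer) and batches of 100 keys.

```python
import math
from typing import Dict, List


class CountingReader:
    """In-memory stand-in for a batch read that records how often it is called."""

    def __init__(self, items: Dict[str, dict]) -> None:
        self.items = items
        self.calls = 0

    def __call__(self, keys: List[str]) -> Dict[str, dict]:
        self.calls += 1
        return {k: self.items[k] for k in keys if k in self.items}


def round_trip_budget(n_findings: int, batch_size: int = 100,
                      lookups_per_finding: int = 2) -> int:
    """Upper bound on round-trips when every lookup kind is fully batched."""
    return lookups_per_finding * math.ceil(n_findings / batch_size)
```

For 250 findings the budget is 6 round-trips, compared with up to 500 get_item calls in the per-finding gate. A test can pass a `CountingReader` wherever the gate reads state and assert `reader.calls <= round_trip_budget(len(findings))`.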

Origin: PR #43 review thread (gemini-code-assist).
