
feat(incremental): branch-aware residual + scan-worker daemon (#12) #22

Merged
pureliture merged 3 commits into main from claude/issue-12-branch-residual on Jun 16, 2026

Conversation

@pureliture

Contributor

Summary

Closes the remaining issue #12 gaps on top of the PR #15 queue/worker MVP. Branch becomes a first-class occurrence dimension and per-branch residual is derived (status/disposition stays global).

What

  • branch populate: scan_worker tags Finding.repo.branch from ScanJob.ref_name; local_scan reads git HEAD context (_git_head_context); ScanRunSummary.branch recorded.
  • observation projection: FINDING_OBSERVATION items carry branch/commit top-level (queryable), not only in findingSnapshot.
  • per-branch residual: residual_by_branch / residual_for_repo derived within the REPO#<repo> partition (no new GSI). Matched on commit == last_seen_sha, so a commit at multiple ref tips is residual on every such branch.
  • scan-worker --daemon: polling loop with --poll-interval, idle-sleep/drain-fast, SIGINT/SIGTERM graceful shutdown. --once preserved.
  • rule_pack invalidation: changing rule_pack_version yields a new ledger key → rescan (verified by test).
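The --daemon behaviour described above (poll, drain fast while jobs exist, sleep when idle, stop on a shutdown request) could look roughly like the sketch below. This is an illustration under stated assumptions, not the code in this PR: `run_daemon` and `process_next` are hypothetical names, and a `threading.Event` stands in for whatever the SIGINT/SIGTERM handlers actually flip.

```python
import threading
from typing import Callable


def run_daemon(
    process_next: Callable[[], bool],
    stop: threading.Event,
    poll_interval: float = 5.0,
) -> int:
    """Poll until `stop` is set; return the number of jobs processed.

    `process_next` is a hypothetical callable that claims and runs one
    queued job. It returns True if a job was processed and False if the
    queue was empty.
    """
    processed = 0
    while not stop.is_set():
        if process_next():
            # Drain-fast: a job was found, so poll again immediately.
            processed += 1
            continue
        # Idle-sleep: the queue was empty. Event.wait returns early when
        # a shutdown is requested, so the loop exits promptly.
        stop.wait(poll_interval)
    return processed


if __name__ == "__main__":
    stop = threading.Event()
    jobs = ["job-1", "job-2", "job-3"]

    def process_next() -> bool:
        if jobs:
            jobs.pop(0)
            return True
        stop.set()  # demo only: request shutdown once the queue is drained
        return False

    print(run_daemon(process_next, stop, poll_interval=0.01))
```

Waiting on the event rather than calling `time.sleep` means a shutdown request interrupts the idle period instead of waiting out the full poll interval.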

Design (locked via grill-to-spec)

  • L1 status/disposition GLOBAL (STATE#GLOBAL); branch never enters finding identity (fingerprint).
  • L2 residual derived in REPO#<repo> partition; no new GSI.
  • L3 evaluate_gate unchanged/global; residual is report/observation visibility only.
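To make the L1 rule concrete, here is a minimal sketch of a fingerprint that ignores branch and commit. The field names (`rule_id`, `path`, `snippet`) and the SHA-256 hashing are assumptions for illustration; the real fingerprint inputs are not shown in this PR.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    # Hypothetical identity fields.
    rule_id: str
    path: str
    snippet: str
    # Occurrence dimensions only: recorded, but never hashed.
    branch: str
    commit: str


def fingerprint(occ: Occurrence) -> str:
    """Hash identity fields only; branch and commit are deliberately excluded."""
    identity = "\x1f".join((occ.rule_id, occ.path, occ.snippet))
    return hashlib.sha256(identity.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    on_main = Occurrence("secret-key", "app/config.py", "KEY = '...'", "main", "abc123")
    on_dev = Occurrence("secret-key", "app/config.py", "KEY = '...'", "dev", "def456")
    # Same finding seen on two branches maps to one identity.
    print(fingerprint(on_main) == fingerprint(on_dev))
```

Because the identity is branch-independent, a single global status/disposition record can cover the finding wherever it is observed.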

Issue #12 criteria

Closes the three criteria that were still open: 2 (worker polling daemon), 3 (rule_pack invalidation), and 6 (per-branch residual end-to-end). Criteria 1, 4, 5, and 7 were already met on main.
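For criterion 3, the invalidation follows from how the ledger key is composed: if rule_pack_version is part of the key, a new version simply misses the ledger and triggers a rescan. The sketch below assumes a hypothetical `LEDGER#...` key layout and an in-memory set in place of the real store.

```python
from __future__ import annotations


def ledger_key(repo: str, commit: str, rule_pack_version: str) -> str:
    """Build a commit-ledger key. The layout is hypothetical."""
    return f"LEDGER#{repo}#{commit}#{rule_pack_version}"


def needs_scan(ledger: set[str], repo: str, commit: str, rule_pack_version: str) -> bool:
    """A commit needs scanning when its key is absent from the ledger."""
    return ledger_key(repo, commit, rule_pack_version) not in ledger


if __name__ == "__main__":
    ledger = {ledger_key("org/app", "abc123", "v1")}
    print(needs_scan(ledger, "org/app", "abc123", "v1"))  # already scanned
    print(needs_scan(ledger, "org/app", "abc123", "v2"))  # new rule pack, rescan
```

No explicit invalidation step is needed with this design; old entries stay in place and are just never matched again.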

Deferred (stated)

Tests

505 → 532 (+27), incl. multi-agent-review fixes: shared-commit residual, daemon dead-letter exit-2, run_local_scan e2e git context. ruff clean on new code.

Refs #12

🤖 Generated with Claude Code

Close issue #12 remaining gaps on top of the PR #15 queue/worker MVP:
- branch as occurrence: scan_worker tags Finding.repo.branch from
  ScanJob.ref_name; local_scan reads git HEAD context; ScanRunSummary.branch
  is recorded
- FINDING_OBSERVATION items project branch/commit top-level (queryable)
- per-branch residual derived within the REPO#<repo> partition (no new GSI):
  residual_by_branch / residual_for_repo, matched on commit==last_seen_sha so a
  commit at multiple ref tips is residual on every such branch
- scan-worker --daemon polling mode with SIGINT/SIGTERM graceful shutdown
- rule_pack_version change invalidates the commit ledger -> rescan (verified)

Design: status/disposition stays global (STATE#GLOBAL); branch never enters
finding identity (fingerprint); evaluate_gate is unchanged (residual is
report/observation visibility only).

Tests: 505 -> 532 (+27), incl. multi-agent-review fixes (shared-commit
residual, daemon dead-letter exit-2, local_scan e2e git context).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread src/security_scanner/runtime/branch_residual.py Fixed
Comment thread src/security_scanner/runtime/branch_residual.py Fixed

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces per-branch residual computation for incremental scanning and adds a polling daemon mode (--daemon) to the scan-worker CLI command. It updates both local scans and the scan worker to tag findings with branch and commit context, and extends the DynamoDB store to persist and query these occurrence dimensions. Feedback on the changes highlights three key improvement opportunities: optimizing the residual_by_branch calculation from O(N * M) to O(N + M) using a hash map, avoiding redundant serialization in finding_with_context when no updates are needed, and guarding signal.signal registration in _install_signal_shutdown to prevent runtime errors if executed from a background thread.
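The hash-map suggestion for residual_by_branch can be sketched as follows. The input shapes are assumptions (a branch to last_seen_sha mapping, and observation items with `commit` and `fingerprint` keys); the actual function in branch_residual.py reads from the store and may differ.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def residual_by_branch(
    ref_states: Mapping[str, str],
    observations: Iterable[Mapping[str, str]],
) -> Dict[str, List[str]]:
    """Map each branch to the fingerprints observed at its tip commit.

    ref_states: branch name -> last_seen_sha (hypothetical shape).
    observations: items with "commit" and "fingerprint" keys (hypothetical).
    """
    # One pass over the M observations builds the commit index.
    by_commit: Dict[str, List[str]] = defaultdict(list)
    for obs in observations:
        by_commit[obs["commit"]].append(obs["fingerprint"])

    # One pass over the N branches looks up each tip commit in the index.
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return {
        branch: list(dict.fromkeys(by_commit.get(sha, [])))
        for branch, sha in ref_states.items()
    }


if __name__ == "__main__":
    refs = {"main": "abc", "release": "abc", "dev": "def"}
    seen = [
        {"commit": "abc", "fingerprint": "f1"},
        {"commit": "def", "fingerprint": "f2"},
        {"commit": "old", "fingerprint": "f3"},
    ]
    print(residual_by_branch(refs, seen))
```

In the usage example, `main` and `release` share tip commit `abc`, so `f1` is residual on both, which matches the shared-commit behaviour described in the PR summary. Observations on commits that are no longer a tip (`old`) do not appear.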


Comment thread src/security_scanner/runtime/branch_residual.py Outdated
Comment thread src/security_scanner/runtime/branch_residual.py
Comment thread src/security_scanner/cli/app.py
- residual_by_branch: index observations by commit (O(N+M) instead of O(N*M))
- finding_with_context: short-circuit when commit and branch are both None
- _install_signal_shutdown: skip signal.signal off the main thread (avoids
  ValueError when not on main thread)
- _ResidualStore Protocol: docstring stubs instead of `...` (CodeQL: statement
  has no effect)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@pureliture

Contributor Author

Addressed review feedback in dd39271:

  • O(N×M) → indexed observations by commit (O(N+M)) in residual_by_branch
  • finding_with_context: short-circuit when commit and branch are both None
  • _install_signal_shutdown: skip signal.signal off the main thread
  • _ResidualStore Protocol: docstring stubs instead of ... (CodeQL statement-has-no-effect)
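For readers unfamiliar with the signal.signal restriction, a minimal sketch of the main-thread guard is below. `install_signal_shutdown` here is a hypothetical stand-in for the private helper in cli/app.py, which is not reproduced in this thread; the only library facts relied on are that `signal.signal` raises ValueError when called outside the main thread of the main interpreter, and that `threading.main_thread()` identifies that thread.

```python
import signal
import threading


def install_signal_shutdown(stop: threading.Event) -> bool:
    """Register SIGINT/SIGTERM handlers that set `stop`.

    Returns False without registering anything when called off the main
    thread, where signal.signal would raise ValueError.
    """
    if threading.current_thread() is not threading.main_thread():
        return False

    def _handler(signum, frame):
        # Only flip the flag; the polling loop decides when to exit.
        stop.set()

    signal.signal(signal.SIGINT, _handler)
    signal.signal(signal.SIGTERM, _handler)
    return True
```

Returning a boolean lets a caller running in a worker thread (for example under a test harness) know that it has to arrange shutdown some other way.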

Full suite 532 passed; ruff clean on changed code.

Surfaces per-branch residual findings (residual_for_repo) via a read-only
dynamodb-backed CLI, closing issue #12 criterion 6 visibility. The report
generator operates on a single scan run and lacks REF_STATE/cross-ref context,
so residual gets its own command (mirrors queue-status).

GSI sharding for hot-partition at cloud scale is split to #23.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@pureliture

Contributor Author

Added in 91faa5b: residual CLI subcommand (per-branch residual visibility, dynamodb-only, mirrors queue-status) — closes the previously-deferred report-UI item for #12 criterion 6. GSI sharding (hot-partition) split to #23. Suite 534 passed.

@pureliture pureliture merged commit fa1c29d into main Jun 16, 2026
8 checks passed
@pureliture pureliture deleted the claude/issue-12-branch-residual branch June 16, 2026 13:45

2 participants