feat(incremental): branch-aware residual + scan-worker daemon (#12) #22
Close issue #12 remaining gaps on top of the PR #15 queue/worker MVP:

- branch as occurrence: scan_worker tags Finding.repo.branch from ScanJob.ref_name; local_scan reads git HEAD context; ScanRunSummary.branch is recorded
- FINDING_OBSERVATION items project branch/commit top-level (queryable)
- per-branch residual derived within the REPO#<repo> partition (no new GSI): residual_by_branch / residual_for_repo, matched on commit==last_seen_sha so a commit at multiple ref tips is residual on every such branch
- scan-worker --daemon polling mode with SIGINT/SIGTERM graceful shutdown
- rule_pack_version change invalidates the commit ledger -> rescan (verified)

Design: status/disposition stays global (STATE#GLOBAL); branch never enters finding identity (fingerprint); evaluate_gate is unchanged (residual is report/observation visibility only).

Tests: 505 -> 532 (+27), incl. multi-agent-review fixes (shared-commit residual, daemon dead-letter exit-2, local_scan e2e git context).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
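For readers unfamiliar with the polling mode, a minimal sketch of a daemon loop with graceful shutdown follows. It is an illustration under assumptions, not the code in this PR: `run_daemon`, `process_next`, and the `poll_interval` default are hypothetical names and values, and the stop signal is modeled as a `threading.Event` that a SIGINT/SIGTERM handler would set.

```python
import threading
from typing import Callable


def run_daemon(
    process_next: Callable[[], bool],
    stop: threading.Event,
    poll_interval: float = 5.0,  # hypothetical default, not taken from the PR
) -> int:
    """Poll for jobs until ``stop`` is set; return the number of jobs processed.

    ``process_next`` is assumed to return True when it handled a job and
    False when the queue was empty.
    """
    processed = 0
    while not stop.is_set():
        if process_next():
            # Work was available: loop again immediately to drain the queue.
            processed += 1
        else:
            # Queue is idle: sleep, but wake up early if shutdown is requested.
            stop.wait(poll_interval)
    return processed
```

Waiting on the event instead of calling `time.sleep` means a shutdown request interrupts the idle sleep rather than waiting out the full interval.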
Code Review
This pull request introduces per-branch residual computation for incremental scanning and adds a polling daemon mode (--daemon) to the scan-worker CLI command. It updates both local scans and the scan worker to tag findings with branch and commit context, and extends the DynamoDB store to persist and query these occurrence dimensions. Feedback on the changes highlights three key improvement opportunities: optimizing the residual_by_branch calculation from O(N * M) to O(N + M) using a hash map, avoiding redundant serialization in finding_with_context when no updates are needed, and guarding signal.signal registration in _install_signal_shutdown to prevent runtime errors if executed from a background thread.
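The third point can be illustrated with a short sketch. Python only allows `signal.signal` to be called from the main thread of the main interpreter and raises `ValueError` otherwise, so a guard of roughly this shape avoids the error. The function name and return value here are hypothetical; the actual `_install_signal_shutdown` in the PR may differ.

```python
import signal
import threading


def install_signal_shutdown(stop: threading.Event) -> bool:
    """Register SIGINT/SIGTERM handlers that set ``stop``.

    Returns False and registers nothing when called off the main thread,
    where signal.signal would raise ValueError.
    """
    if threading.current_thread() is not threading.main_thread():
        return False

    def _request_stop(signum, frame):
        # Only flag the shutdown; the polling loop decides when to exit.
        stop.set()

    signal.signal(signal.SIGINT, _request_stop)
    signal.signal(signal.SIGTERM, _request_stop)
    return True
```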
- residual_by_branch: index observations by commit (O(N+M) instead of O(N*M))
- finding_with_context: short-circuit when commit and branch are both None
- _install_signal_shutdown: skip signal.signal off the main thread (avoids ValueError when not on main thread)
- _ResidualStore Protocol: docstring stubs instead of `...` (CodeQL: statement has no effect)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
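The first fix, indexing observations by commit, might look roughly like the sketch below. The data shapes are assumptions made for illustration: ref tips as a `{branch: last_seen_sha}` mapping and observations as `(commit, fingerprint)` pairs. The function name is hypothetical, and the real `residual_by_branch` may take different arguments and apply further filtering.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def residual_by_branch_sketch(
    ref_tips: Dict[str, str],
    observations: Iterable[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Map each branch to the fingerprints observed at its tip commit."""
    # One pass over the N observations builds a commit -> fingerprints index.
    by_commit: Dict[str, List[str]] = defaultdict(list)
    for commit, fingerprint in observations:
        by_commit[commit].append(fingerprint)

    # One pass over the M branches does a constant-time lookup per branch,
    # so the total is O(N + M) rather than O(N * M).
    return {
        branch: list(by_commit.get(sha, ()))
        for branch, sha in ref_tips.items()
    }
```

Because the lookup is keyed on the commit, two branches whose tips point at the same commit both receive that commit's findings, which matches the shared-commit behaviour described in the PR.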
Addressed review feedback in dd39271:
Full suite 532 passed; ruff clean on changed code.
Surfaces per-branch residual findings (residual_for_repo) via a read-only dynamodb-backed CLI, closing issue #12 criterion 6 visibility. The report generator operates on a single scan run and lacks REF_STATE/cross-ref context, so residual gets its own command (mirrors queue-status). GSI sharding for hot-partition at cloud scale is split to #23.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Closes the remaining issue #12 gaps on top of the PR #15 queue/worker MVP. Branch becomes a first-class occurrence dimension and per-branch residual is derived (status/disposition stays global).
What
- `scan_worker` tags `Finding.repo.branch` from `ScanJob.ref_name`; `local_scan` reads git HEAD context (`_git_head_context`); `ScanRunSummary.branch` is recorded.
- `FINDING_OBSERVATION` items carry `branch`/`commit` top-level (queryable), not only in `findingSnapshot`.
- `residual_by_branch` / `residual_for_repo` are derived within the `REPO#<repo>` partition (no new GSI). Matched on `commit == last_seen_sha`, so a commit at multiple ref tips is residual on every such branch.
- `--daemon`: polling loop with `--poll-interval`, idle-sleep/drain-fast, SIGINT/SIGTERM graceful shutdown. `--once` preserved.
- A changed `rule_pack_version` yields a new ledger key → rescan (verified by test).
Design (locked via grill-to-spec)
- Status/disposition stays global (`STATE#GLOBAL`); branch never enters finding identity (fingerprint).
- Per-branch residual is derived within the `REPO#<repo>` partition; no new GSI.
- `evaluate_gate` unchanged/global; residual is report/observation visibility only.
Issue #12 criteria
Closes the previously open criteria 2 (worker polling daemon), 3 (rule_pack invalidation), and 6 (per-branch residual end-to-end). Criteria 1/4/5/7 were already met on main.
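Criterion 3 relies on the rule pack version being part of the commit ledger key, so that a version bump makes previously scanned commits look unscanned. A minimal sketch of that idea follows. The key layout, the in-memory `set` ledger, and both function names are hypothetical; the real ledger schema is not shown in this PR.

```python
from typing import Set


def ledger_key(repo: str, commit: str, rule_pack_version: str) -> str:
    # Hypothetical key layout, chosen only for illustration.
    return f"LEDGER#{repo}#{commit}#{rule_pack_version}"


def needs_scan(ledger: Set[str], repo: str, commit: str, rule_pack_version: str) -> bool:
    """A commit needs a scan unless this exact (repo, commit, version) was recorded."""
    return ledger_key(repo, commit, rule_pack_version) not in ledger


# Usage: a commit scanned under one rule pack is rescanned under the next.
ledger = {ledger_key("example-repo", "abc123", "v1")}
already_scanned = not needs_scan(ledger, "example-repo", "abc123", "v1")
rescan_after_bump = needs_scan(ledger, "example-repo", "abc123", "v2")
```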
Deferred (stated)
residual_for_repo).
Tests
505 → 532 (+27), incl. multi-agent-review fixes: shared-commit residual, daemon dead-letter exit-2, `run_local_scan` e2e git context. `ruff` clean on new code.
Refs #12
🤖 Generated with Claude Code