fix(dataflow): chunk SQLite IN queries in collectCallerStitchCandidates to avoid SQLITE_MAX_VARIABLE_NUMBER limit #1613

@carlos-alm

Description

Problem

collectCallerStitchCandidates in src/features/dataflow.ts builds a single WHERE target_id IN (?, ?, ...) query over all changed function IDs, binding one variable per ID. SQLite caps the number of bound variables per statement at SQLITE_MAX_VARIABLE_NUMBER (default 999 on older builds, 32766 on SQLite ≥ 3.32). On a build with the 999 limit, a repo with > 999 changed functions in a single rebuild makes the query throw SQLITE_ERROR: too many SQL variables; on newer builds the same failure occurs above 32766.

Fix

Chunk changedFuncIds into batches of 500 and accumulate caller file results into a Set to deduplicate.
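
A minimal sketch of the chunking approach, not the actual patch. The queryCallerFiles callback, the edges table, and the file column are hypothetical stand-ins for whatever the real function uses to run its prepared statement; only changedFuncIds, target_id, and the batch size of 500 come from this issue.

```typescript
// Batch size proposed in this issue; comfortably below the 999 limit.
const CHUNK_SIZE = 500;

// Split a list into consecutive slices of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (size <= 0) throw new Error("size must be positive");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// `queryCallerFiles` is a hypothetical callback that executes the SQL with
// the given bound parameters and returns the caller file paths.
function collectCallerFilesChunked(
  changedFuncIds: readonly number[],
  queryCallerFiles: (sql: string, params: readonly number[]) => string[],
): Set<string> {
  const callerFiles = new Set<string>(); // Set deduplicates across batches
  for (const batch of chunk(changedFuncIds, CHUNK_SIZE)) {
    // One placeholder per ID in this batch, so never more than CHUNK_SIZE.
    const placeholders = batch.map(() => "?").join(", ");
    // Table and column names here are assumptions for illustration.
    const sql = `SELECT DISTINCT file FROM edges WHERE target_id IN (${placeholders})`;
    for (const file of queryCallerFiles(sql, batch)) {
      callerFiles.add(file);
    }
  }
  return callerFiles;
}
```

With 1200 changed IDs this issues three queries (500, 500, and 200 variables). The Set is needed because the same caller file can appear in more than one batch.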

Also

Skip P4 (collectCallerStitchCandidates) entirely on full builds when fileSymbols covers all files in the DB: every file is being rebuilt, so there can be no unchanged callers and the pass is a no-op. This saves one SELECT per file on initial builds of large repos.
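
One way the full-build check could look, as a sketch only. It assumes fileSymbols is a Map keyed by file path and that the set of file paths in the DB is available as dbFilePaths; both the shape of fileSymbols and the dbFilePaths name are assumptions, not taken from the codebase.

```typescript
// Returns true when every file known to the DB is part of the current build,
// in which case no unchanged caller can exist and P4 can be skipped.
// `dbFilePaths` is a hypothetical input; the Map shape is assumed.
function coversAllDbFiles(
  fileSymbols: ReadonlyMap<string, unknown>,
  dbFilePaths: Iterable<string>,
): boolean {
  for (const path of dbFilePaths) {
    if (!fileSymbols.has(path)) return false; // an unchanged file remains
  }
  return true;
}
```

The caller would run collectCallerStitchCandidates only when this returns false.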

Context

Found while implementing P5 B2-B5 dataflow rules (PR #1608 branch feat/dataflow-vertex-schema-p0).
