index_repository ignores .git/info/exclude (only .gitignore honored) → indexes excluded worktrees → OOM #489

@madebymlai

Summary

index_repository honors .gitignore but silently ignores .git/info/exclude (and, as far as I can tell, the global excludes file configured via core.excludesFile too). Any directory excluded only via .git/info/exclude is walked and indexed.

This matters a lot for agent/sandbox workflows: tools like sandcastle put their worktrees under a dir that's excluded via .git/info/exclude (not a committed .gitignore). The indexer descends into those nested full-repo checkouts and the file count/graph explodes, leading to OOM during indexing.

Real-world trigger in my repo: 493 git-tracked files, but a plain find walks 661,430 files / 170 GB because .sandcastle/worktrees/* (excluded only in .git/info/exclude) contains 3 nested full-repo worktrees, each ~57 GB with their own nested .sandcastle/ + run-data. The indexer follows git's ignore rules as far as .gitignore goes but never consults .git/info/exclude, so it tries to ingest all of it.
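To check whether a repo has this kind of gap, it is enough to compare git's tracked-file count with what a naive directory walk sees. A minimal sketch; count_tracked and count_walked are hypothetical helper names, and the throwaway repo only stands in for the real layout:

```shell
# Hypothetical helpers: git's view of a repo vs. a naive directory walk.
count_tracked() { git -C "$1" ls-files | wc -l | tr -d ' '; }
count_walked()  { find "$1" -name .git -prune -o -type f -print | wc -l | tr -d ' '; }

# Throwaway repo: one tracked file, two files hidden only via info/exclude.
demo=$(mktemp -d)
git init -q "$demo"
mkdir -p "$demo/.git/info" "$demo/src" "$demo/sandbox"
echo 'x = 1' > "$demo/src/a.py"
echo 'y = 2' > "$demo/sandbox/b.py"
echo 'z = 3' > "$demo/sandbox/c.py"
echo 'sandbox/' >> "$demo/.git/info/exclude"
git -C "$demo" add -A    # skips sandbox/ because git treats it as ignored

tracked=$(count_tracked "$demo")
walked=$(count_walked "$demo")
echo "tracked=$tracked walked=$walked"
```

A large gap between the two numbers suggests the walk is descending into paths that git ignores.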

Reproduction (minimal, deterministic)

T=$(mktemp -d)
cd "$T"
git init -q
mkdir gi_excluded ie_excluded included
echo 'def fn_gitignore_excluded(): return 1'  > gi_excluded/a.py
echo 'def fn_infoexclude_excluded(): return 2' > ie_excluded/b.py
echo 'def fn_plainly_included(): return 3'     > included/c.py
echo 'gi_excluded/' > .gitignore
echo 'ie_excluded/' >> .git/info/exclude     # git itself excludes this dir
git add -A && git commit -qm init

# git agrees BOTH dirs are excluded:
git check-ignore gi_excluded/a.py ie_excluded/b.py
#   gi_excluded/a.py
#   ie_excluded/b.py

codebase-memory-mcp cli index_repository "{\"repo_path\":\"$T\"}"
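To see which file each ignore rule comes from, git check-ignore accepts -v, which prints the source file, line number and pattern in front of each path. A small sketch on a throwaway repo that mirrors the layout above (the temp directory is illustrative):

```shell
# Throwaway repo with one rule in .gitignore and one in .git/info/exclude.
demo=$(mktemp -d)
git init -q "$demo"
mkdir -p "$demo/.git/info" "$demo/gi_excluded" "$demo/ie_excluded"
touch "$demo/gi_excluded/a.py" "$demo/ie_excluded/b.py"
echo 'gi_excluded/' > "$demo/.gitignore"
echo 'ie_excluded/' >> "$demo/.git/info/exclude"

# -v prints <source>:<line>:<pattern><TAB><path> for every ignored path.
sources=$(git -C "$demo" check-ignore -v gi_excluded/a.py ie_excluded/b.py)
printf '%s\n' "$sources"
```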

Actual behavior

Index output:

"excluded":{"dirs":["gi_excluded",".git"],"count":2,"truncated":false}

ie_excluded/ is not in the excluded set, and it gets indexed (pass.complexity functions=2). Confirming via search_graph:

dir            excluded via         in graph?
gi_excluded/   .gitignore           ❌ correctly skipped
ie_excluded/   .git/info/exclude    present (bug)
included/      (nothing)            ✅ present

Expected behavior

.git/info/exclude (and ideally core.excludesFile / the global excludes file) should be honored the same way .gitignore is, since git itself treats those paths as ignored. At minimum, .git/info/exclude should be honored: it's the canonical place to exclude paths without committing a .gitignore entry, which is exactly what agent/sandbox tooling relies on.
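For reference, the repo-level and global ignore files that git consults can be resolved with plain git commands. This is only a sketch of where a fix could look, not a claim about how the indexer is structured; list_exclude_sources is a hypothetical name, and per-directory .gitignore files below the root are left out:

```shell
# Hypothetical helper: print the repo-level and global ignore files for a repo.
list_exclude_sources() {
  repo=$1
  printf '%s\n' "$repo/.gitignore"

  # Let git resolve info/exclude (it lives in the shared git dir for worktrees).
  info=$(cd "$repo" && git rev-parse --git-path info/exclude)
  case $info in /*) ;; *) info="$repo/$info" ;; esac
  printf '%s\n' "$info"

  # core.excludesFile if set, otherwise git's documented default location.
  global=$(git -C "$repo" config --path --get core.excludesFile) ||
    global="${XDG_CONFIG_HOME:-$HOME/.config}/git/ignore"
  printf '%s\n' "$global"
}

demo=$(mktemp -d)
git init -q "$demo"
list_exclude_sources "$demo"
```

Resolving info/exclude through git rev-parse rather than hard-coding .git/info/exclude matters for linked worktrees, where .git is a file rather than a directory.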

Evidence the binary only reads .gitignore/.cbmignore

$ strings -n 5 codebase-memory-mcp | grep -iE 'info/exclude|excludesfile|gitignore|cbmignore'
%s/.cbmignore
%s/.gitignore

No reference to info/exclude, core.excludesFile, or a global gitignore path in the binary.

Impact

  • OOM on indexing for repos that use agent worktrees / sandboxes excluded via .git/info/exclude (memory scales with walked files; here 660k vs 493 tracked).
  • Surprising, silent: git status is clean, git check-ignore says the path is ignored, yet it's in the graph.
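Because the mismatch is silent, one way to spot affected paths before indexing is to ask git for everything it considers ignored and compare that with the "excluded" list in the index output. A sketch, with list_ignored_paths as a hypothetical helper name:

```shell
# Hypothetical helper: every untracked path git ignores, whatever the rule's source
# (.gitignore, .git/info/exclude, or the global excludes file).
list_ignored_paths() {
  git -C "$1" ls-files --others --ignored --exclude-standard --directory
}

# Throwaway repo mirroring the reproduction layout.
demo=$(mktemp -d)
git init -q "$demo"
mkdir -p "$demo/.git/info" "$demo/gi_excluded" "$demo/ie_excluded" "$demo/included"
touch "$demo/gi_excluded/a.py" "$demo/ie_excluded/b.py" "$demo/included/c.py"
echo 'gi_excluded/' > "$demo/.gitignore"
echo 'ie_excluded/' >> "$demo/.git/info/exclude"

ignored=$(list_ignored_paths "$demo")
printf '%s\n' "$ignored"
```

Any path in this list that is missing from the indexer's excluded set is a candidate for the workaround below.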

Related: #351 (worktree indexing), #234 (vendor/ indexed despite gitignore), #49/#58 (OOM on large trees).

Workaround

Add the path to a committed/real .gitignore (which is honored) instead of relying on .git/info/exclude. A .cbmignore entry also works.
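The .cbmignore route can be scripted by mirroring the active patterns from .git/info/exclude. A rough sketch: sync_cbmignore is a hypothetical helper, and it assumes .cbmignore accepts the same pattern syntax as .gitignore, which I have only verified for a plain directory entry:

```shell
# Hypothetical helper: append active info/exclude patterns to .cbmignore.
# Assumption: .cbmignore understands gitignore-style patterns.
sync_cbmignore() {
  repo=$1
  exclude=$(cd "$repo" && git rev-parse --git-path info/exclude)
  case $exclude in /*) ;; *) exclude="$repo/$exclude" ;; esac
  [ -f "$exclude" ] || return 0
  touch "$repo/.cbmignore"
  # Skip comments and blank lines; skip patterns that are already present.
  grep -v -e '^#' -e '^[[:space:]]*$' "$exclude" | while IFS= read -r pat; do
    grep -qxF -e "$pat" "$repo/.cbmignore" || printf '%s\n' "$pat" >> "$repo/.cbmignore"
  done
}

demo=$(mktemp -d)
git init -q "$demo"
mkdir -p "$demo/.git/info"
echo 'sandbox/' >> "$demo/.git/info/exclude"
sync_cbmignore "$demo"
sync_cbmignore "$demo"    # second run adds nothing
cat "$demo/.cbmignore"
```

The helper has to be re-run whenever .git/info/exclude changes, so it is a stopgap rather than a substitute for the indexer reading that file itself.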

Environment

  • codebase-memory-mcp 0.8.1, Linux x86-64
  • mem.init budget_mb=15936 total_ram_mb=31873
