Summary
index_repository honors .gitignore but silently ignores .git/info/exclude (and, as far as I can tell, the global/core.excludesFile excludes too). Any directory excluded only via .git/info/exclude is walked and indexed.
This matters a lot for agent/sandbox workflows: tools like sandcastle put their worktrees under a dir that's excluded via .git/info/exclude (not a committed .gitignore). The indexer descends into those nested full-repo checkouts and the file count/graph explodes, leading to OOM during indexing.
Real-world trigger in my repo: 493 git-tracked files, but find walks 661,430 files / 170 GB because .sandcastle/worktrees/* (excluded only in .git/info/exclude) contains 3 nested full-repo worktrees, each ~57 GB with their own nested .sandcastle/ + run-data. The indexer treats git as the source of truth for some things but never consults .git/info/exclude, so it tries to ingest all of it.
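A quick way to see the same gap on any repo is to compare what git tracks with what a plain directory walk visits. The sketch below is only an illustration on a throwaway fixture: the `src/` and `wt1` names and the file contents are made up, and the `find` walk is my stand-in for whatever the indexer actually does internally.

```shell
# Throwaway fixture: one tracked file plus a sandbox dir excluded only locally.
repo=$(mktemp -d)
git -C "$repo" init -q
mkdir -p "$repo/.git/info" "$repo/src" "$repo/.sandcastle/worktrees/wt1"
echo 'def tracked(): return 1' > "$repo/src/main.py"
echo 'def copy_a(): return 2' > "$repo/.sandcastle/worktrees/wt1/a.py"
echo 'def copy_b(): return 3' > "$repo/.sandcastle/worktrees/wt1/b.py"
echo '.sandcastle/' >> "$repo/.git/info/exclude"
git -C "$repo" add -A   # the locally excluded directory is skipped here

# Files in git's index versus files a naive walk would visit (.git itself skipped).
tracked=$(git -C "$repo" ls-files | wc -l | tr -d ' ')
walked=$(find "$repo" -type f ! -path "$repo/.git/*" | wc -l | tr -d ' ')
echo "tracked=$tracked walked=$walked"
```

A large gap between the two numbers on a repo whose `git status` is clean is a hint that locally excluded directories are being walked.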
Reproduction (minimal, deterministic)
T=$(mktemp -d)
cd "$T"
git init -q
mkdir gi_excluded ie_excluded included
echo 'def fn_gitignore_excluded(): return 1' > gi_excluded/a.py
echo 'def fn_infoexclude_excluded(): return 2' > ie_excluded/b.py
echo 'def fn_plainly_included(): return 3' > included/c.py
echo 'gi_excluded/' > .gitignore
echo 'ie_excluded/' >> .git/info/exclude # git itself excludes this dir
git add -A && git commit -qm init
# git agrees BOTH dirs are excluded:
git check-ignore gi_excluded/a.py ie_excluded/b.py
# gi_excluded/a.py
# ie_excluded/b.py
codebase-memory-mcp cli index_repository "{\"repo_path\":\"$T\"}"
Actual behavior
Index output:
"excluded":{"dirs":["gi_excluded",".git"],"count":2,"truncated":false}
ie_excluded/ is not in the excluded set, and it gets indexed (pass.complexity functions=2). Confirming via search_graph:
| dir | excluded via | in graph? |
|---|---|---|
| gi_excluded/ | .gitignore | ❌ correctly skipped |
| ie_excluded/ | .git/info/exclude | ✅ present — bug |
| included/ | (nothing) | ✅ present |
Expected behavior
.git/info/exclude (and ideally core.excludesFile / the global excludes file) should be honored the same way .gitignore is, since git itself treats those paths as ignored. At minimum, .git/info/exclude should be honored: it's the canonical place to exclude paths without committing a .gitignore entry, which is exactly what agent/sandbox tooling relies on.
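For reference, git can already report the combined ignore set, so one possible approach (an assumption on my side, not a claim about how the indexer is built) would be to ask git instead of parsing .gitignore by hand. A minimal sketch on a fixture that mirrors the reproduction:

```shell
# Fixture: one dir ignored via .gitignore, one via .git/info/exclude.
demo=$(mktemp -d)
git -C "$demo" init -q
mkdir -p "$demo/.git/info" "$demo/gi_excluded" "$demo/ie_excluded" "$demo/included"
touch "$demo/gi_excluded/a.py" "$demo/ie_excluded/b.py" "$demo/included/c.py"
echo 'gi_excluded/' > "$demo/.gitignore"
echo 'ie_excluded/' >> "$demo/.git/info/exclude"

# --exclude-standard applies .gitignore, .git/info/exclude and core.excludesFile;
# --directory collapses a fully ignored directory into a single entry.
ignored_dirs=$(git -C "$demo" ls-files --others --ignored --exclude-standard --directory | sort)
printf '%s\n' "$ignored_dirs"
```

On this fixture the listing should include both `gi_excluded/` and `ie_excluded/`, which is the set I would expect the indexer to skip.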
Evidence the binary only reads .gitignore/.cbmignore
$ strings -n 5 codebase-memory-mcp | grep -iE 'info/exclude|excludesfile|gitignore|cbmignore'
%s/.cbmignore
%s/.gitignore
No reference to info/exclude, core.excludesFile, or a global gitignore path in the binary.
Impact
- OOM on indexing for repos that use agent worktrees / sandboxes excluded via .git/info/exclude (memory scales with walked files; here 660k vs 493 tracked).
- Surprising, silent: git status is clean, git check-ignore says the path is ignored, yet it's in the graph.
Related: #351 (worktree indexing), #234 (vendor/ indexed despite gitignore), #49/#58 (OOM on large trees).
Workaround
Add the path to a committed/real .gitignore (which is honored) instead of relying on .git/info/exclude. A .cbmignore entry also works.
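Until this is fixed, the .cbmignore route can be scripted. The sketch below copies local exclude patterns into .cbmignore; it assumes .cbmignore accepts the same pattern syntax as .gitignore (I only verified that a plain directory entry works), and the fixture repo is made up for illustration.

```shell
# Fixture repo with a local-only exclude for the sandbox directory.
work=$(mktemp -d)
git -C "$work" init -q
mkdir -p "$work/.git/info"
printf '# local-only excludes\n\n.sandcastle/\n' >> "$work/.git/info/exclude"

cd "$work"
# Resolve the exclude file through git so this also works inside linked worktrees.
exclude_file=$(git rev-parse --git-path info/exclude)

# Copy non-comment, non-blank patterns, skipping ones already in .cbmignore.
touch .cbmignore
grep -vE '^[[:space:]]*(#|$)' "$exclude_file" | while IFS= read -r pattern; do
  grep -qxF -- "$pattern" .cbmignore || printf '%s\n' "$pattern" >> .cbmignore
done
cat .cbmignore
```

Running it a second time should not add duplicate lines, so it can be put in a setup script that runs before each index.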
Environment
codebase-memory-mcp 0.8.1, Linux x86-64
mem.init budget_mb=15936 total_ram_mb=31873