Problem
Currently CodeLens uses a flat registry (SQLite rows per symbol). This works for simple lookup but blocks structural queries like:
- Who calls this function, across the entire codebase?
- What is the blast radius if I rename this class?
- Are there circular dependency chains?
Many existing engines (trace, impact, circular) reimplement partial graph traversal ad-hoc on top of the flat registry, leading to duplicated logic and inconsistent results.
Proposed Change
Adopt a proper node + edge graph model backed by SQLite:
Nodes: Function, Class, File, Module, Route, Type, Interface
Edges: CALLS, IMPORTS, DEFINES, INHERITS, IMPLEMENTS, USES_TYPE
Each edge stores: source_id, target_id, edge_type, file, line, confidence.
This enables:
- Unified traversal for trace/impact/circular/dead-code — all become graph queries
- Cypher-like query engine on top (see separate issue)
- Cross-file and eventually cross-repo analysis
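As a rough illustration of the model above, the following sketch creates the two tables and answers "who calls this function, transitively?" with a single recursive query. The column layout of `nodes`, the index names, the `transitive_callers` helper, and the sample data are assumptions made for this example; only the edge fields (source_id, target_id, edge_type, file, line, confidence) come from the proposal.

```python
import sqlite3

# Hypothetical schema: the edge columns follow the proposal,
# the node columns and index names are assumptions.
SCHEMA = """
CREATE TABLE nodes (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,   -- Function, Class, File, Module, ...
    name TEXT NOT NULL
);
CREATE TABLE edges (
    source_id  INTEGER NOT NULL REFERENCES nodes(id),
    target_id  INTEGER NOT NULL REFERENCES nodes(id),
    edge_type  TEXT NOT NULL,   -- CALLS, IMPORTS, DEFINES, ...
    file       TEXT,
    line       INTEGER,
    confidence REAL
);
-- Index both directions so forward and reverse traversal avoid full scans.
CREATE INDEX idx_edges_target ON edges(target_id, edge_type);
CREATE INDEX idx_edges_source ON edges(source_id, edge_type);
"""


def transitive_callers(conn, target_id):
    """Return the names of all direct and indirect callers of a node."""
    rows = conn.execute(
        """
        WITH RECURSIVE callers(id) AS (
            SELECT source_id FROM edges
            WHERE target_id = ? AND edge_type = 'CALLS'
            UNION  -- UNION (not UNION ALL) deduplicates, so cycles terminate
            SELECT e.source_id
            FROM edges e JOIN callers c ON e.target_id = c.id
            WHERE e.edge_type = 'CALLS'
        )
        SELECT n.name FROM callers JOIN nodes n ON n.id = callers.id
        ORDER BY n.name
        """,
        (target_id,),
    ).fetchall()
    return [row[0] for row in rows]


# Sample data (made up): main -> handler -> save, plus a DEFINES edge
# that the CALLS-only traversal should ignore.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO nodes (id, kind, name) VALUES (?, ?, ?)",
    [(1, "Function", "main"), (2, "Function", "handler"),
     (3, "Function", "save"), (4, "File", "handlers.py")],
)
conn.executemany(
    "INSERT INTO edges VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 2, "CALLS", "app.py", 10, 1.0),
     (2, 3, "CALLS", "handlers.py", 22, 0.9),
     (4, 2, "DEFINES", "handlers.py", 5, 1.0)],
)

print(transitive_callers(conn, 3))
```

The same recursive pattern, run in the forward direction or over a different edge_type, would cover the impact and circular-dependency cases, which is what lets the engines share one traversal instead of each reimplementing it.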
Migration Path
- Add nodes + edges tables alongside existing registry (non-breaking)
- Populate during scan pass
- Migrate engines one by one to use graph queries
- Deprecate flat registry once all engines migrated
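The first two steps could look roughly like the sketch below: the graph tables are created idempotently next to the existing data, and the scan pass writes nodes and edges without touching the registry. The `symbols` table is only a stand-in for the current flat registry, whose real schema is not described here; `add_graph_tables`, `upsert_node`, and `record_edge` are hypothetical helper names.

```python
import sqlite3


def add_graph_tables(conn):
    """Create the graph tables if missing; existing tables are left untouched."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS nodes (
            id   INTEGER PRIMARY KEY,
            kind TEXT NOT NULL,
            name TEXT NOT NULL,
            UNIQUE (kind, name)   -- assumption: kind + name identifies a node
        );
        CREATE TABLE IF NOT EXISTS edges (
            source_id  INTEGER NOT NULL REFERENCES nodes(id),
            target_id  INTEGER NOT NULL REFERENCES nodes(id),
            edge_type  TEXT NOT NULL,
            file       TEXT,
            line       INTEGER,
            confidence REAL
        );
        """
    )


def upsert_node(conn, kind, name):
    """Insert the node if it is new and return its id either way."""
    conn.execute(
        "INSERT OR IGNORE INTO nodes (kind, name) VALUES (?, ?)", (kind, name)
    )
    row = conn.execute(
        "SELECT id FROM nodes WHERE kind = ? AND name = ?", (kind, name)
    ).fetchone()
    return row[0]


def record_edge(conn, source_id, target_id, edge_type, file, line, confidence):
    """Store one edge discovered during the scan pass."""
    conn.execute(
        "INSERT INTO edges VALUES (?, ?, ?, ?, ?, ?)",
        (source_id, target_id, edge_type, file, line, confidence),
    )


conn = sqlite3.connect(":memory:")
# Stand-in for the existing flat registry (hypothetical schema).
conn.execute("CREATE TABLE symbols (name TEXT, kind TEXT, file TEXT)")
conn.execute("INSERT INTO symbols VALUES ('save', 'Function', 'db.py')")

add_graph_tables(conn)
add_graph_tables(conn)  # safe to run again on every startup

# During the scan pass: one call site found in handlers.py, line 22.
caller = upsert_node(conn, "Function", "handler")
callee = upsert_node(conn, "Function", "save")
record_edge(conn, caller, callee, "CALLS", "handlers.py", 22, 0.9)
```

Because the registry rows are never modified, engines that still read from the flat registry keep working while others are moved over to graph queries.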
Inspiration
codebase-memory-mcp uses this model and achieves <1ms query time on 4.8M nodes via SQLite with proper indexing.