Problem
CodeLens currently stores symbols in a flat registry (one SQLite row per symbol). This works for simple lookups but blocks structural queries such as: who calls this function anywhere in the codebase, what is the blast radius of renaming this class, and are there circular dependency chains?
Many existing engines (trace, impact, circular) each reimplement partial graph traversal ad hoc, which duplicates logic.
Proposed Change
Adopt a proper node + edge graph model backed by SQLite:
- Nodes: Function, Class, File, Module, Route, Type, Interface
- Edges: CALLS, IMPORTS, DEFINES, INHERITS, IMPLEMENTS, USES_TYPE
- Each edge stores: source_id, target_id, edge_type, file, line, confidence
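To make the shape of the model concrete, here is a minimal sketch of the two tables using Python's stdlib sqlite3. The edge columns and the node/edge type names come from the lists above; the node columns (id, kind, name, file), the index names, and the helper functions are assumptions for illustration, not an agreed schema.

```python
import sqlite3

# Hypothetical DDL: edge columns follow the proposal; node columns are assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS nodes (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL CHECK (kind IN
         ('Function','Class','File','Module','Route','Type','Interface')),
    name TEXT NOT NULL,
    file TEXT
);
CREATE TABLE IF NOT EXISTS edges (
    source_id  INTEGER NOT NULL REFERENCES nodes(id),
    target_id  INTEGER NOT NULL REFERENCES nodes(id),
    edge_type  TEXT NOT NULL CHECK (edge_type IN
               ('CALLS','IMPORTS','DEFINES','INHERITS','IMPLEMENTS','USES_TYPE')),
    file       TEXT,
    line       INTEGER,
    confidence REAL NOT NULL DEFAULT 1.0
);
-- One index per traversal direction (outgoing and incoming edges).
CREATE INDEX IF NOT EXISTS idx_edges_out ON edges (source_id, edge_type);
CREATE INDEX IF NOT EXISTS idx_edges_in  ON edges (target_id, edge_type);
"""


def open_graph(path=":memory:"):
    """Open (or create) a database and make sure the graph tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_node(conn, kind, name, file=None):
    """Insert a node and return its id."""
    cur = conn.execute(
        "INSERT INTO nodes (kind, name, file) VALUES (?, ?, ?)",
        (kind, name, file),
    )
    return cur.lastrowid


def add_edge(conn, source_id, target_id, edge_type,
             file=None, line=None, confidence=1.0):
    """Insert one directed edge from source_id to target_id."""
    conn.execute(
        "INSERT INTO edges (source_id, target_id, edge_type, file, line, confidence) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (source_id, target_id, edge_type, file, line, confidence),
    )
```

The CHECK constraints keep unknown node or edge types out of the tables, and the two indexes are what would let "who calls X" and "what does X call" avoid a full scan of edges.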
Benefits for Agents
- A single traversal primitive replaces the ad-hoc traversal code in each engine
- Accurate blast radius: agents can see what a change would break before touching anything
- Foundation for Cypher-like queries (see separate issue)
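As a rough illustration of what a single traversal primitive could look like, the sketch below walks an edges table with a recursive CTE. The function name `reachable`, its parameters, and the `demo_graph` helper are hypothetical; only the source_id, target_id, and edge_type columns are taken from the proposal.

```python
import sqlite3


def reachable(conn, start_id, edge_type, direction="in", max_depth=10):
    """Return {node_id: shortest depth} for nodes reachable from start_id.

    direction="in" follows edges backwards (e.g. transitive callers),
    direction="out" follows them forwards (e.g. transitive callees).
    """
    if direction == "in":
        frm, to = "target_id", "source_id"
    elif direction == "out":
        frm, to = "source_id", "target_id"
    else:
        raise ValueError("direction must be 'in' or 'out'")
    # Column names come from the fixed pairs above, never from user input.
    sql = f"""
        WITH RECURSIVE walk(node_id, depth) AS (
            SELECT ?, 0
            UNION
            SELECT e.{to}, w.depth + 1
            FROM edges e JOIN walk w ON e.{frm} = w.node_id
            WHERE e.edge_type = ? AND w.depth < ?
        )
        SELECT node_id, MIN(depth) FROM walk
        WHERE depth > 0
        GROUP BY node_id
    """
    return dict(conn.execute(sql, (start_id, edge_type, max_depth)).fetchall())


def demo_graph():
    """Tiny in-memory graph: 1 -> 2 -> 3 <- 4 (CALLS), 10 <-> 11 (IMPORTS)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edges (source_id INTEGER, target_id INTEGER, edge_type TEXT)"
    )
    conn.executemany(
        "INSERT INTO edges VALUES (?, ?, ?)",
        [(1, 2, "CALLS"), (2, 3, "CALLS"), (4, 3, "CALLS"),
         (10, 11, "IMPORTS"), (11, 10, "IMPORTS")],
    )
    return conn
```

With this shape, the blast radius of node 3 is `reachable(conn, 3, "CALLS", "in")`, and a node sits on a circular import chain when it appears in its own `reachable(conn, node_id, "IMPORTS", "out")` result. The depth bound is what keeps the walk finite on cyclic graphs.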
Migration Path
- Add nodes + edges tables alongside existing registry (non-breaking)
- Populate during scan pass
- Migrate engines one by one
- Deprecate flat registry after all engines migrated
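The first two steps might look roughly like the sketch below. The existing registry schema is not described in this issue, so the `symbols` table and its columns are placeholders, as are the function names; the point is only that creating the new tables leaves the registry untouched and that the scan pass can write to both.

```python
import sqlite3

# Hypothetical graph DDL; IF NOT EXISTS makes it safe to run on every startup.
GRAPH_DDL = """
CREATE TABLE IF NOT EXISTS nodes (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,
    name TEXT NOT NULL,
    file TEXT,
    UNIQUE (kind, name, file)
);
CREATE TABLE IF NOT EXISTS edges (
    source_id  INTEGER NOT NULL,
    target_id  INTEGER NOT NULL,
    edge_type  TEXT NOT NULL,
    file       TEXT,
    line       INTEGER,
    confidence REAL NOT NULL DEFAULT 1.0
);
"""


def add_graph_tables(conn):
    """Step 1: create nodes + edges next to the registry without altering it."""
    conn.executescript(GRAPH_DDL)


def record_symbol(conn, kind, name, file):
    """Step 2: during the scan pass, write each symbol to both stores.

    `symbols` stands in for the existing flat registry (assumed name/columns).
    """
    with conn:  # one transaction: both writes commit or neither does
        conn.execute(
            "INSERT INTO symbols (kind, name, file) VALUES (?, ?, ?)",
            (kind, name, file),
        )
        conn.execute(
            "INSERT OR IGNORE INTO nodes (kind, name, file) VALUES (?, ?, ?)",
            (kind, name, file),
        )
```

Engines that still read the registry keep working unchanged, while migrated engines read nodes and edges; once no engine reads the registry, the first INSERT can be dropped.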
Reference
codebase-memory-mcp uses this model and reports query times under 1 ms on 4.8M nodes, using SQLite with proper indexing.