[ARCH] Replace flat registry with true graph data model (nodes + edges) #8

@Wolfvin

Problem

CodeLens currently uses a flat registry (one SQLite row per symbol). This works for simple lookups but blocks structural queries such as: who calls this function anywhere in the codebase, what is the blast radius of renaming this class, and are there circular dependency chains?

Many existing engines (trace, impact, circular) each reimplement partial graph traversal ad hoc, which duplicates logic across engines.

Proposed Change

Adopt a proper node + edge graph model backed by SQLite:

  • Nodes: Function, Class, File, Module, Route, Type, Interface
  • Edges: CALLS, IMPORTS, DEFINES, INHERITS, IMPLEMENTS, USES_TYPE
  • Each edge stores: source_id, target_id, edge_type, file, line, confidence
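
As a starting point for discussion, the model above could be sketched with Python's built-in sqlite3 module. The edge columns come straight from the list above; the node columns (kind, name, file), the index names, and the helpers open_graph, add_node and add_edge are assumptions made for illustration, not existing CodeLens code.

```python
import sqlite3

# Hypothetical schema: edge columns follow the proposal, node columns are assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS nodes (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,   -- Function, Class, File, Module, Route, Type, Interface
    name TEXT NOT NULL,
    file TEXT
);
CREATE TABLE IF NOT EXISTS edges (
    source_id  INTEGER NOT NULL REFERENCES nodes(id),
    target_id  INTEGER NOT NULL REFERENCES nodes(id),
    edge_type  TEXT NOT NULL,   -- CALLS, IMPORTS, DEFINES, INHERITS, IMPLEMENTS, USES_TYPE
    file       TEXT,
    line       INTEGER,
    confidence REAL NOT NULL DEFAULT 1.0   -- assumed to be a float score
);
-- One index per traversal direction (outgoing and incoming edges).
CREATE INDEX IF NOT EXISTS idx_edges_source ON edges(source_id, edge_type);
CREATE INDEX IF NOT EXISTS idx_edges_target ON edges(target_id, edge_type);
"""


def open_graph(path=":memory:"):
    """Open a database and make sure the graph tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_node(conn, kind, name, file=None):
    """Insert a node and return its id."""
    cur = conn.execute(
        "INSERT INTO nodes (kind, name, file) VALUES (?, ?, ?)",
        (kind, name, file),
    )
    return cur.lastrowid


def add_edge(conn, source_id, target_id, edge_type,
             file=None, line=None, confidence=1.0):
    """Insert a directed edge from source_id to target_id."""
    conn.execute(
        "INSERT INTO edges (source_id, target_id, edge_type, file, line, confidence) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (source_id, target_id, edge_type, file, line, confidence),
    )
```

Indexing both (source_id, edge_type) and (target_id, edge_type) is a design guess: "what does X call" and "who calls X" then each hit an index instead of scanning the edges table.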

Benefits for Agents

  • Single traversal primitive replaces all ad-hoc engine code
  • Accurate blast radius: agents can see what is likely to break, with a confidence per edge, before touching anything
  • Foundation for Cypher-like queries (see separate issue)
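
To make the "single traversal primitive" idea concrete, here is a minimal sketch of a blast-radius query built on a SQLite recursive CTE. It assumes an edges table with source_id, target_id and edge_type columns as proposed above; the function name blast_radius and its parameters are hypothetical.

```python
import sqlite3


def blast_radius(conn, target_id, edge_types=("CALLS",), max_depth=10):
    """Return [(node_id, distance)] for every node that reaches target_id.

    Edges are followed backwards (target -> source), so for CALLS edges the
    result is the set of direct and transitive callers. max_depth bounds the
    walk, which also guarantees termination when the graph contains cycles.
    """
    placeholders = ",".join("?" for _ in edge_types)
    sql = f"""
        WITH RECURSIVE affected(id, depth) AS (
            SELECT ?, 0
            UNION
            SELECT e.source_id, a.depth + 1
            FROM edges AS e
            JOIN affected AS a ON e.target_id = a.id
            WHERE e.edge_type IN ({placeholders})
              AND a.depth < ?
        )
        SELECT id, MIN(depth) AS distance
        FROM affected
        WHERE id != ?
        GROUP BY id
        ORDER BY distance, id
    """
    params = (target_id, *edge_types, max_depth, target_id)
    return conn.execute(sql, params).fetchall()
```

Under this sketch, trace, impact and circular-style engines would differ mainly in which edge_types they pass and which direction they walk, rather than each carrying its own traversal loop.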

Migration Path

  1. Add nodes + edges tables alongside existing registry (non-breaking)
  2. Populate during scan pass
  3. Migrate engines one by one
  4. Deprecate flat registry after all engines migrated
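
Steps 1 and 2 could look roughly like the sketch below: the graph tables are created with IF NOT EXISTS so the existing registry is left untouched, and nodes are backfilled from it. The registry table name (symbols) and its columns (kind, name, file) are assumptions, since the real registry schema is not described in this issue; the helper names are hypothetical too.

```python
import sqlite3


def ensure_graph_tables(conn):
    """Step 1: add nodes + edges next to the registry without altering it."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS nodes (
            id   INTEGER PRIMARY KEY,
            kind TEXT NOT NULL,
            name TEXT NOT NULL,
            file TEXT NOT NULL DEFAULT '',
            UNIQUE (kind, name, file)
        );
        CREATE TABLE IF NOT EXISTS edges (
            source_id  INTEGER NOT NULL REFERENCES nodes(id),
            target_id  INTEGER NOT NULL REFERENCES nodes(id),
            edge_type  TEXT NOT NULL,
            file       TEXT,
            line       INTEGER,
            confidence REAL NOT NULL DEFAULT 1.0
        );
    """)


def backfill_nodes(conn, registry_table="symbols"):
    """Step 2 (partial): copy registry rows into nodes; safe to re-run.

    registry_table is a hypothetical name. Table names cannot be bound as
    SQL parameters, so it is validated as a plain identifier first.
    """
    if not registry_table.isidentifier():
        raise ValueError(f"unsafe table name: {registry_table!r}")
    with conn:  # single transaction: either every row lands or none do
        conn.execute(
            "INSERT OR IGNORE INTO nodes (kind, name, file) "
            f"SELECT kind, name, COALESCE(file, '') FROM {registry_table}"
        )
    return conn.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
```

Because the insert is idempotent, the scan pass can call it on every run while engines are migrated one by one, and the flat registry keeps serving unmigrated engines in the meantime.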

Reference

codebase-memory-mcp uses this model and achieves <1ms query time on 4.8M nodes via SQLite with proper indexing.
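
One way to check that a lookup actually benefits from "proper indexing" is to ask SQLite for its query plan. The sketch below does that for a "who calls X" lookup. It says nothing about the timing figure quoted above; the table layout, the index name idx_edges_target and the function name are assumptions.

```python
import sqlite3


def plan_for_callers_lookup():
    """Return SQLite's query plan text for an incoming-edge lookup."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE edges (
            source_id INTEGER NOT NULL,
            target_id INTEGER NOT NULL,
            edge_type TEXT NOT NULL
        );
        CREATE INDEX idx_edges_target ON edges(target_id, edge_type);
    """)
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT source_id FROM edges WHERE target_id = ? AND edge_type = ?",
        (1, "CALLS"),
    ).fetchall()
    # The human-readable plan step is the last column of each row.
    return " | ".join(row[-1] for row in rows)


if __name__ == "__main__":
    print(plan_for_callers_lookup())
```

If the plan mentions the index by name, the lookup is an index search rather than a full scan of the edges table; a plan that reports a plain scan would suggest a missing or mismatched index.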

Labels: architecture (Core architecture change), enhancement (New feature or request)
