[ARCH] Replace flat registry with true graph data model (nodes + edges) #7

Description

@Wolfvin

Problem

CodeLens currently stores symbols in a flat registry (one SQLite row per symbol). This works for simple lookups but blocks structural queries such as:

  • Who calls this function, across the entire codebase?
  • What is the blast radius if I rename this class?
  • Are there circular dependency chains?

Many existing engines (trace, impact, circular) each reimplement partial graph traversal ad hoc on top of the flat registry, which leads to duplicated logic and inconsistent results.

Proposed Change

Adopt a proper node + edge graph model backed by SQLite:

Nodes: Function, Class, File, Module, Route, Type, Interface
Edges: CALLS, IMPORTS, DEFINES, INHERITS, IMPLEMENTS, USES_TYPE

Each edge stores: source_id, target_id, edge_type, file, line, confidence.
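To make the proposal concrete, here is a minimal sketch of what the two tables could look like, using Python's built-in `sqlite3` module. The edge columns follow the list above; the `nodes` columns (`id`, `kind`, `name`, `file`) and the function name `create_graph_schema` are assumptions for illustration, not an agreed design.

```python
import sqlite3

NODE_KINDS = ("Function", "Class", "File", "Module", "Route", "Type", "Interface")
EDGE_TYPES = ("CALLS", "IMPORTS", "DEFINES", "INHERITS", "IMPLEMENTS", "USES_TYPE")


def create_graph_schema(conn: sqlite3.Connection) -> None:
    """Create nodes + edges tables (hypothetical layout, see lead-in)."""
    kinds = ", ".join(f"'{k}'" for k in NODE_KINDS)
    types = ", ".join(f"'{t}'" for t in EDGE_TYPES)
    conn.executescript(f"""
        CREATE TABLE IF NOT EXISTS nodes (
            id   INTEGER PRIMARY KEY,
            kind TEXT NOT NULL CHECK (kind IN ({kinds})),
            name TEXT NOT NULL,   -- assumed column
            file TEXT             -- assumed column
        );
        CREATE TABLE IF NOT EXISTS edges (
            source_id  INTEGER NOT NULL REFERENCES nodes(id),
            target_id  INTEGER NOT NULL REFERENCES nodes(id),
            edge_type  TEXT NOT NULL CHECK (edge_type IN ({types})),
            file       TEXT,
            line       INTEGER,
            confidence REAL NOT NULL DEFAULT 1.0
        );
    """)


conn = sqlite3.connect(":memory:")
create_graph_schema(conn)
```

The `CHECK` constraints keep node kinds and edge types limited to the sets listed above, so a typo in a scanner cannot silently create a new edge type.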

This enables:

  • Unified traversal for trace/impact/circular/dead-code — all become graph queries
  • Cypher-like query engine on top (see separate issue)
  • Cross-file and eventually cross-repo analysis
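As a rough illustration of "all become graph queries", the sketch below answers two of the questions from the Problem section with SQLite recursive CTEs over an `edges` table. The function names, the `max_depth` parameter, and the sample data are hypothetical; only the three edge columns needed for traversal are created here.

```python
import sqlite3


def transitive_callers(conn, target_id, max_depth=10):
    """Ids of every node that reaches target_id through CALLS edges."""
    rows = conn.execute(
        """
        WITH RECURSIVE callers(id, depth) AS (
            SELECT source_id, 1 FROM edges
             WHERE target_id = ? AND edge_type = 'CALLS'
            UNION
            SELECT e.source_id, c.depth + 1
              FROM edges e JOIN callers c ON e.target_id = c.id
             WHERE e.edge_type = 'CALLS' AND c.depth < ?
        )
        SELECT DISTINCT id FROM callers
        """,
        (target_id, max_depth),
    ).fetchall()
    return {r[0] for r in rows}


def nodes_on_cycles(conn, edge_type="IMPORTS"):
    """Ids of nodes that can reach themselves via edges of one type.

    Computes reachability pairs, so it is quadratic in the worst case;
    fine for a sketch, not tuned for large graphs.
    """
    rows = conn.execute(
        """
        WITH RECURSIVE reach(start, id) AS (
            SELECT source_id, target_id FROM edges WHERE edge_type = ?
            UNION
            SELECT r.start, e.target_id
              FROM reach r JOIN edges e ON e.source_id = r.id
             WHERE e.edge_type = ?
        )
        SELECT DISTINCT start FROM reach WHERE start = id
        """,
        (edge_type, edge_type),
    ).fetchall()
    return {r[0] for r in rows}


# Sample data: 1 -> 2 -> 3 <- 4 (calls); 10 -> 11 -> 12 -> 10 (imports), 13 -> 10
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (source_id INTEGER, target_id INTEGER, edge_type TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?, ?)",
    [(1, 2, "CALLS"), (2, 3, "CALLS"), (4, 3, "CALLS"),
     (10, 11, "IMPORTS"), (11, 12, "IMPORTS"), (12, 10, "IMPORTS"), (13, 10, "IMPORTS")],
)
print(transitive_callers(conn, 3))  # callers of node 3, direct and indirect
print(nodes_on_cycles(conn))        # nodes that sit on an import cycle
```

Both queries use `UNION` rather than `UNION ALL`, so duplicate rows are dropped and the recursion terminates even when the graph contains cycles. Blast radius for a rename would be the same shape of query, following more edge types than `CALLS`.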

Migration Path

  1. Add nodes + edges tables alongside existing registry (non-breaking)
  2. Populate during scan pass
  3. Migrate engines one by one to use graph queries
  4. Deprecate flat registry once all engines are migrated
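Steps 1 and 2 could be sketched as a dual-write during the scan pass, as below. The legacy table name `registry`, its columns, and the helpers `add_graph_tables`, `record_symbol`, and `record_call` are all placeholders, since the actual registry schema is not described in this issue.

```python
import sqlite3


def add_graph_tables(conn: sqlite3.Connection) -> None:
    """Step 1: create graph tables next to the legacy table; idempotent."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS nodes (
            id INTEGER PRIMARY KEY, kind TEXT NOT NULL,
            name TEXT NOT NULL, file TEXT
        );
        CREATE TABLE IF NOT EXISTS edges (
            source_id INTEGER NOT NULL, target_id INTEGER NOT NULL,
            edge_type TEXT NOT NULL, file TEXT, line INTEGER, confidence REAL
        );
    """)


def record_symbol(conn, name, kind, file, line) -> int:
    """Step 2: write the legacy row and the graph node in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO registry (name, kind, file, line) VALUES (?, ?, ?, ?)",
            (name, kind, file, line),
        )
        cur = conn.execute(
            "INSERT INTO nodes (kind, name, file) VALUES (?, ?, ?)",
            (kind, name, file),
        )
    return cur.lastrowid  # id of the new node


def record_call(conn, caller_id, callee_id, file, line, confidence=1.0) -> None:
    """Step 2: graph-only data; the flat registry has no place for it."""
    with conn:
        conn.execute(
            "INSERT INTO edges VALUES (?, ?, 'CALLS', ?, ?, ?)",
            (caller_id, callee_id, file, line, confidence),
        )


conn = sqlite3.connect(":memory:")
# Stand-in for the existing flat registry (hypothetical columns).
conn.execute("CREATE TABLE registry (name TEXT, kind TEXT, file TEXT, line INTEGER)")
add_graph_tables(conn)
add_graph_tables(conn)  # safe to call again on an already-migrated database

main_id = record_symbol(conn, "main", "Function", "app.py", 1)
helper_id = record_symbol(conn, "helper", "Function", "util.py", 7)
record_call(conn, main_id, helper_id, "app.py", 3)
```

Because the registry keeps receiving the same rows, engines that have not been migrated yet continue to work unchanged, and each migrated engine can be compared against its old output before the registry is deprecated.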

Inspiration

codebase-memory-mcp uses this model and achieves <1ms query time on 4.8M nodes via SQLite with proper indexing.
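What "proper indexing" means for that project is not spelled out here. One plausible starting point, shown as an assumption only, is a pair of composite indexes on `edges` so that both outgoing and incoming lookups are index searches. The index names and the `query_plan` helper are made up for this sketch; `EXPLAIN QUERY PLAN` is standard SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE edges (
        source_id INTEGER NOT NULL, target_id INTEGER NOT NULL,
        edge_type TEXT NOT NULL, file TEXT, line INTEGER, confidence REAL
    );
    -- "what does X call/import?": search by source
    CREATE INDEX idx_edges_out ON edges (source_id, edge_type, target_id);
    -- "who calls/imports X?": search by target
    CREATE INDEX idx_edges_in  ON edges (target_id, edge_type, source_id);
""")


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


plan = query_plan(
    conn,
    "SELECT source_id FROM edges WHERE target_id = ? AND edge_type = 'CALLS'",
    (42,),
)
print(plan)  # the plan should name idx_edges_in rather than a full table scan
```

Checking the plan like this in a test would catch a regression where a schema change makes a traversal query fall back to scanning the whole table.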

Labels: architecture (Core architecture change), enhancement (New feature or request)
