Summary
-------
Add a Neo4j backend to codeanalyzer-java so the analysis result can be emitted as a
property graph instead of (or in addition to) analysis.json, reaching parity with
the existing codeanalyzer-python and codeanalyzer-typescript Neo4j backends.
Motivation
----------
The Python (Py*/PY_*) and TypeScript (TS*/TS_*) analyzers already emit a Neo4j
property graph. Java should reach parity so all three can populate a single shared
graph database for cross-language tooling without label/relationship collisions.
Scope
-----
New --emit targets:
- json (default, analysis.json)
- neo4j (graph.cypher snapshot, or live Bolt push with --neo4j-uri)
- schema (the schema.neo4j.json contract)
CLI options:
--app-name, --neo4j-uri, --neo4j-user, --neo4j-password, --neo4j-database
with NEO4J_URI / NEO4J_USERNAME / NEO4J_PASSWORD / NEO4J_DATABASE environment
fallback (precedence: CLI flag > env var > default).
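
The precedence rule above (CLI flag > env var > default) can be sketched as follows. This is a minimal illustration, not the actual implementation: the class name OptionResolver, the treatment of blank values as "unset", and the default values used in the usage note are all assumptions.

```java
import java.util.Map;

/** Hypothetical helper: resolves one option as CLI flag > env var > default. */
final class OptionResolver {
    private final Map<String, String> env;

    /** The environment is injected so the lookup can be tested; pass System.getenv() in production. */
    OptionResolver(Map<String, String> env) {
        this.env = env;
    }

    /** Returns the first non-blank value among the CLI flag, the environment variable, and the default. */
    String resolve(String cliValue, String envName, String defaultValue) {
        if (cliValue != null && !cliValue.isBlank()) {
            return cliValue; // an explicit flag always wins
        }
        String fromEnv = env.get(envName);
        if (fromEnv != null && !fromEnv.isBlank()) {
            return fromEnv; // environment fallback
        }
        return defaultValue; // nothing was supplied
    }
}
```

For example, `new OptionResolver(System.getenv()).resolve(cliUser, "NEO4J_USERNAME", "neo4j")` would resolve the user name, where `cliUser` is the parsed --neo4j-user value and the `"neo4j"` default is assumed for illustration only.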
Two writers over one projection:
- CypherWriter: deterministic, self-contained graph.cypher snapshot.
- BoltWriter: incremental live push (per-compilation-unit content_hash diff,
targeted replace of changed units, orphan prune on full runs).
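
The incremental push described for BoltWriter can be pictured as a diff between the content hashes already stored in the graph and the hashes of the current run. The sketch below is an assumption about the shape of that logic, not the real writer: SyncPlan, its field names, and the use of file paths as unit keys are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of a per-compilation-unit sync plan driven by content hashes. */
final class SyncPlan {
    /** Units that are new or whose content_hash changed; these get a targeted replace. */
    final Set<String> toReplace = new TreeSet<>();
    /** Units present in the graph but absent from the current run; pruned on full runs only. */
    final Set<String> toPrune = new TreeSet<>();

    /**
     * @param stored  unit key -> content_hash currently recorded in the graph
     * @param current unit key -> content_hash computed in this run
     * @param fullRun whether the run covered the whole application (enables orphan pruning)
     */
    static SyncPlan compute(Map<String, String> stored, Map<String, String> current, boolean fullRun) {
        SyncPlan plan = new SyncPlan();
        for (Map.Entry<String, String> unit : current.entrySet()) {
            // A missing stored hash (new unit) or a different hash (changed unit) triggers a replace.
            if (!unit.getValue().equals(stored.get(unit.getKey()))) {
                plan.toReplace.add(unit.getKey());
            }
        }
        if (fullRun) {
            for (String key : stored.keySet()) {
                if (!current.containsKey(key)) {
                    plan.toPrune.add(key); // orphan: no longer part of the source tree
                }
            }
        }
        return plan;
    }
}
```

Restricting the prune step to full runs matters because a partial run cannot distinguish a deleted unit from one that was simply not analyzed.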
Schema catalog as the single source of truth, with a no-container conformance test
asserting the projector never emits an undeclared label/relationship/property, plus
an opt-in Testcontainers Bolt integration test.
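
The core of such a conformance check is a set difference between what the projector emitted and what the catalog declares. The following is a rough sketch under assumptions: SchemaConformance and its method are hypothetical names, and the real test presumably reads the declared names from schema.neo4j.json rather than taking them as a parameter.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical core of a no-container schema conformance check. */
final class SchemaConformance {
    /**
     * Returns every emitted name (label, relationship type, or property key)
     * that the catalog does not declare, sorted so failure messages are stable.
     */
    static Set<String> undeclared(Set<String> declared, Iterable<String> emitted) {
        Set<String> missing = new TreeSet<>();
        for (String name : emitted) {
            if (!declared.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

A test would then assert that the returned set is empty for labels, relationship types, and property keys in turn, and print the offending names otherwise.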
Parity / design decisions
-------------------------
- All node labels are J-prefixed and relationship types J_-prefixed (e.g. :JType,
:JCallable, J_CALLS); constraint and index names are j_-prefixed. This lets a
Java graph share a database with the Py*/TS* graphs without colliding.
- Provenance property is _module (matches the Python/TypeScript backends).
- The --emit schema output and the checked-in contract are both schema.neo4j.json.
- Lossless projection of the Lombok entity model: initialization blocks, CRUD
operations/queries, and comments are first-class nodes (:JInitializationBlock,
:JCrudOperation, :JCrudQuery, :JComment). Maps such as a field's per-variable
initializers are stored as JSON strings in *_json properties, since Neo4j property
values cannot be maps.
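
To illustrate the *_json convention, the sketch below serializes a string-to-string map into a JSON string suitable for a single property value. It is a dependency-free illustration only: JsonProperty is a hypothetical name, keys and values are assumed to be non-null, keys are sorted so the output is deterministic, and the real projector may well use a JSON library instead.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical encoder for map-valued data stored as a JSON string property. */
final class JsonProperty {
    /** Encodes the map as a JSON object with keys in sorted order (assumes non-null keys and values). */
    static String encode(Map<String, String> map) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : new TreeMap<>(map).entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append(quote(e.getKey())).append(':').append(quote(e.getValue()));
        }
        return sb.append('}').toString();
    }

    /** Wraps a string in double quotes, escaping characters that JSON requires to be escaped. */
    private static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c)); // other control characters
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }
}
```

Sorting the keys keeps the serialized value stable across runs, which fits the deterministic snapshot goal stated for CypherWriter.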
Packaging
---------
- Fat jar bundles the Neo4j driver; live Bolt push works with java -jar.
- The driver is reached through a driver-free BoltSink seam (loaded reflectively),
with a graceful fallback to writing graph.cypher when the driver is unavailable.
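
A driver-free seam with reflective loading could look roughly like the sketch below. The names GraphSink and SinkLoader are hypothetical stand-ins (the actual BoltSink interface is not shown here); the point is only that the calling code never references driver classes directly, so a missing driver surfaces as a catchable failure rather than a startup error.

```java
import java.util.function.Supplier;

/** Hypothetical driver-free seam: callers depend only on this interface. */
interface GraphSink {
    String describe();
}

/** Hypothetical loader that instantiates the driver-backed sink reflectively. */
final class SinkLoader {
    /**
     * Tries to load implClassName through its no-argument constructor.
     * Returns the fallback sink when the class, or anything it links against, is unavailable.
     */
    static GraphSink load(String implClassName, Supplier<GraphSink> fallback) {
        try {
            Class<?> impl = Class.forName(implClassName);
            return (GraphSink) impl.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | LinkageError | ClassCastException e) {
            // ClassNotFoundException: implementation absent from the classpath.
            // LinkageError (e.g. NoClassDefFoundError): implementation present, driver classes missing.
            return fallback.get();
        }
    }
}
```

In this sketch the fallback supplier would produce the graph.cypher writer, so the same call site serves both the fat jar and environments where the driver cannot be loaded.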
Known limitation
----------------
The GraalVM native image currently cannot run analysis at all (it dies in the
JavaParser symbol-table extraction before reaching any emit code), so --emit neo4j
is not yet usable from the native binary. This is a pre-existing, Neo4j-independent
native reflection-metadata gap tracked separately in issue #153. The native build
does bundle the Neo4j driver and compiles cleanly; once #153 is fixed, --emit neo4j
should work from the native binary too.
Separately, live population of a Neo4j database from the native binary is expected
to be problematic because of how the driver runs: it depends on Netty, which makes
heavy use of reflection. In the native binary, --neo4j-uri therefore falls back
gracefully to writing graph.cypher (the file is produced), and users will be asked
to use the java -jar invocation for live database updates.
Status
------
Implemented on branch feature/neo4j (being renamed to
feature/issue-<this>-neo4j-and-fix-153). Schema conformance tests pass; fat-jar
live Bolt push verified end to end.