
Neo4J

A RocketRide database and tool node that answers natural-language questions against a Neo4J graph database by translating them to Cypher with a connected LLM.

What it does

Connects to a Neo4J instance over the Bolt protocol using the official neo4j Python driver and plays two roles. As a pipeline node, it receives natural-language questions on the questions lane, asks the connected LLM to generate a Cypher query, executes it against the graph, and emits results downstream on table, text, and answers. As a tool node, it exposes get_data, get_schema, and get_cypher directly to an agent. Designed for knowledge graph retrieval, entity linking, and graph-based RAG workflows.

The graph schema (node labels with property types, and relationship types with their start/end labels) is reflected once at pipeline start using db.schema.nodeTypeProperties() and db.schema.visualization() (falling back to db.labels() and db.relationshipTypes() on older servers) and included in every LLM prompt so Cypher is generated against the real graph structure.
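
As an illustration of the reflection step, the sketch below folds rows from db.schema.nodeTypeProperties() into a per-label property list and falls back to db.labels() when that procedure is unavailable. It is a minimal sketch, not the node's actual code: `run_rows` is a hypothetical callable that takes a Cypher string and returns a list of dict rows, and the output shape is an assumption. Relationship reflection is left out.

```python
def reflect_node_properties(run_rows):
    """Group schema rows into {label: [{"property": ..., "type": ...}]}.

    run_rows is a hypothetical hook: Cypher string in, list of dict rows out
    (for example, a thin wrapper around a driver session).
    """
    try:
        rows = run_rows("CALL db.schema.nodeTypeProperties()")
    except Exception:
        # Assumed fallback for older servers: label names only, no properties.
        return {row["label"]: [] for row in run_rows("CALL db.labels()")}

    schema = {}
    for row in rows:
        types = row.get("propertyTypes") or []
        for label in row.get("nodeLabels") or []:
            properties = schema.setdefault(label, [])
            if row.get("propertyName") is not None:
                properties.append(
                    {"property": row["propertyName"], "type": "|".join(types)}
                )
    return schema
```

Passing the query runner in as a callable keeps the grouping logic independent of the driver, so it can be exercised with canned rows.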

The node is read-only by design: every generated or supplied Cypher statement must pass a safety check that rejects write and admin clauses (CREATE, MERGE, DELETE, DETACH DELETE, SET, REMOVE, DROP, FOREACH, LOAD CSV, and mutating apoc procedures), with comments stripped before checking. Queries time out after 30 seconds. The only escape hatch is the opt-in QuestionType.EXECUTE path, gated by allow_execute, which is off by default.
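
A read-only filter of this kind could look roughly like the sketch below. It is not the node's own implementation: the regex approach and the name `is_read_only` are assumptions, and the check for mutating apoc procedures is omitted. Comments are stripped first, and quoted text is blanked so that a `//` or a keyword inside a string literal can neither hide nor trigger a match.

```python
import re

# Clause keywords the text lists as rejected (the apoc check is omitted here).
_FORBIDDEN = re.compile(
    r"\b(CREATE|MERGE|DETACH\s+DELETE|DELETE|SET|REMOVE|DROP|FOREACH|LOAD\s+CSV)\b",
    re.IGNORECASE,
)

# One left-to-right scan, so a "//" inside a string is consumed as string text
# and a quote inside a comment is consumed as comment text.
_TOKEN = re.compile(
    r"'(?:\\.|[^'\\])*'"      # single-quoted string literal
    r'|"(?:\\.|[^"\\])*"'     # double-quoted string literal
    r"|`[^`]*`"               # backtick-quoted identifier
    r"|/\*.*?\*/"             # block comment
    r"|//[^\n]*",             # line comment
    re.DOTALL,
)


def _blank(match):
    # Comments become whitespace; quoted text becomes an empty literal.
    return " " if match.group(0).startswith("/") else "''"


def is_read_only(cypher):
    """Return True when no write or admin clause keyword remains."""
    skeleton = _TOKEN.sub(_blank, cypher)
    return _FORBIDDEN.search(skeleton) is None
```

A keyword filter is deliberately conservative: it may reject an unusual but harmless query, which is the safer failure mode for a read-only node.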


Connections

| Connection | Required | Description |
|---|---|---|
| llm | yes (min 1) | LLM used to generate Cypher from natural language |

Configuration

Lanes

| Lane in | Lanes out | Description |
|---|---|---|
| questions | table, text, answers | Translate question to Cypher, execute, emit results on each connected lane |

For a normal question, results are emitted as a markdown table on table and answers, and as plain text on text. If the LLM judges the question unrelated to the graph, its text reply is forwarded in place of a query result.

Two special question types are handled on the questions lane:

  • QuestionType.DIALECT: emits {"dialect": "neo4j"} on the answers lane so SDK callers can detect they are talking to a graph database rather than a relational one.
  • QuestionType.EXECUTE: treats the question text as raw Cypher and runs it without LLM translation or the read-only safety check. Requires allow_execute: true; otherwise the request is rejected and a warning is logged. Results are capped at 25,000 rows (the query fails if the cap is exceeded). When a write returns no rows, the emitted JSON reports affected_rows, derived from the result summary counters (nodes and relationships created or deleted, properties set).
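
The row cap and the affected_rows fallback can be sketched as follows. This is an assumption-laden illustration: `execute_payload` and the `rows` key are hypothetical, the counter names mirror the neo4j driver's summary counters, and treating affected_rows as a plain sum of those counters is a guess at how the value is derived.

```python
MAX_ROWS = 25_000  # row cap stated in the text

# Counter names follow the neo4j driver's SummaryCounters attributes.
_COUNTER_KEYS = (
    "nodes_created",
    "nodes_deleted",
    "relationships_created",
    "relationships_deleted",
    "properties_set",
)


def execute_payload(rows, counters):
    """Shape the result of a raw-Cypher request.

    rows is a list of result records and counters a dict of summary counters;
    both are stand-ins for what the driver returns.
    """
    if len(rows) > MAX_ROWS:
        raise ValueError(f"query returned more than {MAX_ROWS} rows")
    if rows:
        return {"rows": rows}
    # Assumption: affected_rows is the sum of the counters named in the text.
    return {"affected_rows": sum(counters.get(key, 0) for key in _COUNTER_KEYS)}
```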

Fields

| Field | Type | Description |
|---|---|---|
| uri | string | Default "neo4j://localhost:7687". Bolt URI for the Neo4J instance. Use neo4j:// or bolt:// for plaintext, neo4j+s:// or bolt+s:// for TLS (e.g. Neo4J Aura cloud) |
| auth_method | string | Default "userpass". |
| user | string | Default "neo4j". Username to authenticate with the Neo4J instance. |
| password | string | Password to authenticate with the Neo4J instance. |
| token | string | Bearer token for token-based authentication (e.g. Neo4J Aura cloud). |
| database | string | Default "neo4j". Name of the Neo4J database to connect to. Use 'neo4j' for the default database. |
| db_description | string | Default empty. What is this graph used for? Describe its content and domain, this helps the LLM generate more accurate Cypher queries. |
| max_attempts | integer | Default 5. Maximum number of times to re-ask the LLM if EXPLAIN rejects the generated Cypher query |
| allow_execute | boolean | Default false. Permit QuestionType.EXECUTE callers to run raw Cypher without LLM translation or safety checks. Leave OFF unless a trusted application explicitly needs to issue Cypher directly. |
| profile | string | Default "default". |

The default profile sets database: neo4j. Saving the node config runs a connectivity probe (RETURN 1 against the configured database) and surfaces driver errors (wrong password, unreachable host, bad database name) as warnings before the pipeline starts.


Available tools

When connected to an agent, the node exposes three functions namespaced under the node's prefix (e.g. neo4j.get_data):

Data retrieval

| Tool | Description |
|---|---|
| get_data | Accepts a natural-language description of the graph data you want, converts it to a safe Cypher MATCH query, executes it against the Neo4J graph database, and returns the result rows. No schema lookup or Cypher knowledge required, just describe what you need. Results may be large, consider using peek or store. |
| get_schema | Returns the Neo4J graph schema: node labels with their properties and types, and relationship types with their start and end node labels. Do NOT call this preemptively, only use when get_data fails or returns unexpected results. |
| get_cypher | Accepts a natural-language description and returns the equivalent Cypher MATCH statement without executing it. Only use when the user explicitly asks to see the Cypher, for actual data retrieval, use get_data instead. |

Schema inspection

| Tool | Description |
|---|---|
| get_schema | Returns the cached graph schema: node labels with {property, type} lists and {type, start, end} relationship descriptors, plus the database name. Optional label argument filters to a single node label and its relationships. Intended for recovery when get_data fails or returns unexpected results; agents are instructed not to call it preemptively. |

Cypher generation

| Tool | Description |
|---|---|
| get_cypher | Translates a natural-language question to a Cypher MATCH statement without executing it. Returns {cypher, valid: true} on success, an error key if the generated Cypher is unsafe, or an answer text reply when the question is not a graph query. Use only when the caller explicitly needs to see the Cypher; for data retrieval use get_data instead. |
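
A caller can tell the three get_cypher outcomes apart by the keys present in the returned object. The helper below is a hypothetical sketch that assumes the result arrives as a plain dict with the cypher, valid, error, and answer keys described above.

```python
def describe_cypher_result(result):
    """Classify a get_cypher result dict by the keys it carries."""
    if result.get("valid") and "cypher" in result:
        return f"cypher: {result['cypher']}"
    if "error" in result:
        return f"rejected: {result['error']}"
    if "answer" in result:
        return f"answer: {result['answer']}"
    return "unrecognised result"
```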

Cypher generation and validation

Each generated query goes through a validate-and-retry loop:

  1. The LLM is prompted with the question, the reflected graph schema, the optional graph description, and strict instructions: only MATCH, OPTIONAL MATCH, WITH, WHERE, RETURN, ORDER BY, SKIP, and LIMIT are permitted, and a LIMIT clause must terminate the query.
  2. The generated Cypher is checked by the read-only safety filter (_is_cypher_safe).
  3. If the safety check passes, the query is validated with EXPLAIN against the live database. Any syntax error is fed back to the LLM together with the failing query, and the LLM retries up to max_attempts times (default 5, range 1-20).

As defence-in-depth, the safety filter also runs at execution time inside _run_query, so an unsafe statement is refused even if a caller bypasses the generation path.
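
The loop described above can be sketched with three injected hooks. All names here are hypothetical: `ask_llm(question, feedback)` returns a Cypher string, `is_safe(cypher)` stands in for the read-only filter, and `explain(cypher)` raises when the database rejects the query. The sketch assumes that max_attempts counts total LLM calls and that an unsafe query ends the loop with an error rather than a retry; the source does not spell out either detail.

```python
def generate_cypher(question, ask_llm, is_safe, explain, max_attempts=5):
    """Ask for Cypher, check it, and retry with the error as feedback."""
    feedback = None
    for _ in range(max_attempts):
        cypher = ask_llm(question, feedback)
        if not is_safe(cypher):
            # Assumption: unsafe output is refused outright, not retried.
            return {"error": "generated Cypher failed the read-only check"}
        try:
            explain(cypher)  # expected to raise on a syntax error
        except Exception as exc:
            # Feed the failing query and the error back into the next prompt.
            feedback = f"Query: {cypher}\nError: {exc}"
            continue
        return {"cypher": cypher, "valid": True}
    return {"error": f"no valid Cypher after {max_attempts} attempts"}
```

Injecting the hooks keeps the control flow testable without an LLM or a live database.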

Result values are serialised to plain JSON: graph nodes gain a _labels key, relationships a _type key, paths become {nodes, relationships}, and Neo4J temporal types are ISO-formatted.
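
A duck-typed sketch of that serialisation is shown below. It relies on attribute names from the neo4j driver's graph and temporal types (labels, type, nodes, relationships, iso_format) but does not import the driver; `FakeNode` and `FakeRel` are stand-ins defined only for the demonstration, and the exact output layout is an assumption.

```python
from datetime import date


def to_plain(value):
    """Convert driver-style values into JSON-friendly Python objects."""
    if hasattr(value, "nodes") and hasattr(value, "relationships"):  # path-like
        return {
            "nodes": [to_plain(n) for n in value.nodes],
            "relationships": [to_plain(r) for r in value.relationships],
        }
    if hasattr(value, "labels"):  # node-like
        plain = {k: to_plain(v) for k, v in value.items()}
        plain["_labels"] = sorted(value.labels)
        return plain
    if hasattr(value, "type") and hasattr(value, "items"):  # relationship-like
        plain = {k: to_plain(v) for k, v in value.items()}
        plain["_type"] = value.type
        return plain
    if hasattr(value, "iso_format"):  # neo4j.time values
        return value.iso_format()
    if hasattr(value, "isoformat"):  # standard-library date and time values
        return value.isoformat()
    if isinstance(value, dict):
        return {k: to_plain(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_plain(v) for v in value]
    return value


class FakeNode(dict):
    """Stand-in for a driver node: properties plus a labels attribute."""

    def __init__(self, labels, **props):
        super().__init__(props)
        self.labels = frozenset(labels)


class FakeRel(dict):
    """Stand-in for a driver relationship: properties plus a type attribute."""

    def __init__(self, rel_type, **props):
        super().__init__(props)
        self.type = rel_type


ada = to_plain(FakeNode(["Person"], name="Ada", born=date(1815, 12, 10)))
knows = to_plain(FakeRel("KNOWS", since=1833))
```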


Authentication

Username and password

Set auth_method: userpass (the default), then provide user and password. A blank username falls back to neo4j.

Bearer token

Set auth_method: token and provide token. The node passes it as Neo4J bearer auth (neo4j.bearer_auth). Use this for token-based setups such as Neo4J Aura cloud.

Connectivity and authentication are verified at pipeline start with verify_connectivity() plus an explicit RETURN 1 probe of the configured database, so a wrong database name or missing permissions fail fast rather than mid-pipeline.
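
The two authentication modes and the start-up probe can be sketched with the official driver as follows. `pick_auth` and `connect` are hypothetical helpers and the config is assumed to be a plain dict keyed by the field names above. The driver calls used (GraphDatabase.driver, basic_auth, bearer_auth, verify_connectivity, session.run) belong to the neo4j package, which is imported lazily so the credential logic can run without it.

```python
def pick_auth(config):
    """Return ("basic", user, password) or ("bearer", token) from a config dict."""
    if config.get("auth_method", "userpass") == "token":
        return ("bearer", config["token"])
    user = config.get("user") or "neo4j"  # a blank username falls back to neo4j
    return ("basic", user, config.get("password", ""))


def connect(config):
    """Open a driver, then fail fast on bad credentials or a bad database name."""
    # Lazy import: only needed when a real connection is requested.
    from neo4j import GraphDatabase, basic_auth, bearer_auth

    kind, *credentials = pick_auth(config)
    auth = bearer_auth(*credentials) if kind == "bearer" else basic_auth(*credentials)
    driver = GraphDatabase.driver(config.get("uri", "neo4j://localhost:7687"), auth=auth)
    driver.verify_connectivity()
    # Explicit probe of the configured database, mirroring the RETURN 1 check.
    with driver.session(database=config.get("database", "neo4j")) as session:
        session.run("RETURN 1").consume()
    return driver
```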


Schema

| Field | Type | Description | Default |
|---|---|---|---|
| neo4jdb.allow_execute | boolean | Allow direct query execution. Permit QuestionType.EXECUTE callers to run raw Cypher without LLM translation or safety checks. Leave OFF unless a trusted application explicitly needs to issue Cypher directly. | false |
| neo4jdb.auth_method | string | Authentication | "userpass" |
| neo4jdb.database | string | Database name. Name of the Neo4J database to connect to. Use 'neo4j' for the default database. | "neo4j" |
| neo4jdb.db_description | string | Graph description. What is this graph used for? Describe its content and domain, this helps the LLM generate more accurate Cypher queries. | "" |
| neo4jdb.max_attempts | integer | Max validation attempts. Maximum number of times to re-ask the LLM if EXPLAIN rejects the generated Cypher query | 5 |
| neo4jdb.password | string | Password. Password to authenticate with the Neo4J instance. | |
| neo4jdb.profile | string | | "default" |
| neo4jdb.token | string | Bearer token. Bearer token for token-based authentication (e.g. Neo4J Aura cloud). | |
| neo4jdb.uri | string | Connection URI. Bolt URI for the Neo4J instance. Use neo4j:// or bolt:// for plaintext, neo4j+s:// or bolt+s:// for TLS (e.g. Neo4J Aura cloud) | "neo4j://localhost:7687" |
| neo4jdb.user | string | User. Username to authenticate with the Neo4J instance. | "neo4j" |

Dependencies

  • neo4j