Skip to main content
View source

ClickHouse

View as Markdown

A RocketRide database node that answers natural-language questions against a ClickHouse database by translating them into SQL with an LLM.

What it does

Plays two roles: a pipeline node (natural-language questions arrive on a lane, results leave as a table, text, or structured answers) and a tool node (an agent calls get_data, get_schema, or get_sql directly).

Connects over ClickHouse's native TCP protocol (default port 9000) using clickhouse-sqlalchemy with the clickhouse-driver backend (clickhouse+native:// DSN). Generated SQL is validated with EXPLAIN against the live database before execution; if validation fails the error is fed back to the LLM for a corrected query, repeating up to max_attempts times.

The node is read-only by default: the natural-language path only ever runs SELECT. Raw SQL execution (QuestionType.EXECUTE) is gated behind the allow_execute toggle, which is off by default and intended only for trusted callers. This is also a query/read node: it does not expose a pipeline ingestion (insert) lane (see Ingestion).


Connections

ConnectionRequiredDescription
llmyes (min 1)LLM used to generate SQL from natural language

Available tools

When connected to an agent, the node exposes three functions:

ToolDescription
get_dataNatural language to SQL, executes it, returns rows (default 250, max 25 000)
get_schemaReturns tables, columns, types, and primary keys
get_sqlNatural language to SQL only, no execution

Only SELECT is permitted for generated queries.


Configuration

Lanes

Lane inLane outDescription
questionstableTranslate question to SQL, execute, return as table
questionstextTranslate question to SQL, execute, return as text
questionsanswersTranslate question to SQL, execute, return as answers

Fields

FieldTypeDescription
hoststringDefault "localhost". Host name or IP address of the ClickHouse server, optionally including a native-protocol port (e.g. localhost:9440). Defaults to port 9000 when none is given.
userstringDefault "default". User to connect to the ClickHouse server
passwordstringPassword to connect to the ClickHouse server
databasestringDefault "default". Name of database
tlsbooleanDefault false. Connect over TLS. Required for managed services such as ClickHouse Cloud (native TLS port 9440 is assumed when the host has no explicit port). Leave OFF for a plaintext local server on port 9000. ClickHouse-specific, MySQL/PostgreSQL nodes do not expose this.
tablestringDefault "table". Name of table
db_descriptionstringDefault empty. What is this database used for? Describe its content and purpose, this helps the LLM generate more accurate queries.
max_attemptsintegerDefault 5. Maximum number of times to re-ask the LLM if EXPLAIN rejects the generated SQL
allow_executebooleanDefault false. Permit QuestionType.EXECUTE callers to run raw SQL without LLM translation or safety checks. Leave OFF unless a trusted application explicitly needs to issue SQL directly.
profilestringDefault "default".

The node ships a single default profile that pre-sets database: default.


SQL validation

Generated SQL is validated by running EXPLAIN against the live database. If validation fails, the error is fed back to the LLM for a corrected query. This repeats up to max_attempts times before the node raises an error. The retry limit is clamped to 1-20 regardless of the value stored in config.


ClickHouse Cloud

To connect to a ClickHouse Cloud service:

  1. In the Cloud console, open your service, choose Connect, and copy the native endpoint host (e.g. abc123.us-east-1.aws.clickhouse.cloud) and the default user password.
  2. Configure the node with: host = that hostname (no port needed, TLS port 9440 is assumed), user = default, password = your service password, tls = ON.
  3. Make sure your machine's IP is allowed under the service's IP Access List (or set it to "Anywhere" for testing).

Ingestion

Unlike the MySQL/PostgreSQL nodes, this node intentionally does not expose the ingestion/input answers lane (used for pipeline inserts). This removes only that input lane, not the questions to answers output lane used for querying, which still works. The shared auto-create-table helper builds tables with an auto-increment integer primary key and no table engine, neither of which exists in ClickHouse (tables require an explicit engine such as MergeTree), so the inherited insert/auto-create path cannot work here. Create your tables in ClickHouse directly, and use this node for querying. A ClickHouse-correct ingestion path can be added later as a separate feature.


Notes

  • ClickHouse is column-oriented and has no foreign keys; the reflected schema therefore exposes columns and (best-effort) primary keys but no FK relationships.
  • The tls config field is distinct from clickhouse-driver's ?secure=true DSN parameter (which the node sets on the wire when TLS is enabled) and from the field-level "secure": true attribute on the password field, which only marks the value as a masked secret.
  • user, password, and database are URL-encoded when building the connection string, so reserved characters (@, /, #, :) in credentials are safe.
  • Bracketed IPv6 literals (e.g. [::1]) are supported in host; a port is only detected when a : follows the closing ].

Schema

FieldTypeDescriptionDefault
clickhouse.allow_executebooleanAllow direct query execution
Permit QuestionType.EXECUTE callers to run raw SQL without LLM translation or safety checks. Leave OFF unless a trusted application explicitly needs to issue SQL directly.
false
clickhouse.databasestringDatabase name
Name of database
"default"
clickhouse.db_descriptionstringDatabase description
What is this database used for? Describe its content and purpose, this helps the LLM generate more accurate queries.
""
clickhouse.hoststringClickHouse host
Host name or IP address of the ClickHouse server, optionally including a native-protocol port (e.g. localhost:9440). Defaults to port 9000 when none is given.
"localhost"
clickhouse.max_attemptsintegerMax validation attempts
Maximum number of times to re-ask the LLM if EXPLAIN rejects the generated SQL
5
clickhouse.passwordstringPassword
Password to connect to the ClickHouse server
clickhouse.profilestring"default"
clickhouse.tablestringTable name
Name of table
"table"
clickhouse.tlsbooleanUse TLS
Connect over TLS. Required for managed services such as ClickHouse Cloud (native TLS port 9440 is assumed when the host has no explicit port). Leave OFF for a plaintext local server on port 9000. ClickHouse-specific, MySQL/PostgreSQL nodes do not expose this.
false
clickhouse.userstringUser
User to connect to the ClickHouse server
"default"

Dependencies

  • clickhouse-sqlalchemy ==0.3.2
  • clickhouse-driver ==0.2.9