MySQL

View as Markdown

A RocketRide database node that answers natural-language questions against a MySQL database and inserts structured pipeline data into a MySQL table.

What it does

The node operates in two roles. As a pipeline node, it receives natural-language questions on the questions lane, uses a connected LLM to translate them into SQL, executes the query, and emits results on the table, text, and answers output lanes; it also receives structured data on the answers input lane and inserts rows into the configured table. As a tool node, it exposes get_data, get_schema, and get_sql to an agent so the agent can query the database by describing the data it wants.

Connects via SQLAlchemy with the pymysql driver (mysql+pymysql:// DSN; user, password, and database name are URL-encoded so reserved characters don't break the connection string). The connection pool allows up to 30 concurrent connections (pool_size=10, max_overflow=20). The full database schema (tables, columns, types, primary keys, and foreign keys) is reflected once at startup and supplied to the LLM as context for every query.

Only SELECT statements are permitted for queries; all generated SQL passes a whitelist safety check before execution. Inserts go through the answers lane, not through SQL. Saving the node config probes the server with SELECT 1 (5-second connect timeout) and surfaces the driver error verbatim if the connection fails.

Configuration

Lanes

Lane in	Lanes out	Description
`questions`	`table`, `text`, `answers`	Translate question to SQL, execute it. Results go to `text` as a string, to `table` as a markdown table (valid queries only), and to `answers` as a markdown table (or the LLM's text response when the question is not a database query).
`answers`	(none)	Parse structured JSON data and insert it into the configured table

Two special question types are handled on the questions lane:

QuestionType.DIALECT: emits {"dialect": "mysql"} on the answers lane so SDK callers can branch on the underlying engine.
QuestionType.EXECUTE: runs the question text as raw SQL, bypassing LLM translation and the safety check. Disabled unless allow_execute is on; see the "Direct SQL execution" section below.

Fields

Field	Type	Description
`host`	string	Default "localhost". Host name or IP address of the MySQL server
`user`	string	Default "root". User to connect to the MySQL server
`password`	string	Password to connect to the MySQL server
`database`	string	Default "database". Name of database
`table`	string	Default "table". Name of table
`db_description`	string	Default empty. What is this database used for? Describe its content and purpose, this helps the LLM generate more accurate queries.
`max_attempts`	integer	Default 5. Maximum number of times to re-ask the LLM if EXPLAIN rejects the generated SQL
`allow_execute`	boolean	Default false. Permit QuestionType.EXECUTE callers to run raw SQL without LLM translation or safety checks. Leave OFF unless a trusted application explicitly needs to issue SQL directly.
`profile`	string	Default "default".

The node has a single default profile containing all fields above.

Connections

Connection	Required	Description
`llm`	yes (min 1)	LLM used to craft SQL queries from natural-language questions

Available tools

When connected to an agent, the node exposes three functions:

Tool	Description
`get_data`	Natural language to SQL `SELECT`, executed; returns result rows plus the generated SQL. `limit` defaults to 250 rows, max 25,000.
`get_schema`	Returns the database schema: tables, columns, types, primary keys, and foreign keys. Pass a `table` name for a single table, or omit it for the full schema.
`get_sql`	Natural language to SQL only: returns the generated `SELECT` statement without executing it

SQL generation and validation

Generated SQL goes through two gates before execution:

Safety whitelist: only SELECT statements (optionally prefixed by EXPLAIN) are allowed. Comments are stripped first so embedded keywords can't fool the check, each statement in a multi-statement string is checked separately, SELECT ... INTO OUTFILE / INTO DUMPFILE is blocked, and WITH (CTE) is deliberately rejected because CTEs can wrap mutations.
EXPLAIN validation: the query is validated by running EXPLAIN against the live database without executing it. If EXPLAIN rejects the query, the rejected SQL and the database error are fed back to the LLM for a corrected attempt. This repeats up to max_attempts times (default 5) before the last result is returned as-is.

If the LLM decides the question is not a database query at all, it answers in plain text instead, and that text is returned on the text and answers lanes (or in the tool result's answer field).

Inserting data

Rows arriving on the answers lane are inserted into the configured table:

The table is auto-created on first insert if it doesn't exist, with column types inferred from the data (int, float, datetime, and text detection; short strings become VARCHAR(255), long ones TEXT) and an auto-increment id primary key prepended. A startup warning is logged when the table is missing.
Incoming keys are matched to schema columns case-insensitively (UserName maps to username); schema columns absent from the data are inserted as NULL, and keys not in the schema are dropped.
Lists and dicts are serialised to JSON strings; booleans are stored as 0/1.

Direct SQL execution

When allow_execute is true, callers sending QuestionType.EXECUTE can run raw SQL (reads or writes) with no LLM translation and no safety whitelist. Writes auto-commit; results are emitted on the text, table, and answers lanes, with the answers payload containing rows and affected_rows. SELECT results are capped at 25,000 rows (configurable via the max_execute_rows config key); exceeding the cap fails the query rather than truncating it.

This is off by default. Leave it off unless a trusted application explicitly needs to issue SQL directly: with it enabled, any pipeline caller that can reach the node can run arbitrary statements against the database.

Schema

Field	Type	Description	Default
`mysql.allow_execute`	`boolean`	Allow direct query execution Permit QuestionType.EXECUTE callers to run raw SQL without LLM translation or safety checks. Leave OFF unless a trusted application explicitly needs to issue SQL directly.	`false`
`mysql.database`	`string`	Database name Name of database	`"database"`
`mysql.db_description`	`string`	Database description What is this database used for? Describe its content and purpose, this helps the LLM generate more accurate queries.	`""`
`mysql.host`	`string`	MySQL host Host name or IP address of the MySQL server	`"localhost"`
`mysql.max_attempts`	`integer`	Max validation attempts Maximum number of times to re-ask the LLM if EXPLAIN rejects the generated SQL	`5`
`mysql.password`	`string`	Password Password to connect to the MySQL server
`mysql.profile`	`string`		`"default"`
`mysql.table`	`string`	Table name Name of table	`"table"`
`mysql.user`	`string`	User User to connect to the MySQL server	`"root"`

Dependencies

pymysql >=1.2.0
cryptography

What it does​

Configuration​

Lanes​

Fields​

Connections​

Available tools​

SQL generation and validation​

Inserting data​

Direct SQL execution​

Schema​

Dependencies​