PostgreSQL (pgvector)

A RocketRide store node that keeps document embeddings in PostgreSQL with the pgvector extension and retrieves them by semantic or keyword search.

What it does

Stores embedded document chunks in a regular PostgreSQL table with a pgvector vector column, then serves retrieval queries against it. Use this when you want vector storage inside an existing PostgreSQL database rather than a dedicated vector database.

The node uses psycopg2 and the pgvector Python adapter (register_vector). The pgvector extension must already be installed in the target database; the node verifies this at config time by probing SELECT NULL::vector.

Key behavior to know:

  • The table is created automatically (CREATE TABLE IF NOT EXISTS) on first ingest, with the vector dimension taken from the incoming embeddings.
  • Re-ingesting a document replaces it: before inserting chunks, all existing rows with the same objectId are deleted, so updates do not accumulate duplicates.
  • Documents must be embedded upstream. Semantic search raises an error if the question carries no embedding, so bind an embedding node before this one.
  • A hard minimum similarity floor of 0.20 is applied to semantic results in addition to the configurable retrieval score; matches below it are always dropped.
  • Soft-deleted rows (isDeleted = true) are excluded from search results by default.
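
The replace-on-re-ingest behavior above can be sketched roughly as follows. This is a minimal illustration, not the node's actual code: the helper names replace_document and to_vector_literal are hypothetical, the cursor is any DB-API cursor (for example one from psycopg2), and the table name is assumed to have been validated already. The embedding is passed as a text literal with a ::vector cast so the sketch does not depend on an adapter; the real node may pass embeddings differently.

```python
def to_vector_literal(embedding):
    """Format a sequence of numbers as a pgvector text literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"


def replace_document(cur, table, object_id, chunks):
    """Delete every row for object_id, then insert the new chunks.

    cur    -- an open DB-API cursor (hypothetical caller-supplied object)
    table  -- table name, assumed to be validated before interpolation
    chunks -- iterable of (content, embedding) pairs
    """
    # Step 1: drop all rows that belong to the previous version of the document.
    cur.execute(f"DELETE FROM {table} WHERE objectId = %s", (object_id,))

    # Step 2: insert the fresh chunks. Only three columns are filled here to
    # keep the sketch short; the real table has more.
    insert_sql = (
        f"INSERT INTO {table} (content, objectId, embedding) "
        "VALUES (%s, %s, %s::vector)"
    )
    for content, embedding in chunks:
        cur.execute(insert_sql, (content, object_id, to_vector_literal(embedding)))
```

Running both steps inside one transaction (commit only after the last insert) keeps readers from seeing a half-replaced document.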

Configuration

Lanes

| Lane in | Lane out | Description |
| --- | --- | --- |
| documents | - | Ingest pre-embedded documents into the table |
| questions | documents | Return matching documents |
| questions | answers | Return matching documents as an answer |
| questions | questions | Enrich the question with matching documents for downstream nodes |

Fields

| Field | Type / Default | Description |
| --- | --- | --- |
| Host | string | Host name or IP address of the PostgreSQL server |
| Port | number · 5432 | Port number of the PostgreSQL server |
| User | string · postgres | User to connect to the PostgreSQL server |
| Password | string (secure) | Password to connect to the PostgreSQL server |
| Database | string · postgres | Name of the database |
| Table | string · rocketride | Name of the table to store vectors |
| Retrieval Score | number · 0.5 | Minimum similarity threshold for returned matches |
| Similarity Metric | enum · cosine | cosine, l2, or inner_product |

Table name rules

The table name must be a valid unquoted PostgreSQL identifier: start with a letter or underscore, contain only letters, digits, and underscores, and be at most 63 characters. Anything else (spaces, dashes, dots, quotes) is rejected at config validation and at startup. This is enforced deliberately, because the table name is interpolated into SQL.
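
As an illustration of the rule above, a check along these lines would accept and reject the same names. It is a sketch only: the function name is_valid_table_name is hypothetical, and "letters" is assumed to mean ASCII letters.

```python
import re

# One leading letter or underscore, then up to 62 more letters, digits,
# or underscores, for a maximum total length of 63 characters.
_TABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,62}")


def is_valid_table_name(name):
    """Return True if name is a plain unquoted identifier of at most 63 characters."""
    return isinstance(name, str) and _TABLE_NAME.fullmatch(name) is not None
```

Using fullmatch rather than match with a trailing `$` avoids accepting a name that ends in a newline.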

Similarity metrics and scoring

The metric selects the pgvector distance operator and how raw distance is converted to a 0–1-style score:

| Metric | Operator | Score formula |
| --- | --- | --- |
| cosine | <=> | 1 - distance |
| l2 | <-> | 1 / (1 + distance) |
| inner_product | <#> | -distance |

The metric must match how the table was populated. Switching metrics on an existing table changes ranking semantics without re-embedding anything.
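
The conversions in the table, combined with the 0.20 floor and the Retrieval Score setting, can be sketched as below. The names OPERATORS, to_score, and keep_match are hypothetical, and treating both thresholds as inclusive (a score exactly at the threshold is kept) is an assumption. Note that pgvector's <#> operator returns the negative inner product, which is why negating the distance recovers the inner product itself.

```python
# pgvector distance operator for each supported metric.
OPERATORS = {"cosine": "<=>", "l2": "<->", "inner_product": "<#>"}

# Hard floor applied to semantic results regardless of configuration.
MIN_SIMILARITY_FLOOR = 0.20


def to_score(metric, distance):
    """Convert a raw pgvector distance into a similarity score (higher is better)."""
    if metric == "cosine":
        return 1.0 - distance
    if metric == "l2":
        return 1.0 / (1.0 + distance)
    if metric == "inner_product":
        return -distance
    raise ValueError(f"unknown metric: {metric!r}")


def keep_match(metric, distance, retrieval_score=0.5):
    """Return True if a match passes both the configured score and the hard floor."""
    threshold = max(retrieval_score, MIN_SIMILARITY_FLOOR)
    return to_score(metric, distance) >= threshold
```

For example, with cosine and a Retrieval Score of 0.0, a match at distance 0.9 scores about 0.1 and is still dropped because of the floor.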


Profiles

| Profile | Description |
| --- | --- |
| Local (default) | Your own PostgreSQL server |

Table schema

The auto-created table has these columns:

id (bigserial primary key), content, objectId, nodeId, parent, permissionId, isDeleted, chunkId, isTable, tableId, vectorSize, modelName, and embedding vector(N) where N is the embedding dimension of the first ingested batch.
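
To make the shape of the table concrete, here is a rough sketch of a statement that would create it. Only the id and embedding column types come from the description above, and isDeleted is taken to be boolean because search compares it to true; every other column type is a guess, and the helper name build_create_table is hypothetical. The column names are left unquoted, so PostgreSQL would fold them to lowercase; whether the node quotes them is not stated.

```python
def build_create_table(table, dim):
    """Build a CREATE TABLE IF NOT EXISTS statement for a given vector dimension.

    table -- table name, assumed to be validated before interpolation
    dim   -- embedding dimension N, taken from the first ingested batch
    """
    if not isinstance(dim, int) or dim <= 0:
        raise ValueError("dim must be a positive integer")
    return (
        f"CREATE TABLE IF NOT EXISTS {table} (\n"
        "    id bigserial PRIMARY KEY,\n"
        "    content text,\n"           # assumed type
        "    objectId text,\n"          # assumed type
        "    nodeId text,\n"            # assumed type
        "    parent text,\n"            # assumed type
        "    permissionId integer,\n"   # assumed type
        "    isDeleted boolean,\n"
        "    chunkId integer,\n"        # assumed type
        "    isTable boolean,\n"        # assumed type
        "    tableId integer,\n"        # assumed type
        "    vectorSize integer,\n"     # assumed type
        "    modelName text,\n"         # assumed type
        f"    embedding vector({dim})\n"
        ")"
    )
```

Because the dimension is fixed when the table is first created, embeddings of a different size cannot be stored in the same table later.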


Requirements

The pgvector extension must be installed in the target PostgreSQL database before connecting (CREATE EXTENSION vector;). Config validation connects with a 3-second timeout, runs SELECT 1, and casts NULL::vector. A clear provider error is surfaced if the extension is missing or the connection fails.
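
The validation sequence described above might look something like this. It is a sketch under stated assumptions, not the node's implementation: the function name probe_pgvector and the error strings are hypothetical, and the connect argument is expected to be a callable with the same keyword interface as psycopg2.connect, which accepts connect_timeout in seconds.

```python
def probe_pgvector(connect, **params):
    """Return None if the database is reachable and has pgvector, else an error string.

    connect -- a callable such as psycopg2.connect
    params  -- connection keywords, e.g. host, port, user, password, dbname
    """
    # Step 1: connect with a short timeout so a bad host fails fast.
    try:
        conn = connect(connect_timeout=3, **params)
    except Exception as exc:
        return f"connection failed: {exc}"

    try:
        cur = conn.cursor()
        # Step 2: confirm the server answers a trivial query.
        cur.execute("SELECT 1")
        # Step 3: the cast fails if the vector type is not installed.
        try:
            cur.execute("SELECT NULL::vector")
        except Exception as exc:
            return f"pgvector extension missing: {exc}"
    except Exception as exc:
        return f"connection check failed: {exc}"
    finally:
        conn.close()
    return None
```

With psycopg2 installed, a call would look like probe_pgvector(psycopg2.connect, host="localhost", port=5432, user="postgres", password="...", dbname="postgres").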


Schema

| Field | Type | Description | Default |
| --- | --- | --- | --- |
| postgres.profile | string | Type of PostgreSQL host: Connect to... | "local" |
| postgres.provider | string | const: "postgres" | |
| vector.collection | string | Table: Name of the table to store vectors. | "rocketride" |
| vector.local.database | string | Database: Name of the database | "postgres" |
| vector.local.host | | Host: Host name or IP address of the PostgreSQL server | "your-postgres-host.example.com" |
| vector.local.password | string | Password: Password to connect to the PostgreSQL server | |
| vector.local.port | | Port: Port number of the PostgreSQL server | 5432 |
| vector.local.user | string | User: User to connect to the PostgreSQL server | "postgres" |
| vector.similarity | string | Similarity Metric: The similarity metric to use for vector search | "cosine" |

Dependencies

  • psycopg2-binary
  • pgvector