Pinecone

A RocketRide vector store node that stores and retrieves embedded documents in a Pinecone index, with agent-callable search, upsert, and delete tools.

What it does

Stores pre-embedded document chunks in a Pinecone index and retrieves them by semantic (vector) similarity or keyword search. Documents must pass through an embedding node before reaching this node; chunks without an embedding are rejected at write time.

Uses the Pinecone gRPC SDK (PineconeGRPC) for all data-plane operations and the HTTP client during configuration validation. What RocketRide calls a collection maps to a Pinecone index (Pinecone's own "collections" feature is not used).

The index is created automatically on first write if it does not yet exist, with the vector dimension taken from the incoming embeddings. Re-ingesting documents whose objectId already exists removes the old chunks before writing the new ones, making writes effectively upserts. Vectors are upserted in batches of 50 (well below Pinecone's 100-vector bulk limit) with a 32 MiB payload cap per batch.
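
The batching rule above can be sketched in plain Python. This is a minimal illustration, not the node's actual code: `batch_records` and `estimate_size` are hypothetical names, and measuring payload size as JSON-encoded length is an assumption.

```python
from __future__ import annotations

import json
from typing import Iterable, Iterator

MAX_VECTORS_PER_BATCH = 50            # batch size stated in the text above
MAX_BATCH_BYTES = 32 * 1024 * 1024    # 32 MiB payload cap stated in the text above


def estimate_size(record: dict) -> int:
    """Rough payload size: JSON-encoded byte length (assumed measure)."""
    return len(json.dumps(record).encode("utf-8"))


def batch_records(records: Iterable[dict],
                  max_count: int = MAX_VECTORS_PER_BATCH,
                  max_bytes: int = MAX_BATCH_BYTES) -> Iterator[list[dict]]:
    """Yield batches that respect both the count limit and the byte cap."""
    batch: list[dict] = []
    batch_bytes = 0
    for record in records:
        size = estimate_size(record)
        # Flush first if adding this record would break either limit.
        if batch and (len(batch) >= max_count or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(record)
        batch_bytes += size
    if batch:
        yield batch
```

With 120 small records this yields batches of 50, 50, and 20. A single record larger than the cap still goes out alone in its own batch; how the real node treats that case is not stated in the source.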

The node carries classType: ["store", "tool"] and the invoke capability. In addition to its data lanes it exposes vector-DB tools to agent nodes in the same pipeline.


Configuration

Lanes

| Lane in | Lane out | Description |
| --- | --- | --- |
| documents | (none) | Ingest pre-embedded documents into the index |
| questions | documents | Run a search and return matching documents |
| questions | answers | Run a search and return matching documents as an answer |
| questions | questions | Enrich the question with matching documents for downstream nodes |

Semantic search requires the incoming question to carry an embedding (bind an embedding node upstream) and returns up to the filter's limit (default 25) top matches scored above the configured threshold. Non-zero offsets are not supported for semantic search and raise an error. Keyword search adds a $contains metadata filter on document content.
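
The semantic search rules above (threshold, limit, no offsets) can be sketched as a post-processing step over scored matches. This is an illustration only: `select_semantic_matches` is a hypothetical helper, the match shape (`{"id", "score"}`) is assumed, and treating "above the threshold" as a strict comparison is also an assumption.

```python
from __future__ import annotations

DEFAULT_LIMIT = 25  # default filter limit stated in the text above


def select_semantic_matches(matches: list[dict], threshold: float,
                            limit: int = DEFAULT_LIMIT, offset: int = 0) -> list[dict]:
    """Keep the top `limit` matches whose score exceeds `threshold`."""
    if offset != 0:
        # The text above says non-zero offsets raise an error for semantic search.
        raise ValueError("semantic search does not support a non-zero offset")
    above = [m for m in matches if m["score"] > threshold]  # strict '>' is assumed
    above.sort(key=lambda m: m["score"], reverse=True)      # best score first
    return above[:limit]
```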

Fields

| Field | Type | Description |
| --- | --- | --- |
| collection | string | Default "rocketride". Enter the name of the collection. Accepted are: Lower case, alphanumeric characters, hyphens |
| serverName | string | Default "pinecone". Namespace for agent-facing tool names, e.g. 'pinecone' exposes tools as pinecone.search / pinecone.upsert / pinecone.delete. Change this when running multiple Pinecone nodes in the same pipeline so their tool names do not collide. |
| profile | string | Default "pod-based". Connect to... |
| provider | string | Default "pinecone". |

Collection naming rules

Configuration validation checks all of the following and reports violations together:

  • lowercase letters, numbers, and hyphens only
  • no leading or trailing hyphen
  • no consecutive hyphens (--)
  • maximum 45 characters
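
The rules listed above can be sketched as a validator that collects every violation rather than stopping at the first one. This is a minimal sketch: `validate_collection_name` is a hypothetical name, the messages are illustrative, and empty names are not handled because the rules above do not mention them.

```python
from __future__ import annotations

import re

MAX_NAME_LENGTH = 45  # maximum length from the list above


def validate_collection_name(name: str) -> list[str]:
    """Return all rule violations for `name`; an empty list means it is valid."""
    violations: list[str] = []
    if not re.fullmatch(r"[a-z0-9-]*", name):
        violations.append("lowercase letters, numbers, and hyphens only")
    if name.startswith("-") or name.endswith("-"):
        violations.append("no leading or trailing hyphen")
    if "--" in name:
        violations.append("no consecutive hyphens (--)")
    if len(name) > MAX_NAME_LENGTH:
        violations.append(f"maximum {MAX_NAME_LENGTH} characters")
    return violations
```

For example, `validate_collection_name("My--index-")` reports three violations at once (character set, trailing hyphen, consecutive hyphens), while `"rocketride"` reports none.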

Validation also authenticates with the API key, and if the index already exists it checks that its deployment type matches the selected profile: a serverless index cannot be used with the pod-based profile and vice versa.


Profiles

| Profile | Default | Description |
| --- | --- | --- |
| serverless-dense | yes | Serverless deployment. New indexes are created in AWS us-east-1. |
| pod-based | no | Pod-based deployment. New indexes are created in us-east1-gcp with 1 x p1.x1 pod. |

The default profile is serverless-dense.
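
As a rough illustration of how the profiles relate to the deployment-type check performed during validation, the sketch below encodes the table as a plain dictionary and compares it with the type of an existing index. The dictionary layout, the `"serverless"` / `"pod"` labels, and `check_deployment` are assumptions made for this example, not the node's real data structures.

```python
from __future__ import annotations

# Settings copied from the profile table; the key names are illustrative.
PROFILE_SETTINGS = {
    "serverless-dense": {"deployment": "serverless", "cloud": "aws", "region": "us-east-1"},
    "pod-based": {"deployment": "pod", "environment": "us-east1-gcp",
                  "pod_type": "p1.x1", "pods": 1},
}


def check_deployment(profile: str, existing_deployment: str | None) -> str | None:
    """Return an error message on a mismatch, or None when the profile is usable.

    `existing_deployment` is None when the index does not exist yet.
    """
    expected = PROFILE_SETTINGS[profile]["deployment"]
    if existing_deployment is None:
        return None  # nothing to compare; the index is created on first write
    if existing_deployment != expected:
        return (f"index is {existing_deployment} but profile "
                f"'{profile}' expects {expected}")
    return None
```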


Agent tools

When an agent node is wired to this node, the following tools become callable, namespaced by the configured serverName (default pinecone):

| Tool | Description |
| --- | --- |
| pinecone.search | Semantic similarity search. Takes query text, optional top_k (default 10), and an optional metadata filter object. Returns matching documents with content, metadata, and score. |

Write

| Tool | Description |
| --- | --- |
| pinecone.upsert | Add or update documents. Each document requires content and an object_id (used for deduplication). Accepts optional metadata, or a pre-computed embedding plus embedding_model pair to skip automatic embedding computation. |

Delete

| Tool | Description |
| --- | --- |
| pinecone.delete | Delete documents by a list of object_ids. |

The tool path runs on the control plane and does not pass through the pipeline's embedding lanes. The node wires its own query embedder from the pipeline's embedding configuration. pinecone.search requires that embedder; if none is configured, the call returns {"success": false, "error": ...}. pinecone.upsert computes embeddings the same way, or accepts pre-computed vectors per document to skip auto-computation.
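
To make the tool contract above concrete, here is a small sketch of what an agent-side caller might check before and after a tool call. All three helpers are hypothetical. The document field names (`content`, `object_id`, `embedding`, `embedding_model`) come from the tool descriptions; requiring the embedding pair to be supplied together is an assumption drawn from the word "pair", and only the failure shape `{"success": false, "error": ...}` is taken from the source.

```python
from __future__ import annotations


def tool_name(server_name: str, action: str) -> str:
    """Build the namespaced tool name, e.g. 'pinecone.search'."""
    return f"{server_name}.{action}"


def validate_upsert_document(doc: dict) -> list[str]:
    """Return problems with one upsert document; an empty list means it looks valid."""
    problems: list[str] = []
    if not doc.get("content"):
        problems.append("content is required")
    if not doc.get("object_id"):
        problems.append("object_id is required")
    # Assumed rule: a pre-computed vector only makes sense with its model name.
    if ("embedding" in doc) != ("embedding_model" in doc):
        problems.append("embedding and embedding_model must be supplied together")
    return problems


def unwrap_tool_result(result: dict) -> dict:
    """Raise on the documented failure shape; otherwise pass the result through."""
    if result.get("success") is False:
        raise RuntimeError(str(result.get("error", "tool call failed")))
    return result
```

Changing `serverName` to, say, `pinecone-archive` would make the same helper produce `pinecone-archive.search`, which is how two Pinecone nodes in one pipeline avoid colliding tool names.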


Behavior notes

  • Soft delete: documents can be marked deleted (isDeleted: true) rather than physically removed. They are excluded from all search results unless the filter explicitly requests deleted documents. Documents that reappear are automatically marked active again.
  • Metadata updates: applied per record in batches of 1000, querying only records whose metadata still differs from the target state so the loop converges safely and does not re-process already-updated records.
  • Rendering: rehydrates a complete document from its chunks in chunkId order, streaming the joined text to the output callback in ranges of 32 MiB.
  • Pagination: Pinecone has no native query offset. Path listing emulates pagination by over-fetching (offset + limit) records and slicing client-side.
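
Two of the notes above, emulated pagination and chunk-ordered rendering, can be sketched in a few lines. This is an illustration under stated assumptions: `list_page`, `render_document`, and the `fetch_top` callback are hypothetical, the chunk shape (`chunkId`, `content`) is assumed, and ranges are sliced by characters here for simplicity rather than by bytes.

```python
from __future__ import annotations

from typing import Callable

RANGE_SIZE = 32 * 1024 * 1024  # 32 MiB range size from the rendering note above


def list_page(fetch_top: Callable[[int], list[dict]], offset: int, limit: int) -> list[dict]:
    """Emulate an offset by over-fetching offset + limit records and slicing."""
    records = fetch_top(offset + limit)
    return records[offset:offset + limit]


def render_document(chunks: list[dict], emit: Callable[[str], None],
                    range_size: int = RANGE_SIZE) -> None:
    """Join chunks in chunkId order and stream the text to `emit` in ranges."""
    ordered = sorted(chunks, key=lambda c: c["chunkId"])
    text = "".join(c["content"] for c in ordered)
    for start in range(0, len(text), range_size):
        emit(text[start:start + range_size])
```

The cost of the over-fetch grows with the offset, since every page re-reads all records before it. That trade-off follows directly from the absence of a native query offset.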

Authentication

Set apikey to your Pinecone API key. The key is used during both config validation (HTTP client) and runtime data operations (gRPC client). There are no additional auth modes: Pinecone authenticates with API keys only.


Schema

| Field | Type | Description | Default |
| --- | --- | --- | --- |
| pinecone.collection | string | Collection: Enter the name of the collection. Accepted are: Lower case, alphanumeric characters, hyphens | "rocketride" |
| pinecone.profile | string | Type of Pinecone Connection: Connect to... | "pod-based" |
| pinecone.provider | string | const: "pinecone" | |
| pinecone.serverName | string | Tool Server Name: Namespace for agent-facing tool names, e.g. 'pinecone' exposes tools as pinecone.search / pinecone.upsert / pinecone.delete. Change this when running multiple Pinecone nodes in the same pipeline so their tool names do not collide. | "pinecone" |

Dependencies

  • pinecone
  • pinecone-plugin-assistant
  • pinecone-plugin-interface