# embedding_openai

A RocketRide embedding node that converts document chunks and search queries into vectors using OpenAI's embedding API.

## What it does

Documents arriving on the `documents` lane get an `embedding` vector and the `embedding_model` name attached to each chunk, ready for ingestion into a vector store. Questions arriving on the `questions` lane are embedded with the same model so they can be matched against a stored index.

Uses **langchain-openai** (`OpenAIEmbeddings`) under the hood. Document chunks are embedded in a single batched `embed_documents` call per write; questions are embedded one at a time with `embed_query`. An empty document list is a no-op.
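The batching behaviour described above can be sketched as follows. This is a minimal illustration, not the node's actual source: `StubEmbeddings`, `Chunk`, and `attach_embeddings` are hypothetical names, and the stub stands in for `OpenAIEmbeddings` so the example runs without an API key. Only the method names `embed_documents` and `embed_query` come from langchain-openai.

```python
from dataclasses import dataclass
from typing import List, Optional


class StubEmbeddings:
    """Hypothetical offline stand-in exposing the two methods used from OpenAIEmbeddings."""

    def __init__(self) -> None:
        self.document_calls = 0  # counts batched calls, to show one call per write

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        self.document_calls += 1
        # Fake 3-dimensional vectors; a real model returns hundreds of floats.
        return [[float(len(t)), 0.0, 1.0] for t in texts]

    def embed_query(self, text: str) -> List[float]:
        return [float(len(text)), 0.0, 1.0]


@dataclass
class Chunk:
    """Hypothetical chunk shape; the node's real document object may differ."""
    text: str
    embedding: Optional[List[float]] = None
    embedding_model: Optional[str] = None


def attach_embeddings(chunks: List[Chunk], embedder, model_name: str) -> List[Chunk]:
    if not chunks:
        return chunks  # empty document list is a no-op: no call is made
    # One batched call for the whole write, rather than one call per chunk.
    vectors = embedder.embed_documents([c.text for c in chunks])
    for chunk, vector in zip(chunks, vectors):
        chunk.embedding = vector
        chunk.embedding_model = model_name
    return chunks


embedder = StubEmbeddings()
chunks = attach_embeddings([Chunk("hello"), Chunk("hi")], embedder, "text-embedding-3-small")
question_vector = embedder.embed_query("what is a vector?")  # questions go one at a time
```

To try this against the real API, the stub could presumably be swapped for `OpenAIEmbeddings(model="text-embedding-3-small")` from `langchain_openai`, which exposes the same two methods.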

At pipeline startup the node makes one small probe call (embedding the string `"dummy"`) to discover the model's vector size, since the API does not report it directly. Expect one extra tiny request each time the pipeline starts. The maximum token count per request is read from the model's context length (`embedding_ctx_length`).
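The startup probe amounts to embedding a throwaway string and measuring the result. A minimal sketch, assuming an embedder with an `embed_query` method; `StubEmbeddings` and `probe_vector_size` are hypothetical names, and the stub is configured with 1536 dimensions, the default output size of `text-embedding-3-small`:

```python
from typing import List


class StubEmbeddings:
    """Hypothetical offline stand-in; a real run would use OpenAIEmbeddings."""

    def __init__(self, dimensions: int) -> None:
        self.dimensions = dimensions

    def embed_query(self, text: str) -> List[float]:
        return [0.0] * self.dimensions


def probe_vector_size(embedder) -> int:
    """Embed a throwaway string once and return the length of the vector."""
    return len(embedder.embed_query("dummy"))


vector_size = probe_vector_size(StubEmbeddings(dimensions=1536))
```

Measuring the returned vector avoids hard-coding a size per model, at the cost of one request at startup.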

Requires an OpenAI API key.

---

## Configuration

### Lanes

| Lane        | Direction | Description                                                                      |
| ----------- | --------- | -------------------------------------------------------------------------------- |
| `documents` | in / out  | Embed document text; attach vector to each chunk for vector store ingestion      |
| `questions` | in / out  | Embed a query string; attach vector for similarity search against a stored index |

### Fields

| Field | Type | Description |
|---|---|---|
| `model` | string | OpenAI model to use for embedding |
| `profile` | string | OpenAI embedding model profile. Default: `"text-embedding-3-small"` |

Each profile resolves to a `model` name and token limit that the wrapper passes to `OpenAIEmbeddings`.

### Profiles

| Profile                | Model                    | Notes                                         |
| ---------------------- | ------------------------ | --------------------------------------------- |
| Text Small _(default)_ | `text-embedding-3-small` | Efficient, good general-purpose performance   |
| Text Large             | `text-embedding-3-large` | Higher accuracy, larger vector representation |
| Text Ada               | `text-embedding-ada-002` | Legacy model, superseded by the two above     |

All three models accept up to 8,191 tokens per input.
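Profile resolution can be pictured as a lookup from profile to model name and token limit. The sketch below is illustrative only: `ProfileSettings`, `PROFILES`, and `resolve_profile` are hypothetical names, and the profile keys are assumed to equal the model names (only the default, `"text-embedding-3-small"`, is confirmed by the schema).

```python
from typing import Dict, NamedTuple, Optional


class ProfileSettings(NamedTuple):
    model: str
    max_tokens: int


# Assumption: profile keys match the model names; the node's real keys may differ.
PROFILES: Dict[str, ProfileSettings] = {
    "text-embedding-3-small": ProfileSettings("text-embedding-3-small", 8191),
    "text-embedding-3-large": ProfileSettings("text-embedding-3-large", 8191),
    "text-embedding-ada-002": ProfileSettings("text-embedding-ada-002", 8191),
}
DEFAULT_PROFILE = "text-embedding-3-small"


def resolve_profile(profile: Optional[str] = None) -> ProfileSettings:
    """Return the model name and token limit for a profile, falling back to the default."""
    key = profile or DEFAULT_PROFILE
    try:
        return PROFILES[key]
    except KeyError:
        raise ValueError(f"Unknown embedding profile: {key!r}") from None


settings = resolve_profile()
```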

---

## Authentication

Set the `apikey` field to your OpenAI API key. The key is resolved per profile, so if you switch profiles you may supply a different key for each. There is no support for organization-scoped keys or Azure OpenAI endpoints in this node.

---

<!-- ROCKETRIDE:GENERATED:PARAMS START -->
<!-- Generated by nodes:docs-generate. Do not edit by hand. -->

## Schema

| Field | Type | Description | Default |
|---|---|---|---|
| `openai_embed.model` | `string` | **Model name**<br/>OpenAI model to use for embedding |  |
| `openai_embed.profile` | `string` | **Model**<br/>OpenAI embedding model | `"text-embedding-3-small"` |

## Dependencies

- `openai`
- `langchain-openai`

## Source

[<svg viewBox="0 0 16 16" width="15" height="15" fill="currentColor" aria-hidden="true" style="vertical-align:-0.15em;margin-right:0.35em"><path d="M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0016 8c0-4.42-3.58-8-8-8z"/></svg> View source](https://github.com/rocketride-org/rocketride-server/tree/develop/nodes/src/nodes/embedding_openai)
<!-- ROCKETRIDE:GENERATED:PARAMS END -->
