# summarization

A RocketRide filter node that uses an LLM to distill incoming text into a summary, key points, and named entities.

## What it does

Accumulates all text and table content of each object flowing through the pipeline. When the object closes, the node splits the full document into chunks and asks the connected LLM to extract three things from each chunk: a concise summary, a list of key points, and the most significant named entities (people, organizations, products, events, dates, locations). The LLM is instructed to respond as JSON.
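
As a rough illustration of handling a per-chunk JSON reply, the sketch below parses one response into the three sections. The key names (`summary`, `key_points`, `entities`) and the function `parse_chunk_response` are hypothetical; the node's actual response schema is not documented here and may differ.

```python
import json


def _as_list(value) -> list[str]:
    """Coerce a JSON value into a list of strings (empty if not a list)."""
    if not isinstance(value, list):
        return []
    return [str(item) for item in value]


def parse_chunk_response(raw: str) -> dict:
    """Parse one LLM reply into summary, key points, and entities.

    The key names used here are assumptions for illustration only.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {
        "summary": str(data.get("summary") or "").strip(),
        "key_points": _as_list(data.get("key_points")),
        "entities": _as_list(data.get("entities")),
    }


reply = (
    '{"summary": "Acme acquired Beta Corp.", '
    '"key_points": ["Deal closed in May"], '
    '"entities": ["Acme", "Beta Corp"]}'
)
result = parse_chunk_response(reply)
```

Falling back to empty sections on malformed JSON is one possible design choice; it keeps a single bad reply from aborting the rest of the document.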

Chunking uses LangChain's `RecursiveCharacterTextSplitter`, sized to the connected LLM's context length and measured with the LLM's own token counter. The token budget of the instruction prompt itself is accounted for, so each chunk plus the prompt always fits the model's context window.
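
The budget arithmetic described above can be sketched as follows. This is a minimal illustration, not the node's code: `count_tokens` is a whitespace-based stand-in for the LLM's own token counter, and `chunk_budget`, the prompt text, and the context length of 4096 are all assumptions. The resulting budget is the kind of value that could be passed to `RecursiveCharacterTextSplitter` as `chunk_size`, with the token counter supplied as `length_function`.

```python
def count_tokens(text: str) -> int:
    """Stand-in token counter: one token per whitespace-separated word.

    A real pipeline would use the connected LLM's own counter instead.
    """
    return len(text.split())


def chunk_budget(context_length: int, prompt: str, counter=count_tokens) -> int:
    """Return the tokens left for document text after reserving the prompt."""
    budget = context_length - counter(prompt)
    if budget <= 0:
        raise ValueError("instruction prompt does not fit in the context window")
    return budget


# Hypothetical instruction prompt and context length.
prompt = "Summarize the text and reply as JSON."
budget = chunk_budget(context_length=4096, prompt=prompt)
```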

Only the first `numberOfSummaries` chunks (default 2) are summarized; the rest of the document is ignored. Each of the three extraction sections can be disabled individually by setting its config field to `0`.
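
The selection and disabling rules above amount to a slice and a zero check. The sketch below assumes the config arrives as a plain dict keyed by the field names; `select_chunks`, `enabled_sections`, the section labels, and the example values `50` and `5` are hypothetical. Treating a missing field as enabled is also an assumption.

```python
def select_chunks(chunks: list[str], number_of_summaries: int = 2) -> list[str]:
    """Keep only the first `number_of_summaries` chunks; the rest are ignored."""
    return chunks[: max(number_of_summaries, 0)]


def enabled_sections(config: dict) -> list[str]:
    """Return the extraction sections whose config field is not 0."""
    field_to_section = {
        "numberOfSummaryWords": "summary",
        "numberOfKeyPointWords": "key_points",
        "numberOfEntities": "entities",
    }
    # Assumption: a field that is absent from the config counts as enabled.
    return [
        section
        for field_name, section in field_to_section.items()
        if config.get(field_name, 1) != 0
    ]


chunks = ["chunk A", "chunk B", "chunk C"]
config = {"numberOfSummaryWords": 50, "numberOfKeyPointWords": 0, "numberOfEntities": 5}

selected = select_chunks(chunks)      # default of 2 drops "chunk C"
sections = enabled_sections(config)   # key points are disabled by the 0
```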

Output is emitted only on lanes that actually have a downstream listener: plain formatted text on the `text` lane, and/or one structured document per section on the `documents` lane.
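
For the `text` lane, the layout (a summary block, a `Key Points:` bullet list, an `Entities:` bullet list, with empty sections omitted) might be assembled along these lines. The function `render_text`, the `- ` bullet marker, and the blank line between blocks are assumptions made for illustration; the node's exact formatting may differ.

```python
def render_text(summary: str, key_points: list[str], entities: list[str]) -> str:
    """Join the non-empty sections into one plain-text block."""
    blocks = []
    if summary:
        blocks.append(summary)
    if key_points:
        bullets = "\n".join(f"- {point}" for point in key_points)
        blocks.append("Key Points:\n" + bullets)
    if entities:
        bullets = "\n".join(f"- {entity}" for entity in entities)
        blocks.append("Entities:\n" + bullets)
    # Disabled or empty sections never reach `blocks`, so they are omitted.
    return "\n\n".join(blocks)


text = render_text("Acme acquired Beta Corp.", [], ["Acme", "Beta Corp"])
```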

---

## Configuration

### Lanes

| Lane in | Lane out | Description |
|---------|----------|-------------|
| `text` | `text` | Summarized output as plain text: a summary block, a `Key Points:` bullet list, and an `Entities:` bullet list. Sections that are disabled or empty are omitted. |
| `text` | `documents` | Summarized output as structured documents: each summary, key-point list, and entity list becomes its own `Doc` with an incrementing `chunkId` in its metadata. |

Both table and plain-text input are accepted; table content is appended to the same accumulator as text and summarized together with it.
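
On the `documents` lane, each section becomes its own document with an incrementing `chunkId`. The sketch below uses a hypothetical `Doc` dataclass as a stand-in for the pipeline's real `Doc` type, and assumes that numbering starts at 0 and that empty sections are skipped; neither detail is confirmed by this README.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for the pipeline's Doc type."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def build_docs(sections: list[str]) -> list[Doc]:
    """Wrap each non-empty section in a Doc with an incrementing chunkId."""
    docs = []
    for section in sections:
        if not section:
            continue  # skipped sections do not consume a chunkId
        docs.append(Doc(page_content=section, metadata={"chunkId": len(docs)}))
    return docs


docs = build_docs([
    "Acme acquired Beta Corp.",
    "",
    "Entities:\n- Acme\n- Beta Corp",
])
chunk_ids = [doc.metadata["chunkId"] for doc in docs]
```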

### Fields

| Field | Type | Description |
|---|---|---|
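| `numberOfSummaries` | number | Number of chunks to summarize after the document is split. Default `2`. |
| `numberOfSummaryWords` | number | Number of words in each summary. Set to `0` to disable summaries. |
| `numberOfKeyPointWords` | number | Number of words in each key point. Set to `0` to disable key points. |
| `numberOfEntities` | number | Number of entities to extract from the document. Set to `0` to disable entity extraction. |
| `profile` | string | Configuration profile to use. Default `"default"`. |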

The node ships a single `default` profile (selected via the hidden `summarization.profile` field) that exposes the four `numberOf…` fields above.

---

## Connections

| Channel | Required | Description |
|---------|----------|-------------|
| `llm` | yes (min 1) | LLM used to generate summaries, key points, and entities. Also provides the context length and token counter used for chunking. |

---

<!-- ROCKETRIDE:GENERATED:PARAMS START -->
<!-- Generated by nodes:docs-generate. Do not edit by hand. -->

## Schema

| Field | Type | Description | Default |
|---|---|---|---|
| `summarization.numberOfEntities` | `number` | **Number of entities to extract from the document. Set to 0 to disable entity extraction.** |  |
| `summarization.numberOfKeyPointWords` | `number` | **Number of words in each key point. Set to 0 to disable key points.** |  |
| `summarization.numberOfSummaries` | `number` | **Number of chunks to summarize after the document is split** |  |
| `summarization.numberOfSummaryWords` | `number` | **Number of words in each summary. Set to 0 to disable summaries.** |  |
| `summarization.profile` | `string` |  | `"default"` |

## Dependencies

- `langchain`

## Source

[<svg viewBox="0 0 16 16" width="15" height="15" fill="currentColor" aria-hidden="true" style="vertical-align:-0.15em;margin-right:0.35em"><path d="M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0016 8c0-4.42-3.58-8-8-8z"/></svg> View source](https://github.com/rocketride-org/rocketride-server/tree/develop/nodes/src/nodes/summarization)
<!-- ROCKETRIDE:GENERATED:PARAMS END -->
