Summarization: LLM
A RocketRide filter node that uses an LLM to distill incoming text into a summary, key points, and named entities.
What it does
Accumulates all text (and table content) of each object flowing through the pipeline, then on close splits the full document into chunks and asks the connected LLM to extract three things from each chunk: a concise summary, a list of key points, and the most significant named entities (people, organizations, products, events, dates, locations). The LLM is instructed to respond as JSON.
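As an illustration of the JSON round trip described above, here is a minimal sketch of parsing one chunk's reply. The key names `summary`, `key_points`, and `entities` are assumptions made for this example; the source only states that the LLM is instructed to respond as JSON, and the node's real schema may differ.

```python
import json
from typing import Any, Dict, List


def _as_str_list(value: Any) -> List[str]:
    """Coerce a JSON value into a list of non-empty, stripped strings."""
    if not isinstance(value, list):
        return []
    return [str(item).strip() for item in value if str(item).strip()]


def parse_extraction(raw: str) -> Dict[str, Any]:
    """Parse one chunk's LLM reply into the three extraction sections.

    The keys used here are hypothetical, not the node's documented schema.
    Malformed or non-object replies yield empty sections instead of raising.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {
        "summary": str(data.get("summary") or "").strip(),
        "key_points": _as_str_list(data.get("key_points")),
        "entities": _as_str_list(data.get("entities")),
    }
```

Returning empty sections on a malformed reply is a design choice of this sketch; it lets downstream formatting simply omit whatever is missing.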
Chunking uses LangChain's RecursiveCharacterTextSplitter, sized to the connected LLM's context length and measured with the LLM's own token counter. The token budget of the instruction prompt itself is accounted for, so each chunk plus the prompt always fits the model's context window.
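The budget arithmetic can be sketched as follows. This is not the node's code: `split_by_tokens` is a simplified greedy stand-in for `RecursiveCharacterTextSplitter`, and `whitespace_tokens` is a hypothetical counter standing in for the LLM's own token counter.

```python
from typing import Callable, List


def chunk_budget(context_length: int, prompt_tokens: int) -> int:
    """Tokens left for document text once the instruction prompt is counted."""
    budget = context_length - prompt_tokens
    if budget <= 0:
        raise ValueError("instruction prompt does not fit in the context window")
    return budget


def split_by_tokens(text: str, budget: int,
                    count_tokens: Callable[[str], int]) -> List[str]:
    """Greedy word-based splitter (a stand-in, not LangChain's algorithm).

    Words are added to the current chunk until the next word would push it
    past the budget. A single word larger than the budget still becomes its
    own oversized chunk; this sketch does not split inside words.
    """
    chunks: List[str] = []
    current: List[str] = []
    for word in text.split():
        candidate = " ".join(current + [word])
        if current and count_tokens(candidate) > budget:
            chunks.append(" ".join(current))
            current = [word]
        else:
            current.append(word)
    if current:
        chunks.append(" ".join(current))
    return chunks


def whitespace_tokens(text: str) -> int:
    """Hypothetical token counter: one token per whitespace-separated word."""
    return len(text.split())
```

In LangChain itself, `RecursiveCharacterTextSplitter` accepts `chunk_size` and `length_function` arguments, which is presumably where the budget and the LLM's token counter plug in.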
Only the first numberOfSummaries chunks (default 2) are summarized; the rest of the document is ignored. Each of the three extraction sections can be disabled individually by setting its config field to 0.
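The two rules above can be sketched in a few lines. The function names and the section labels are hypothetical; only the config field names and the default of 2 come from this document.

```python
from typing import Dict, List

# Maps an illustrative section label to the config field that controls it.
SECTION_FIELDS = {
    "summary": "numberOfSummaryWords",
    "key_points": "numberOfKeyPointWords",
    "entities": "numberOfEntities",
}


def select_chunks(chunks: List[str], number_of_summaries: int = 2) -> List[str]:
    """Keep only the leading chunks; everything after them is ignored."""
    return chunks[:max(number_of_summaries, 0)]


def enabled_sections(config: Dict[str, int]) -> List[str]:
    """Return the sections whose config field is greater than 0."""
    return [name for name, field in SECTION_FIELDS.items() if config[field] > 0]
```

With the default setting, a document split into five chunks contributes only its first two to the output.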
Output is emitted only on lanes that actually have a downstream listener: plain formatted text on the text lane, and/or one structured document per section on the documents lane.
Configuration
Lanes
| Lane in | Lane out | Description |
|---|---|---|
| text | text | Summarized output as plain text: a summary block, a Key Points: bullet list, and an Entities: bullet list. Sections that are disabled or empty are omitted. |
| text | documents | Summarized output as structured documents: each summary, key-point list, and entity list becomes its own Doc with an incrementing chunkId in its metadata. |
Both table and plain-text input are accepted; table content is appended to the same accumulator as text and summarized together with it.
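The two output shapes can be sketched as below. The `Doc` dataclass is a hypothetical stand-in for the pipeline's document type, and the `- ` bullet marker, the blank line between sections, and the starting `chunkId` of 0 are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for the pipeline's structured document."""
    page_content: str
    metadata: Dict[str, int] = field(default_factory=dict)


def render_text(summary: str, key_points: List[str], entities: List[str]) -> str:
    """Plain-text lane: summary block, then the two bullet lists.

    Empty (or disabled) sections are left out entirely.
    """
    blocks: List[str] = []
    if summary:
        blocks.append(summary)
    if key_points:
        blocks.append("Key Points:\n" + "\n".join(f"- {p}" for p in key_points))
    if entities:
        blocks.append("Entities:\n" + "\n".join(f"- {e}" for e in entities))
    return "\n\n".join(blocks)


def render_docs(summary: str, key_points: List[str], entities: List[str],
                start_id: int = 0) -> List[Doc]:
    """Documents lane: one Doc per non-empty section, chunkId incrementing."""
    docs: List[Doc] = []
    chunk_id = start_id
    for content in (summary, "\n".join(key_points), "\n".join(entities)):
        if content:
            docs.append(Doc(content, {"chunkId": chunk_id}))
            chunk_id += 1
    return docs
```

Passing the last used id plus one as `start_id` for the next chunk would keep `chunkId` increasing across the whole document.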
Fields
| Field | Type | Description |
|---|---|---|
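| numberOfSummaries | number | Number of chunks to summarize after the document is split. Default 2. |
| numberOfSummaryWords | number | Number of words in each summary. Set to 0 to disable summaries. |
| numberOfKeyPointWords | number | Number of words in each key point. Set to 0 to disable key points. |
| numberOfEntities | number | Number of entities to extract from the document. Set to 0 to disable entity extraction. |
| profile | string | Default "default". |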
The node ships a single default profile (selected via the hidden summarization.profile field) that exposes the four fields above.
Connections
| Channel | Required | Description |
|---|---|---|
| llm | yes (min 1) | LLM used to generate summaries, key points, and entities. Also provides the context length and token counter used for chunking. |
Schema
| Field | Type | Description | Default |
|---|---|---|---|
| summarization.numberOfEntities | number | Number of entities to extract from the document. Set to 0 to disable entity extraction. | |
| summarization.numberOfKeyPointWords | number | Number of words in each key point. Set to 0 to disable key points. | |
| summarization.numberOfSummaries | number | Number of chunks to summarize after the document is split. | 2 |
| summarization.numberOfSummaryWords | number | Number of words in each summary. Set to 0 to disable summaries. | |
| summarization.profile | string | Hidden field that selects the configuration profile. | "default" |
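For orientation, here is a sketch of a configuration covering the schema fields, written as a Python dict with a small sanity check. Only the field names and the two documented defaults (`"default"` and 2) come from this document. The word and entity counts are illustrative values, the flat dotted-key layout is an assumption about how the settings are stored, and the non-negative check is a rule of this sketch rather than documented node behavior.

```python
from typing import Any, Dict, List

NUMERIC_FIELDS = (
    "summarization.numberOfSummaries",
    "summarization.numberOfSummaryWords",
    "summarization.numberOfKeyPointWords",
    "summarization.numberOfEntities",
)

example_config: Dict[str, Any] = {
    "summarization.profile": "default",         # documented default
    "summarization.numberOfSummaries": 2,       # documented default
    "summarization.numberOfSummaryWords": 100,  # illustrative value
    "summarization.numberOfKeyPointWords": 0,   # 0 disables key points
    "summarization.numberOfEntities": 10,       # illustrative value
}


def invalid_fields(config: Dict[str, Any]) -> List[str]:
    """Return numeric fields that are missing, non-numeric, or negative."""
    bad: List[str] = []
    for name in NUMERIC_FIELDS:
        value = config.get(name)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or value < 0:
            bad.append(name)
    return bad
```

With this configuration, summaries and entities would be produced for the first two chunks while key points are skipped.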
Dependencies
langchain