LlamaParse
A RocketRide data node that parses documents with the LlamaParse cloud service and emits the extracted text and tables into the pipeline.
What it does
Sends incoming documents to the LlamaIndex / LlamaParse cloud API (via the llama-parse Python library) and emits the results. Handles PDFs, images, Word documents, Excel spreadsheets, and other formats, with table extraction, layout preservation, and Markdown output. Processing happens in the cloud: a LlamaIndex API key is required, and the run aborts at startup if none is configured.
Tables are extracted two ways: Markdown table patterns (|-delimited rows) are detected in the parsed text, and structured items of type table returned by the API are converted to Markdown. Both are written to the table lane.
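The first of the two extraction paths, detecting Markdown tables in parsed text, can be sketched as below. This is a minimal illustration and not the node's actual code: the function names and the rule that a table needs a header row followed by a separator row are assumptions made for the example.

```python
import re

# A separator row such as |---|:---:|---| sits directly under a table header.
SEPARATOR = re.compile(r"^\s*\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)*\|?\s*$")


def is_table_row(line: str) -> bool:
    """Treat a line as a table row when it starts and ends with a pipe."""
    stripped = line.strip()
    return len(stripped) > 1 and stripped.startswith("|") and stripped.endswith("|")


def extract_markdown_tables(text: str) -> list[str]:
    """Collect runs of |-delimited rows whose second row is a separator row."""
    tables: list[str] = []
    current: list[str] = []

    def flush() -> None:
        # Keep the run only if it looks like header + separator (+ body rows).
        if len(current) >= 2 and SEPARATOR.match(current[1]):
            tables.append("\n".join(current))
        current.clear()

    for line in text.splitlines():
        if is_table_row(line):
            current.append(line.strip())
        else:
            flush()
    flush()  # handle a table that ends at the end of the text
    return tables
```

A stricter or looser rule (for example, accepting rows without a trailing pipe) would change which blocks end up on the table lane.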
A single shared parser instance is guarded by a lock, so documents are parsed one at a time per node. If parsing returns no text, a fallback Markdown notice (file type, size, and a hint that the document may be empty or OCR failed) is written to the text lane; on a parsing error, a Markdown error report is written instead, so downstream nodes always receive something.
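The lock-plus-fallback behavior described above might look roughly like the following sketch. All names here (parse_safely, fallback_notice, error_report) and the exact wording of the notices are hypothetical; only the overall shape (one parse at a time, and always some Markdown for the text lane) comes from the description.

```python
import threading

# One lock shared by every call, so the single parser is used by one document at a time.
_parser_lock = threading.Lock()


def fallback_notice(file_type: str, size_bytes: int) -> str:
    """Markdown notice emitted when parsing succeeds but yields no text."""
    return (
        "## No text extracted\n\n"
        f"- File type: {file_type}\n"
        f"- Size: {size_bytes} bytes\n\n"
        "The document may be empty or OCR may have failed."
    )


def error_report(file_type: str, exc: Exception) -> str:
    """Markdown report emitted when the parse call raises."""
    return f"## Parsing error\n\n- File type: {file_type}\n- Error: {exc}"


def parse_safely(parse_fn, data: bytes, file_type: str) -> str:
    """Run parse_fn under the shared lock and always return some Markdown."""
    try:
        with _parser_lock:
            text = parse_fn(data)
    except Exception as exc:  # broad on purpose: downstream must receive something
        return error_report(file_type, exc)
    if not text or not text.strip():
        return fallback_notice(file_type, len(data))
    return text
```

Here parse_fn stands in for whatever callable wraps the shared LlamaParse instance.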
Configuration
Lanes
| Lane in | Lane out | Description |
|---|---|---|
| data | text | Parse document, emit extracted text |
| data | table | Parse document, emit extracted tables (Markdown) |
| data | documents | Parse document, emit full document objects (when listener exists) |
Incoming document objects of type Document, PDF, or Image (base64 page_content) are also accepted on the documents lane; the node parses them, re-emits them as Document objects containing the parsed text, and suppresses the default pass-through of the original files.
When no file name accompanies the raw bytes, the file type is detected from magic numbers (PDF, DOCX/XLSX, DOC/XLS, JPEG, PNG, GIF, WebP, HTML, XML) and defaults to PDF if unrecognized.
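Magic-number detection of this kind can be sketched as follows. The byte signatures are the well-known ones for each format; the function name, the returned labels, and the order of checks are assumptions for illustration and may differ from the node's implementation.

```python
def detect_file_type(data: bytes) -> str:
    """Guess a file type from leading bytes, defaulting to PDF when unrecognized."""
    head = data[:16]
    if head.startswith(b"%PDF"):
        return "pdf"
    if head.startswith(b"PK\x03\x04"):
        return "docx/xlsx"  # ZIP container shared by DOCX and XLSX
    if head.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        return "doc/xls"  # OLE2 compound file used by legacy Office formats
    if head.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if head.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if head.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "webp"
    # Text formats: ignore leading whitespace and letter case.
    text_head = data[:256].lstrip().lower()
    if text_head.startswith((b"<!doctype html", b"<html")):
        return "html"
    if text_head.startswith(b"<?xml"):
        return "xml"
    return "pdf"  # documented default for unrecognized content
```

Because DOCX and XLSX share the ZIP signature (and DOC and XLS share the OLE2 one), the leading bytes alone cannot tell the members of each pair apart.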
Fields
| Field | Type | Description |
|---|---|---|
| use_advanced_config | boolean | Default false. Check to use advanced JSON configuration instead of simple options. |
| api_key | string | Your LlamaIndex API key for the LlamaParse service. |
| parse_mode | string | Default "parse_page_with_lvm". The parse mode to use for choosing the complexity of the parse. |
| lvm_model | string | Default "anthropic-sonnet-4.0". The LVM model to use for parsing when LVM or agentic modes are selected. |
| use_system_prompt_append | boolean | Default false. Check to add custom instructions to the system prompt for LlamaParse. |
| system_prompt_append | string | Additional instructions to append to the system prompt for LlamaParse. |
| spreadsheet_extract_sub_tables | boolean | Default false. Extract sub-tables from spreadsheets for better table parsing. |
| advanced_config | string | Default { "parse_mode": "parse_page_with_llm", "spreadsheet_extract_sub_tables": false, "system_prompt_append": "", "lvm_model": "anthropic-sonnet-4.0" }. Enter configuration options in JSON format. For more information, see: LlamaParse Documentation |
Simple mode options (shown when use_advanced_config is off):
| Field | Type / default | Description |
|---|---|---|
| parse_mode | string, default parse_page_with_lvm | Parse mode, see parse modes below |
| lvm_model | string, default anthropic-sonnet-4.0 | Vision model used for LVM and agentic modes |
| use_system_prompt_append | boolean, default false | Append custom instructions to the parsing system prompt |
| system_prompt_append | string (textarea) | The additional instructions (shown when the toggle is on) |
| spreadsheet_extract_sub_tables | boolean, default false | Extract sub-tables from spreadsheets |
Advanced mode option (shown when use_advanced_config is on):
| Field | Type / default | Description |
|---|---|---|
| advanced_config | string (JSON) | Raw JSON passed to the LlamaParse constructor, see below |
Configuration is validated at save time: invalid JSON in the advanced config, unknown advanced parameters, and a missing API key all produce warnings. At pipeline start, a missing API key, invalid advanced JSON, or an enabled-but-empty advanced config aborts the run with an error before any documents are sent.
Parse modes
| Mode | Credits/page | Best for |
|---|---|---|
| Cost-effective | 3 | Text-heavy documents without diagrams |
| Agentic | 10 | Documents with diagrams and images |
| Agentic Plus | 90 | Complex layouts and multi-page tables |
| Parse with LVM (legacy) | n/a | Legacy LVM-based parsing |
Simple-mode selections map onto LlamaParse API modes:
- Cost-effective: parse_page_with_llm
- Agentic and Agentic Plus: parse_page_with_agent, with the selected LVM model as vendor_multimodal_model_name
- Parse with LVM (legacy): parse_page_with_lvm, with the selected LVM model, any additional instructions (system_prompt_append, only applied in this mode when Use Additional Instructions is on), and page_error_tolerance fixed at 0.05
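The mapping above can be expressed as a small function that turns simple-mode selections into constructor keyword arguments. This is a sketch under stated assumptions: the function name is hypothetical, the simple-mode values cost_effective, agentic, and agentic_plus are taken from the aliases named in the advanced-configuration notes, the legacy mode is assumed to pass the model under the same vendor_multimodal_model_name key, and the sub-table toggle is assumed to be forwarded in every mode. The source does not say how Agentic and Agentic Plus differ at the API level, so the sketch treats them identically.

```python
def build_parser_kwargs(
    parse_mode: str,
    lvm_model: str = "anthropic-sonnet-4.0",
    use_system_prompt_append: bool = False,
    system_prompt_append: str = "",
    spreadsheet_extract_sub_tables: bool = False,
) -> dict:
    """Translate simple-mode options into LlamaParse-style keyword arguments."""
    # Assumption: the sub-table toggle is forwarded regardless of parse mode.
    kwargs: dict = {"spreadsheet_extract_sub_tables": spreadsheet_extract_sub_tables}

    if parse_mode == "cost_effective":
        kwargs["parse_mode"] = "parse_page_with_llm"
    elif parse_mode in ("agentic", "agentic_plus"):
        kwargs["parse_mode"] = "parse_page_with_agent"
        kwargs["vendor_multimodal_model_name"] = lvm_model
    elif parse_mode == "parse_page_with_lvm":
        kwargs["parse_mode"] = "parse_page_with_lvm"
        # Assumption: the legacy mode uses the same key for the model name.
        kwargs["vendor_multimodal_model_name"] = lvm_model
        kwargs["page_error_tolerance"] = 0.05
        # Additional instructions apply only in this mode, and only when toggled on.
        if use_system_prompt_append and system_prompt_append:
            kwargs["system_prompt_append"] = system_prompt_append
    else:
        raise ValueError(f"unknown simple-mode parse_mode: {parse_mode!r}")
    return kwargs
```

For example, build_parser_kwargs("agentic", lvm_model="gpt-4o") yields parse_page_with_agent with gpt-4o as the vision model, and any system_prompt_append value is dropped because the mode is not the legacy LVM one.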
LVM models
Available when using LVM legacy, Agentic, or Agentic Plus modes:
| Model | Value |
|---|---|
| Anthropic Sonnet 4.0 (default) | anthropic-sonnet-4.0 |
| Anthropic Sonnet 3.5 | anthropic-sonnet-3.5 |
| GPT-4o | gpt-4o |
| GPT-4o Mini | gpt-4o-mini |
Advanced configuration (JSON mode)
When Advanced Configuration is enabled, supply a raw JSON object instead of the simple options. The keys are merged directly into the LlamaParse constructor arguments (any api_key key in the JSON is ignored; the API key field always wins). Recognized parameters:
| Key | Type | Description |
|---|---|---|
| parse_mode | string | API-level parse mode passed directly to LlamaIndex. Accepted values: parse_page_with_llm (cost-effective text parsing), parse_page_with_agent (agentic/diagram-aware parsing), parse_page_with_lvm (legacy LVM-based parsing). Note: simple-mode aliases (agentic, agentic_plus, cost_effective) are not valid here; they are only mapped in simple mode. |
| system_prompt_append | string | Text appended to the parsing system prompt. In advanced mode, this is honored directly from the JSON payload regardless of simple-mode toggles. In simple mode, only applied in LVM legacy mode (parse_page_with_lvm) when Use Additional Instructions is on. |
| spreadsheet_extract_sub_tables | boolean | Extract sub-tables embedded within spreadsheet cells. Corresponds to the Extract Sub Tables toggle in simple mode. |
| vendor_multimodal_model_name | string | Vision model used for LVM and agentic modes (e.g. anthropic-sonnet-4.0). |
| page_error_tolerance | number | Fraction of pages allowed to fail before the job is aborted (default 0.05 in LVM legacy mode). |
| verbose | boolean | LlamaParse client verbosity (the node sets false by default). |
Advanced mode bypasses all simple-mode settings. Unknown keys produce a warning at save time but do not abort execution.
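The advanced-mode rules (invalid or empty JSON aborts, unknown keys only warn, and the API Key field overrides any api_key in the JSON) can be illustrated with the sketch below. The function name, the warning text, and the use of ValueError to signal an abort are assumptions for the example; the set of known keys is the one listed in the table above.

```python
import json

KNOWN_KEYS = {
    "parse_mode",
    "system_prompt_append",
    "spreadsheet_extract_sub_tables",
    "vendor_multimodal_model_name",
    "page_error_tolerance",
    "verbose",
}


def load_advanced_config(raw: str, api_key: str) -> tuple[dict, list[str]]:
    """Return (constructor kwargs, warnings) or raise ValueError on fatal problems."""
    if not api_key:
        raise ValueError("API key is required")
    if not raw or not raw.strip():
        raise ValueError("advanced config is enabled but empty")
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid advanced JSON: {exc}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("advanced config must be a JSON object")

    # Unknown keys are reported but still passed through.
    warnings = [
        f"unknown parameter: {key}"
        for key in parsed
        if key not in KNOWN_KEYS and key != "api_key"
    ]
    kwargs = {"verbose": False, **parsed}  # node default, overridable from JSON
    kwargs["api_key"] = api_key  # the API Key field always wins
    return kwargs, warnings
```

Returning warnings separately from the kwargs keeps the save-time check (warn) distinct from the start-time check (abort).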
Timeouts and large files
The cloud call runs in an isolated thread with its own event loop (LlamaParse uses asyncio internally and can otherwise raise "Event loop is closed" errors). The per-document timeout scales with file size:
| File size | Timeout |
|---|---|
| up to 100 MB | 5 minutes |
| 100-500 MB | 10 minutes |
| over 500 MB | 15 minutes |
For files over 50 MB, the LlamaParse job timeout is also raised to at least 10 minutes. If the call times out, the node returns an empty result with the timeout recorded in the parsing metadata and the pipeline continues; it does not hang the run.
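A rough sketch of the timeout tiers and the isolated-thread call is given below. It uses only the standard library; the function names are hypothetical, 1 MB is assumed to mean 1024 * 1024 bytes, and the tier boundaries are assumed to be inclusive on the lower tier. The coroutine passed in stands in for the real LlamaParse call.

```python
import asyncio
import concurrent.futures

MB = 1024 * 1024  # assumption: binary megabytes


def timeout_for_size(size_bytes: int) -> int:
    """Per-document timeout in seconds, following the size tiers in the table."""
    if size_bytes > 500 * MB:
        return 15 * 60
    if size_bytes > 100 * MB:
        return 10 * 60
    return 5 * 60


def run_isolated(coro_factory, timeout_s: float):
    """Run a coroutine on a fresh event loop in a worker thread.

    Returns (result, None) on success or (None, message) on timeout,
    so the caller can record the timeout and carry on.
    """

    def worker():
        loop = asyncio.new_event_loop()  # private loop, never shared or reused
        try:
            asyncio.set_event_loop(loop)
            return loop.run_until_complete(coro_factory())
        finally:
            loop.close()

    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(worker)
    try:
        return future.result(timeout=timeout_s), None
    except concurrent.futures.TimeoutError:
        return None, f"timed out after {timeout_s} s"
    finally:
        executor.shutdown(wait=False)  # do not block the pipeline on the worker
```

Note that in this sketch a timed-out worker thread is abandoned rather than cancelled; it keeps running until the underlying call returns.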
Authentication
Obtain a LlamaIndex API key from cloud.llamaindex.ai and paste it into the API Key field. The field is stored securely (marked secure: true in the node config). The key is required; the node aborts at startup if it is absent.
Upstream docs
Schema
| Field | Type | Description | Default |
|---|---|---|---|
| llamaparse.advanced_config | string | Advanced Configuration (JSON): Enter configuration options in JSON format. For more information, see: LlamaParse Documentation | "{\n \"parse_mode\": \"parse_page_with_llm\",\n \"spreadsheet_extract_sub_tables\": false,\n \"system_prompt_append\": \"\",\n \"lvm_model\": \"anthropic-sonnet-4.0\"\n}" |
| llamaparse.api_key | string | API Key: Your LlamaIndex API key for the LlamaParse service. | |
| llamaparse.lvm_model | string | LVM Model: The LVM model to use for parsing when LVM or agentic modes are selected. | "anthropic-sonnet-4.0" |
| llamaparse.parse_mode | string | Parse Mode: The parse mode to use for choosing the complexity of the parse. | "parse_page_with_lvm" |
| llamaparse.spreadsheet_extract_sub_tables | boolean | Extract Sub Tables: Extract sub-tables from spreadsheets for better table parsing. | false |
| llamaparse.system_prompt_append | string | Additional Instructions: Additional instructions to append to the system prompt for LlamaParse. | |
| llamaparse.use_advanced_config | boolean | Advanced Configuration: Check to use advanced JSON configuration instead of simple options. | false |
| llamaparse.use_system_prompt_append | boolean | Use Additional Instructions: Check to add custom instructions to the system prompt for LlamaParse. | false |
Dependencies
- llama-parse
- llama-index-core
- llama-cloud