Ollama

A RocketRide LLM node that routes pipeline traffic through a locally-hosted Ollama server.

What it does

Provides text generation against an Ollama server running on your own hardware. The node acts as an llm invoke connection for agents and other nodes that need an LLM, and can also be driven directly via its questions / answers lane pair. Because all inference happens on-premises, no external API key is required, which makes the node a natural fit for privacy-sensitive or air-gapped deployments.

Internally, the node talks to Ollama through its OpenAI-compatible /v1 endpoint using langchain-openai (ChatOpenAI). The temperature field is configurable: when it is left unset, the node uses 0 for standard models and 1.0 for reasoning models such as gpt-oss. The reasoning_effort field (low, medium, or high) can be set for reasoning models. If the configured serverbase URL does not end in /v1, the node appends it automatically, so both http://localhost:11434 and http://localhost:11434/v1 are accepted. The OpenAI client requires a non-empty API key field, so the node sends the placeholder string dummy-key, which Ollama ignores.
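The URL handling and client setup described above can be sketched as follows. This is a minimal illustration, not the node's actual code: the helper names normalize_serverbase and build_client are hypothetical, and stripping a trailing slash before the check is an assumption the text does not confirm.

```python
def normalize_serverbase(url: str) -> str:
    """Return the URL ending in /v1, appending the segment only when missing."""
    # Assumption: a trailing slash is dropped before checking the suffix.
    trimmed = url.rstrip("/")
    return trimmed if trimmed.endswith("/v1") else trimmed + "/v1"


def build_client(serverbase: str, model: str, temperature: float = 0.0):
    """Build a ChatOpenAI client that points at an Ollama server (hypothetical helper)."""
    # Imported lazily so the URL helper above works without langchain-openai installed.
    from langchain_openai import ChatOpenAI

    return ChatOpenAI(
        model=model,
        base_url=normalize_serverbase(serverbase),
        api_key="dummy-key",  # placeholder only; Ollama ignores the value
        temperature=temperature,
    )
```

With this sketch, build_client("http://localhost:11434", "llama3.3:latest") and build_client("http://localhost:11434/v1", "llama3.3:latest") would target the same endpoint.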

Configuration

Lanes

Lane in	Lane out	Description
`questions`	`answers`	Send a question directly, receive a generated answer

Fields

Pick a profile from the dropdown; the profile pre-fills model, serverbase, and modelTotalTokens. All three fields are individually overridable when using the custom profile.

Field	Type	Description
`model`	string	Ollama model tag, for example `llama3.3:latest`
`modelTotalTokens`	number	Total context size of the model, in tokens
`profile`	string	Profile that selects the LLM model. Default: `"llama3_3"`

Profiles

Llama

Profile	Model	Context
Llama 4 Latest	`llama4:latest`	10,000,000
Llama 3.3 (default)	`llama3.3:latest`	128,000
Llama 3.1 405B	`llama3.1:405b`	128,000
Llama 3.1 70B	`llama3.1:70b`	128,000
Llama 3.1 8B	`llama3.1:8b`	128,000
Llama 3.2 3B	`llama3.2`	128,000
Llama 3.2 1B	`llama3.2:1b`	128,000

Qwen

Profile	Model	Context
Qwen 3 Latest	`qwen3:latest`	128,000
Qwen 2.5 72B	`qwen2.5:72b`	128,000
Qwen 2.5 32B	`qwen2.5:32b`	128,000
Qwen 2.5 14B	`qwen2.5:14b`	128,000
Qwen 2.5 7B	`qwen2.5`	128,000
Qwen 2.5 3B	`qwen2.5:3b`	128,000
Qwen 2.5 1.5B	`qwen2.5:1.5b`	128,000
Qwen 2.5 0.5B	`qwen2.5:0.5b`	128,000

DeepSeek

Profile	Model	Context
DeepSeek R1 671B	`deepseek-r1:671b`	128,000
DeepSeek R1 32B	`deepseek-r1:32b`	128,000
DeepSeek R1 14B	`deepseek-r1:14b`	128,000
DeepSeek R1 7B	`deepseek-r1:7b`	128,000
DeepSeek R1 1.5B	`deepseek-r1:1.5b`	128,000

Other

Profile	Model	Context
Phi 4 14B	`phi4`	16,000
Mistral 7B	`mistral`	32,000

Custom: supply any Ollama model tag, context token count, and server URL directly. The default context for a new custom profile is 16,385 tokens until you change it.
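One way to picture how a profile pre-fills the three fields, and how the custom profile accepts overrides, is the sketch below. It is an illustration under stated assumptions rather than the node's implementation: the Profile class, the PROFILES mapping, the "custom" key, and the default server URL are all hypothetical; only the "llama3_3" key, its model tag, and the two context sizes come from this page.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Profile:
    model: str
    model_total_tokens: int
    # Assumption: the page lists this URL as accepted, not as a confirmed default.
    serverbase: str = "http://localhost:11434"


# Only two entries are shown; the real node ships many more profiles.
PROFILES = {
    "llama3_3": Profile("llama3.3:latest", 128_000),
    "custom": Profile("", 16_385),  # "custom" key name is hypothetical
}


def resolve_profile(name: str = "llama3_3", **overrides) -> Profile:
    """Return the preset for a profile; apply overrides only for the custom profile."""
    base = PROFILES[name]
    if name != "custom":
        return base
    # Skip overrides that were passed as None so the custom defaults survive.
    supplied = {key: value for key, value in overrides.items() if value is not None}
    return replace(base, **supplied)
```

For example, resolve_profile("custom", model="phi4") keeps the 16,385-token default until model_total_tokens is also supplied.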

Upstream docs

Schema

Field	Type	Description	Default
`model`	`string`	Model. Ollama model.
`modelTotalTokens`	`number`	Tokens. Total tokens.
`ollama.profile`	`string`	Model. LLM model.	`"llama3_3"`
`reasoning_effort`	`string`	Reasoning effort. Optional reasoning budget for reasoning models: low, medium, or high. Leave unset to let reasoning models auto-use 'low'; a value set here always wins. Ignored by non-reasoning models.
`temperature`	`number`	Temperature. Sampling temperature. Left unset by default so ollama.py can choose: 0 for standard models, and 1.0 for reasoning models (gpt-oss, deepseek-r1, qwen3, qwq, ...) so they emit a final answer instead of looping on empty output. Set an explicit value here to override the auto behavior.
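The auto behavior for temperature and reasoning_effort can be sketched as below. The function names are hypothetical, the prefix list is only the partial list given in the schema (which ends in "..."), and matching by model-name prefix is an assumption about how ollama.py detects reasoning models.

```python
from typing import Optional

# Partial list copied from the schema description; the real list is longer.
REASONING_PREFIXES = ("gpt-oss", "deepseek-r1", "qwen3", "qwq")


def is_reasoning_model(model: str) -> bool:
    """Assumption: reasoning models are recognized by their tag prefix."""
    return model.lower().startswith(REASONING_PREFIXES)


def resolve_sampling(
    model: str,
    temperature: Optional[float] = None,
    reasoning_effort: Optional[str] = None,
) -> dict:
    """Fill in unset values; explicit values always win."""
    reasoning = is_reasoning_model(model)
    if temperature is None:
        temperature = 1.0 if reasoning else 0.0
    params = {"temperature": temperature}
    if reasoning:
        # Non-reasoning models ignore reasoning_effort, so it is only set here.
        params["reasoning_effort"] = reasoning_effort or "low"
    return params
```

Under these assumptions, a standard model with nothing set resolves to temperature 0, while a deepseek-r1 tag resolves to temperature 1.0 with reasoning_effort 'low'.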

Dependencies

langchain-openai
langchain-core
langchain
