Runtime & Engine

Pipelines don't run themselves. The engine is the runtime that loads a .pipe definition, brings its nodes to life, and moves data through the graph until the work is done.

A multithreaded C++ core

The engine is a native, multithreaded C++ runtime, not a thin wrapper around HTTP calls. It is built for throughput and reliability: nodes that have no dependency on one another run concurrently, and data streams between them rather than buffering the whole pipeline in memory. The same binary powers a quick local iteration loop and a production workload.
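The two ideas in this paragraph, independent nodes running concurrently and data streaming between nodes rather than being buffered whole, can be illustrated with a small Python sketch. It is not the engine's C++ implementation: the names `source`, `worker`, `drain`, and `run_demo` are hypothetical, and the lanes are modeled here as bounded queues on the assumption that only a few items should be in flight at once.

```python
import queue
import threading

SENTINEL = object()  # marks the end of a stream


def source(out_lanes, items):
    """Emit items one at a time to every downstream lane."""
    for item in items:
        for lane in out_lanes:
            lane.put(item)
    for lane in out_lanes:
        lane.put(SENTINEL)


def worker(in_lane, out_lane, fn):
    """Transform items as they arrive instead of waiting for the full input."""
    while True:
        item = in_lane.get()
        if item is SENTINEL:
            out_lane.put(SENTINEL)
            return
        out_lane.put(fn(item))


def drain(lane):
    """Collect everything from a lane until the end-of-stream marker."""
    results = []
    while True:
        item = lane.get()
        if item is SENTINEL:
            return results
        results.append(item)


def run_demo(items):
    # Bounded input lanes keep at most two items in flight per branch.
    to_upper, to_length = queue.Queue(maxsize=2), queue.Queue(maxsize=2)
    upper_out, length_out = queue.Queue(), queue.Queue()
    # The two workers do not depend on each other, so each gets its own thread.
    threads = [
        threading.Thread(target=source, args=([to_upper, to_length], items)),
        threading.Thread(target=worker, args=(to_upper, upper_out, str.upper)),
        threading.Thread(target=worker, args=(to_length, length_out, len)),
    ]
    for t in threads:
        t.start()
    uppers = drain(upper_out)
    lengths = drain(length_out)
    for t in threads:
        t.join()
    return uppers, lengths
```

Calling `run_demo(["a", "bb", "ccc"])` feeds both branches from one source. Each branch preserves input order because it has a single worker reading from a FIFO queue.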

What the engine does

When a pipeline starts, the engine:

1. Parses the .pipe JSON and validates it against the pipeline schema.
2. Instantiates each component from its provider, applying the component's config (API keys, model profiles, collection names, and so on).
3. Wires the graph, connecting output data lanes to the input lanes that consume them and resolving control (invoke) connections between agents and the LLMs, tools, and memory they drive.
4. Streams data through the graph, scheduling work across threads and emitting results as they are produced.
5. Tears down the run when the inputs are exhausted or the client calls terminate().
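The five steps above can be sketched as a toy, single-threaded loader in Python. Everything specific in it is an assumption made for illustration: the `components`/`connections` layout is not the real .pipe schema, the `upper` and `suffix` providers are invented, and the sketch only handles a linear chain of data lanes, with no control connections and no thread scheduling.

```python
import json

# Hypothetical providers: each maps a config dict to a per-item function.
PROVIDERS = {
    "upper": lambda config: str.upper,
    "suffix": lambda config: (lambda text: text + config["suffix"]),
}

# Hypothetical pipeline definition; the real .pipe schema may look different.
EXAMPLE_PIPE = json.dumps({
    "components": [
        {"id": "shout", "provider": "upper"},
        {"id": "bang", "provider": "suffix", "config": {"suffix": "!"}},
    ],
    "connections": [{"from": "shout", "to": "bang"}],
})


def parse_and_validate(text):
    """Step 1: parse the JSON and check the (assumed) required fields."""
    pipe = json.loads(text)
    for key in ("components", "connections"):
        if key not in pipe:
            raise ValueError(f"missing required field: {key}")
    for comp in pipe["components"]:
        if comp["provider"] not in PROVIDERS:
            raise ValueError(f"unknown provider: {comp['provider']}")
    return pipe


def instantiate(pipe):
    """Step 2: build each component from its provider and config."""
    return {
        comp["id"]: PROVIDERS[comp["provider"]](comp.get("config", {}))
        for comp in pipe["components"]
    }


def wire(pipe):
    """Step 3: map each component to the one that consumes its output."""
    return {conn["from"]: conn["to"] for conn in pipe["connections"]}


def stream(nodes, edges, start, inputs):
    """Step 4: push items through the chain one at a time, yielding results."""
    for item in inputs:
        node_id = start
        while node_id is not None:
            item = nodes[node_id](item)
            node_id = edges.get(node_id)
        yield item


def run(text, start, inputs):
    pipe = parse_and_validate(text)
    nodes = instantiate(pipe)
    edges = wire(pipe)
    try:
        yield from stream(nodes, edges, start, inputs)
    finally:
        # Step 5: release components when inputs run out or the caller stops.
        nodes.clear()
```

Because `run` is a generator, results become available as each input finishes, and closing the generator early triggers the same teardown path as exhausting the inputs.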

For the full picture of how data and control flow at step 4, see the Execution model.

One engine, three places to run it

The pipeline JSON never changes across environments; only the location of the engine does:

Locally: the engine runs on your machine while you build and debug, e.g. behind the VS Code extension.
On-premises: self-host the engine with Docker inside your own network. See Self-hosting.
RocketRide Cloud: a managed engine you connect to instead of running your own. See Cloud.
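One way to picture this separation is to keep the pipeline file fixed and treat the engine address as the only per-environment setting. The Python sketch below does that; the endpoint URLs, the `my_pipeline.pipe` file name, and the `RunTarget` and `target_for` names are all hypothetical placeholders, not real RocketRide addresses or SDK calls.

```python
from dataclasses import dataclass

# Hypothetical endpoints for illustration only; real addresses depend on your setup.
ENGINE_ENDPOINTS = {
    "local": "ws://localhost:8080",
    "on_prem": "wss://engine.internal.example",
    "cloud": "wss://cloud.example",
}


@dataclass(frozen=True)
class RunTarget:
    pipeline_path: str  # identical in every environment
    engine_url: str     # the only part that varies


def target_for(environment, pipeline_path="my_pipeline.pipe"):
    """Pair the unchanged pipeline file with the engine for one environment."""
    try:
        url = ENGINE_ENDPOINTS[environment]
    except KeyError:
        raise ValueError(f"unknown environment: {environment}") from None
    return RunTarget(pipeline_path=pipeline_path, engine_url=url)
```

Switching from local debugging to a self-hosted or managed engine then amounts to changing the `environment` argument, with no edits to the pipeline definition.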

Talking to the engine

You never call the engine's internals directly. Clients connect over one of two protocols and the engine handles the rest:

WebSocket: the native engine protocol. The TypeScript and Python SDKs speak it for you.
MCP: exposes running pipelines as tools for AI assistants.

As pipelines run, the engine reports call trees, token usage, and memory so you can observe what happened. See Troubleshooting for reading that signal.

Next steps

Execution model: how the engine schedules and streams a run.
Nodes: the components the engine instantiates.
Self-hosting: run the engine in your own infrastructure.
