GMI Cloud
A RocketRide LLM node that connects GMI Cloud-hosted models to a pipeline through GMI Cloud's OpenAI-compatible API.
What it does
Connects GMI Cloud-hosted models to your pipeline via an OpenAI-compatible API. GMI Cloud
runs 100+ open-weight and proxied proprietary models on H100/H200 infrastructure. Used
primarily as an llm invoke connection by agents and other nodes that need an LLM. Can
also be used directly via lanes.
Built on langchain-openai's ChatOpenAI client pointed at the GMI Cloud endpoint,
with temperature fixed at 0. Rate-limit and connection errors are treated as
retryable; authentication and other API errors are mapped to friendly messages
(e.g. Invalid API key.).
Two safety behaviors are enforced on the endpoint URL: it must use HTTPS, and the
hostname must be gmi-serving.com or a subdomain of it (an SSRF guard on user-supplied
endpoints). This is checked both at save time and when the pipeline starts. An API key is
required at pipeline start; the shared endpoint https://api.gmi-serving.com/v1 is used
when no endpoint URL is configured.
Configuration
Lanes
| Lane in | Lane out | Description |
|---|---|---|
questions | answers | Send a question directly, receive a generated answer |
Fields
| Field | Type | Description |
|---|---|---|
model | string | GMI Cloud model identifier. Use org/model-name for open-weight models (e.g. deepseek-ai/DeepSeek-R1, Qwen/Qwen3-32B-FP8) or provider/model-name for proxied models (e.g. openai/gpt-4o, anthropic/claude-sonnet-4.5, google/gemini-3-flash-preview). Full list: https://www.gmicloud.ai/models |
modelTotalTokens | number | Total Tokens |
serverbase | string | Your GMI Cloud deployment endpoint URL. Deploy the model at console.gmicloud.ai, then paste the provided endpoint URL here. |
profile | string | Default "deepseek-v3". GMI Cloud LLM model |
The custom profile additionally exposes the raw model fields:
| Field | Type/Default | Description |
|---|---|---|
model | string | GMI Cloud model identifier: org/model-name for open-weight models (e.g. deepseek-ai/DeepSeek-R1) or provider/model-name for proxied models (e.g. openai/gpt-4o). Full list: https://www.gmicloud.ai/models |
modelTotalTokens | number, default 16384 | Total token (context) limit for the model |
serverbase | string, default https://api.gmi-serving.com/v1 | Endpoint URL |
Model tiers
GMI Cloud has three tiers:
Shared (always-on), available immediately at the shared endpoint, API key only:
| Profile | Model | Context |
|---|---|---|
| DeepSeek V3 (default) | deepseek-ai/DeepSeek-V3-0324 | 163,840 |
| DeepSeek V3.2 | deepseek-ai/DeepSeek-V3.2 | 163,840 |
| DeepSeek R1 | deepseek-ai/DeepSeek-R1 | 131,072 |
| DeepSeek Prover V2 | deepseek-ai/DeepSeek-Prover-V2-671B | 131,072 |
| GPT-5.2 | openai/gpt-5.2 | 128,000 |
| GPT-5.1 | openai/gpt-5.1 | 128,000 |
| GPT-5 | openai/gpt-5 | 128,000 |
| GPT-4o | openai/gpt-4o | 128,000 |
| Claude Opus 4.5 | anthropic/claude-opus-4.5 | 200,000 |
| Claude Sonnet 4.5 | anthropic/claude-sonnet-4.5 | 200,000 |
| Gemini 3.1 Pro | google/gemini-3.1-pro-preview | 128,000 |
| Gemini 3 Flash | google/gemini-3-flash-preview | 128,000 |
| Gemini 3.1 Flash Lite | google/gemini-3.1-flash-lite-preview | 128,000 |
Deploy-on-demand: deploy first at console.gmicloud.ai, then paste the provided endpoint URL into the Endpoint URL field:
| Profile | Model | Context |
|---|---|---|
| Llama 4 Scout | meta-llama/Llama-4-Scout-17B-16E-Instruct | 1,048,576 |
| Llama 4 Maverick | meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | 1,048,576 |
| Qwen3 235B | Qwen/Qwen3-235B-A22B-FP8 | 131,072 |
| Qwen3 32B | Qwen/Qwen3-32B-FP8 | 131,072 |
| Qwen3 30B | Qwen/Qwen3-30B-A3B | 131,072 |
| Qwen3 Coder 480B | Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8 | 131,072 |
| DeepSeek R1 Distill 32B | deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | 131,072 |
| DeepSeek R1 Distill 1.5B | deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | 131,072 |
Custom: specify any GMI Cloud model ID, token limit, and endpoint URL directly.
Save-time validation
When the node configuration is saved, it is validated against the live API:
- Validation is skipped when the model or API key is not set yet, or (for deploy-on-demand profiles) when the endpoint URL has not been entered yet.
- The endpoint URL is checked for HTTPS and the
gmi-serving.comdomain. - If the model name looks like a vision/multimodal model (contains
vl,vision,visual, ormultimodal), a warning is raised suggesting a vision node instead, and the API probe is skipped, since vision models may reject text-only requests. - Otherwise a 1-token chat request probes the API to confirm both the API key and the model's existence. An HTTP 429 (rate limit) during the probe means the key was accepted and is treated as valid. Other API errors surface as warnings with the HTTP status, provider error type, and message.
Authentication
Provide your GMI Cloud API key in the API Key field. For shared-tier models that is all that is required. For deploy-on-demand models (Llama, Qwen, the R1 Distills), deploy the model in the GMI Cloud console first, then paste the unique endpoint URL it gives you into the Endpoint URL field.
Upstream docs
Schema
| Field | Type | Description | Default |
|---|---|---|---|
gmi_cloud.profile | string | Model GMI Cloud LLM model | "deepseek-v3" |
gmi_cloud.serverbase | string | Endpoint URL Your GMI Cloud deployment endpoint URL. Deploy the model at console.gmicloud.ai, then paste the provided endpoint URL here. | |
model | string | Model GMI Cloud model identifier. Use org/model-name for open-weight models (e.g. deepseek-ai/DeepSeek-R1, Qwen/Qwen3-32B-FP8) or provider/model-name for proxied models (e.g. openai/gpt-4o, anthropic/claude-sonnet-4.5, google/gemini-3-flash-preview). Full list: https://www.gmicloud.ai/models | |
modelTotalTokens | number | Tokens Total Tokens |
Dependencies
openailangchain-openailangchain-corelangchain