Skip to main content
View source

GMI Cloud

View as Markdown

A RocketRide LLM node that connects GMI Cloud-hosted models to a pipeline through GMI Cloud's OpenAI-compatible API.

What it does

Connects GMI Cloud-hosted models to your pipeline via an OpenAI-compatible API. GMI Cloud runs 100+ open-weight and proxied proprietary models on H100/H200 infrastructure. Used primarily as an llm invoke connection by agents and other nodes that need an LLM. Can also be used directly via lanes.

Built on langchain-openai's ChatOpenAI client pointed at the GMI Cloud endpoint, with temperature fixed at 0. Rate-limit and connection errors are treated as retryable; authentication and other API errors are mapped to friendly messages (e.g. Invalid API key.).

Two safety behaviors are enforced on the endpoint URL: it must use HTTPS, and the hostname must be gmi-serving.com or a subdomain of it (an SSRF guard on user-supplied endpoints). This is checked both at save time and when the pipeline starts. An API key is required at pipeline start; the shared endpoint https://api.gmi-serving.com/v1 is used when no endpoint URL is configured.


Configuration

Lanes

Lane inLane outDescription
questionsanswersSend a question directly, receive a generated answer

Fields

FieldTypeDescription
modelstringGMI Cloud model identifier. Use org/model-name for open-weight models (e.g. deepseek-ai/DeepSeek-R1, Qwen/Qwen3-32B-FP8) or provider/model-name for proxied models (e.g. openai/gpt-4o, anthropic/claude-sonnet-4.5, google/gemini-3-flash-preview). Full list: https://www.gmicloud.ai/models
modelTotalTokensnumberTotal Tokens
serverbasestringYour GMI Cloud deployment endpoint URL. Deploy the model at console.gmicloud.ai, then paste the provided endpoint URL here.
profilestringDefault "deepseek-v3". GMI Cloud LLM model

The custom profile additionally exposes the raw model fields:

FieldType/DefaultDescription
modelstringGMI Cloud model identifier: org/model-name for open-weight models (e.g. deepseek-ai/DeepSeek-R1) or provider/model-name for proxied models (e.g. openai/gpt-4o). Full list: https://www.gmicloud.ai/models
modelTotalTokensnumber, default 16384Total token (context) limit for the model
serverbasestring, default https://api.gmi-serving.com/v1Endpoint URL

Model tiers

GMI Cloud has three tiers:

Shared (always-on), available immediately at the shared endpoint, API key only:

ProfileModelContext
DeepSeek V3 (default)deepseek-ai/DeepSeek-V3-0324163,840
DeepSeek V3.2deepseek-ai/DeepSeek-V3.2163,840
DeepSeek R1deepseek-ai/DeepSeek-R1131,072
DeepSeek Prover V2deepseek-ai/DeepSeek-Prover-V2-671B131,072
GPT-5.2openai/gpt-5.2128,000
GPT-5.1openai/gpt-5.1128,000
GPT-5openai/gpt-5128,000
GPT-4oopenai/gpt-4o128,000
Claude Opus 4.5anthropic/claude-opus-4.5200,000
Claude Sonnet 4.5anthropic/claude-sonnet-4.5200,000
Gemini 3.1 Progoogle/gemini-3.1-pro-preview128,000
Gemini 3 Flashgoogle/gemini-3-flash-preview128,000
Gemini 3.1 Flash Litegoogle/gemini-3.1-flash-lite-preview128,000

Deploy-on-demand: deploy first at console.gmicloud.ai, then paste the provided endpoint URL into the Endpoint URL field:

ProfileModelContext
Llama 4 Scoutmeta-llama/Llama-4-Scout-17B-16E-Instruct1,048,576
Llama 4 Maverickmeta-llama/Llama-4-Maverick-17B-128E-Instruct-FP81,048,576
Qwen3 235BQwen/Qwen3-235B-A22B-FP8131,072
Qwen3 32BQwen/Qwen3-32B-FP8131,072
Qwen3 30BQwen/Qwen3-30B-A3B131,072
Qwen3 Coder 480BQwen/Qwen3-Coder-480B-A35B-Instruct-FP8131,072
DeepSeek R1 Distill 32Bdeepseek-ai/DeepSeek-R1-Distill-Qwen-32B131,072
DeepSeek R1 Distill 1.5Bdeepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B131,072

Custom: specify any GMI Cloud model ID, token limit, and endpoint URL directly.


Save-time validation

When the node configuration is saved, it is validated against the live API:

  • Validation is skipped when the model or API key is not set yet, or (for deploy-on-demand profiles) when the endpoint URL has not been entered yet.
  • The endpoint URL is checked for HTTPS and the gmi-serving.com domain.
  • If the model name looks like a vision/multimodal model (contains vl, vision, visual, or multimodal), a warning is raised suggesting a vision node instead, and the API probe is skipped, since vision models may reject text-only requests.
  • Otherwise a 1-token chat request probes the API to confirm both the API key and the model's existence. An HTTP 429 (rate limit) during the probe means the key was accepted and is treated as valid. Other API errors surface as warnings with the HTTP status, provider error type, and message.

Authentication

Provide your GMI Cloud API key in the API Key field. For shared-tier models that is all that is required. For deploy-on-demand models (Llama, Qwen, the R1 Distills), deploy the model in the GMI Cloud console first, then paste the unique endpoint URL it gives you into the Endpoint URL field.


Upstream docs


Schema

FieldTypeDescriptionDefault
gmi_cloud.profilestringModel
GMI Cloud LLM model
"deepseek-v3"
gmi_cloud.serverbasestringEndpoint URL
Your GMI Cloud deployment endpoint URL. Deploy the model at console.gmicloud.ai, then paste the provided endpoint URL here.
modelstringModel
GMI Cloud model identifier. Use org/model-name for open-weight models (e.g. deepseek-ai/DeepSeek-R1, Qwen/Qwen3-32B-FP8) or provider/model-name for proxied models (e.g. openai/gpt-4o, anthropic/claude-sonnet-4.5, google/gemini-3-flash-preview). Full list: https://www.gmicloud.ai/models
modelTotalTokensnumberTokens
Total Tokens

Dependencies

  • openai
  • langchain-openai
  • langchain-core
  • langchain