MiniMax
A RocketRide LLM node that connects MiniMax models to a pipeline via the MiniMax cloud API or a self-hosted OpenAI-compatible server.
What it does
Provides MiniMax chat models as an llm invoke connection for agents and other nodes that need an LLM, and can also be used directly via lanes. It works against the MiniMax cloud API or a self-hosted OpenAI-compatible server (vLLM, SGLang, MLX, or Ollama).
The MiniMax API is OpenAI-compatible, so the node uses langchain-openai (ChatOpenAI) pointed at the configured base URL, with temperature: 0 and the profile's output-token limit as max_tokens. Config validation at save time runs a minimal one-token probe through the openai SDK and surfaces provider errors as warnings.
MiniMax M2-series models return chain-of-thought wrapped in <think>...</think> inside the content field; the node strips that block so downstream pipeline nodes only see the final answer.
Configuration
Lanes
| Lane in | Lane out | Description |
|---|---|---|
questions | answers | Send a question directly, receive a generated answer |
Fields
| Field | Type / Default | Description |
|---|---|---|
profile | enum, default minimax-m2 | MiniMax model profile (see below) |
model | string (set by profile) | MiniMax model id (editable on the Custom profile) |
modelTotalTokens | number (set by profile) | Total token (context) budget for the model |
serverbase | string, default https://api.minimax.io/v1 | OpenAI-compatible base URL: https://api.minimax.io/v1 for cloud (international), https://api.minimaxi.com/v1 for China, local URLs for self-hosted servers (Custom and Local profiles only) |
apikey | string | MiniMax API key (cloud profiles only; local profiles don't require one, the node passes a dummy token) |
Profiles
Cloud
| Profile | Model | Context |
|---|---|---|
| MiniMax M3 | MiniMax-M3 | 1M tokens |
| MiniMax M2 (default) | MiniMax-M2 | 200K tokens |
| MiniMax M2.1 | MiniMax-M2.1 | 200K tokens |
| MiniMax M2.1 Highspeed | MiniMax-M2.1-highspeed | 200K tokens |
| MiniMax M2.5 | MiniMax-M2.5 | 200K tokens |
| MiniMax M2.5 Highspeed | MiniMax-M2.5-highspeed | 200K tokens |
| MiniMax M2.7 | MiniMax-M2.7 | 200K tokens |
| MiniMax M2.7 Highspeed | MiniMax-M2.7-highspeed | 200K tokens |
| Custom Model | User-defined | User-defined |
The -highspeed variants are MiniMax's faster/cheaper tier of the same generation. MiniMax M3 is MiniMax's frontier multimodal coding model, with 5x the M2-family context (1M tokens) and a 128K-token recommended output limit (max 512K). M3 is multimodal at the API level (text + image + video), though the llm_minimax node only exposes the text path.
Local deploy
Defaults target vLLM / SGLang on http://localhost:8000/v1 with the HuggingFace model path, which is the configuration MiniMax itself documents in its local deployment guide.
| Profile | Model (HF path) | Server base URL (default) | Context |
|---|---|---|---|
| MiniMax M2 (Local) | MiniMaxAI/MiniMax-M2 | http://localhost:8000/v1 | 200K tokens |
| MiniMax M2.5 (Local) | MiniMaxAI/MiniMax-M2.5 | http://localhost:8000/v1 | 200K tokens |
| MiniMax M2.7 (Local) | MiniMaxAI/MiniMax-M2.7 | http://localhost:8000/v1 | 200K tokens |
Hardware notes. MiniMax's open-weight models are MIT-licensed but large: M2 / M2.5 / M2.7 are all 230B-parameter MoE architectures (~10B active per token). The recommended local setups are:
- Linux + GPU (>=96 GB VRAM total): vLLM or SGLang on port
8000. Use the HF model path as shown above. - Apple Silicon Mac Studio (>=128 GB unified memory): MLX on port
8080. Edit the Server base URL tohttp://localhost:8080/v1and change the model to a quantized MLX build, e.g.mlx-community/MiniMax-M2.7-4bit. - Ollama (<128 GB systems, fallback only): listed in MiniMax's docs as an alternative for low-memory setups. Edit the Server base URL to
http://localhost:11434/v1and the model to whatever tag you pulled (verify withollama pull <tag>before use; tags may not yet exist for every M2 variant).
These models will not fit on a typical laptop without aggressive quantization. M2.7 is the only variant whose local-deploy steps are formally documented today; the M2 and M2.5 entries are scaffolded against the same HuggingFace naming so they work as soon as their upstream guides land. M2.7 is a reasoning model: its responses split message.content (final answer) from message.reasoning_content (chain of thought), so set generous output token budgets (max_tokens >= ~200) even for short prompts.
Authentication
Cloud profiles require a MiniMax API key in apikey. The key requirement is enforced by base-URL match: if serverbase contains api.minimax (covers both api.minimax.io international and api.minimaxi.com China) and no key is set, the node raises MiniMax API key is required for cloud profiles. at startup.
Local profiles (vLLM / SGLang / MLX / Ollama) have no apikey field; local OpenAI-compatible servers accept any token, so the node passes a dummy key (sk-local-dummy-key).
Upstream docs
- MiniMax platform documentation
- MiniMax API reference (OpenAI-compatible)
- MiniMax local deployment guide
Schema
| Field | Type | Description | Default |
|---|---|---|---|
minimax.profile | string | Model MiniMax LLM model | "minimax-m2" |
minimax.serverbase | string | Server base URL OpenAI-compatible base URL for the MiniMax endpoint (e.g. https://api.minimax.io/v1 for international, https://api.minimaxi.com/v1 for China). | "https://api.minimax.io/v1" |
model | string | Model MiniMax model | |
modelTotalTokens | number | Tokens Total Tokens |
Dependencies
openailangchain-openailangchain-corelangchain