MiniMax

View as Markdown

A RocketRide LLM node that connects MiniMax models to a pipeline via the MiniMax cloud API or a self-hosted OpenAI-compatible server.

What it does

Provides MiniMax chat models as an llm invoke connection for agents and other nodes that need an LLM, and can also be used directly via lanes. It works against the MiniMax cloud API or a self-hosted OpenAI-compatible server (vLLM, SGLang, MLX, or Ollama).

The MiniMax API is OpenAI-compatible, so the node uses langchain-openai (ChatOpenAI) pointed at the configured base URL, with temperature: 0 and the profile's output-token limit as max_tokens. Config validation at save time runs a minimal one-token probe through the openai SDK and surfaces provider errors as warnings.

MiniMax M2-series models return chain-of-thought wrapped in <think>...</think> inside the content field; the node strips that block so downstream pipeline nodes only see the final answer.

Configuration

Lanes

Lane in	Lane out	Description
`questions`	`answers`	Send a question directly, receive a generated answer

Fields

Field	Type / Default	Description
`profile`	enum, default `minimax-m2`	MiniMax model profile (see below)
`model`	string (set by profile)	MiniMax model id (editable on the Custom profile)
`modelTotalTokens`	number (set by profile)	Total token (context) budget for the model
`serverbase`	string, default `https://api.minimax.io/v1`	OpenAI-compatible base URL: `https://api.minimax.io/v1` for cloud (international), `https://api.minimaxi.com/v1` for China, local URLs for self-hosted servers (Custom and Local profiles only)
`apikey`	string	MiniMax API key (cloud profiles only; local profiles don't require one, the node passes a dummy token)

Profiles

Cloud

Profile	Model	Context
MiniMax M3	`MiniMax-M3`	1M tokens
MiniMax M2 (default)	`MiniMax-M2`	200K tokens
MiniMax M2.1	`MiniMax-M2.1`	200K tokens
MiniMax M2.1 Highspeed	`MiniMax-M2.1-highspeed`	200K tokens
MiniMax M2.5	`MiniMax-M2.5`	200K tokens
MiniMax M2.5 Highspeed	`MiniMax-M2.5-highspeed`	200K tokens
MiniMax M2.7	`MiniMax-M2.7`	200K tokens
MiniMax M2.7 Highspeed	`MiniMax-M2.7-highspeed`	200K tokens
Custom Model	User-defined	User-defined

The -highspeed variants are MiniMax's faster/cheaper tier of the same generation. MiniMax M3 is MiniMax's frontier multimodal coding model, with 5x the M2-family context (1M tokens) and a 128K-token recommended output limit (max 512K). M3 is multimodal at the API level (text + image + video), though the llm_minimax node only exposes the text path.

Local deploy

Defaults target vLLM / SGLang on http://localhost:8000/v1 with the HuggingFace model path, which is the configuration MiniMax itself documents in its local deployment guide.

Profile	Model (HF path)	Server base URL (default)	Context
MiniMax M2 (Local)	`MiniMaxAI/MiniMax-M2`	`http://localhost:8000/v1`	200K tokens
MiniMax M2.5 (Local)	`MiniMaxAI/MiniMax-M2.5`	`http://localhost:8000/v1`	200K tokens
MiniMax M2.7 (Local)	`MiniMaxAI/MiniMax-M2.7`	`http://localhost:8000/v1`	200K tokens

Hardware notes. MiniMax's open-weight models are MIT-licensed but large: M2 / M2.5 / M2.7 are all 230B-parameter MoE architectures (~10B active per token). The recommended local setups are:

Linux + GPU (>=96 GB VRAM total): vLLM or SGLang on port 8000. Use the HF model path as shown above.
Apple Silicon Mac Studio (>=128 GB unified memory): MLX on port 8080. Edit the Server base URL to http://localhost:8080/v1 and change the model to a quantized MLX build, e.g. mlx-community/MiniMax-M2.7-4bit.
Ollama (<128 GB systems, fallback only): listed in MiniMax's docs as an alternative for low-memory setups. Edit the Server base URL to http://localhost:11434/v1 and the model to whatever tag you pulled (verify with ollama pull <tag> before use; tags may not yet exist for every M2 variant).

These models will not fit on a typical laptop without aggressive quantization. M2.7 is the only variant whose local-deploy steps are formally documented today; the M2 and M2.5 entries are scaffolded against the same HuggingFace naming so they work as soon as their upstream guides land. M2.7 is a reasoning model: its responses split message.content (final answer) from message.reasoning_content (chain of thought), so set generous output token budgets (max_tokens >= ~200) even for short prompts.

Authentication

Cloud profiles require a MiniMax API key in apikey. The key requirement is enforced by base-URL match: if serverbase contains api.minimax (covers both api.minimax.io international and api.minimaxi.com China) and no key is set, the node raises MiniMax API key is required for cloud profiles. at startup.

Local profiles (vLLM / SGLang / MLX / Ollama) have no apikey field; local OpenAI-compatible servers accept any token, so the node passes a dummy key (sk-local-dummy-key).

Upstream docs

Schema

Field	Type	Description	Default
`minimax.profile`	`string`	Model MiniMax LLM model	`"minimax-m2"`
`minimax.serverbase`	`string`	Server base URL OpenAI-compatible base URL for the MiniMax endpoint (e.g. https://api.minimax.io/v1 for international, https://api.minimaxi.com/v1 for China).	`"https://api.minimax.io/v1"`
`model`	`string`	Model MiniMax model
`modelTotalTokens`	`number`	Tokens Total Tokens

Dependencies

openai
langchain-openai
langchain-core
langchain

What it does​

Configuration​

Lanes​

Fields​

Profiles​

Cloud​

Local deploy​

Authentication​

Upstream docs​

Schema​

Dependencies​