Skip to main content
View source

MiniMax

View as Markdown

A RocketRide LLM node that connects MiniMax models to a pipeline via the MiniMax cloud API or a self-hosted OpenAI-compatible server.

What it does

Provides MiniMax chat models as an llm invoke connection for agents and other nodes that need an LLM, and can also be used directly via lanes. It works against the MiniMax cloud API or a self-hosted OpenAI-compatible server (vLLM, SGLang, MLX, or Ollama).

The MiniMax API is OpenAI-compatible, so the node uses langchain-openai (ChatOpenAI) pointed at the configured base URL, with temperature: 0 and the profile's output-token limit as max_tokens. Config validation at save time runs a minimal one-token probe through the openai SDK and surfaces provider errors as warnings.

MiniMax M2-series models return chain-of-thought wrapped in <think>...</think> inside the content field; the node strips that block so downstream pipeline nodes only see the final answer.


Configuration

Lanes

Lane inLane outDescription
questionsanswersSend a question directly, receive a generated answer

Fields

FieldType / DefaultDescription
profileenum, default minimax-m2MiniMax model profile (see below)
modelstring (set by profile)MiniMax model id (editable on the Custom profile)
modelTotalTokensnumber (set by profile)Total token (context) budget for the model
serverbasestring, default https://api.minimax.io/v1OpenAI-compatible base URL: https://api.minimax.io/v1 for cloud (international), https://api.minimaxi.com/v1 for China, local URLs for self-hosted servers (Custom and Local profiles only)
apikeystringMiniMax API key (cloud profiles only; local profiles don't require one, the node passes a dummy token)

Profiles

Cloud

ProfileModelContext
MiniMax M3MiniMax-M31M tokens
MiniMax M2 (default)MiniMax-M2200K tokens
MiniMax M2.1MiniMax-M2.1200K tokens
MiniMax M2.1 HighspeedMiniMax-M2.1-highspeed200K tokens
MiniMax M2.5MiniMax-M2.5200K tokens
MiniMax M2.5 HighspeedMiniMax-M2.5-highspeed200K tokens
MiniMax M2.7MiniMax-M2.7200K tokens
MiniMax M2.7 HighspeedMiniMax-M2.7-highspeed200K tokens
Custom ModelUser-definedUser-defined

The -highspeed variants are MiniMax's faster/cheaper tier of the same generation. MiniMax M3 is MiniMax's frontier multimodal coding model, with 5x the M2-family context (1M tokens) and a 128K-token recommended output limit (max 512K). M3 is multimodal at the API level (text + image + video), though the llm_minimax node only exposes the text path.

Local deploy

Defaults target vLLM / SGLang on http://localhost:8000/v1 with the HuggingFace model path, which is the configuration MiniMax itself documents in its local deployment guide.

ProfileModel (HF path)Server base URL (default)Context
MiniMax M2 (Local)MiniMaxAI/MiniMax-M2http://localhost:8000/v1200K tokens
MiniMax M2.5 (Local)MiniMaxAI/MiniMax-M2.5http://localhost:8000/v1200K tokens
MiniMax M2.7 (Local)MiniMaxAI/MiniMax-M2.7http://localhost:8000/v1200K tokens

Hardware notes. MiniMax's open-weight models are MIT-licensed but large: M2 / M2.5 / M2.7 are all 230B-parameter MoE architectures (~10B active per token). The recommended local setups are:

  • Linux + GPU (>=96 GB VRAM total): vLLM or SGLang on port 8000. Use the HF model path as shown above.
  • Apple Silicon Mac Studio (>=128 GB unified memory): MLX on port 8080. Edit the Server base URL to http://localhost:8080/v1 and change the model to a quantized MLX build, e.g. mlx-community/MiniMax-M2.7-4bit.
  • Ollama (<128 GB systems, fallback only): listed in MiniMax's docs as an alternative for low-memory setups. Edit the Server base URL to http://localhost:11434/v1 and the model to whatever tag you pulled (verify with ollama pull <tag> before use; tags may not yet exist for every M2 variant).

These models will not fit on a typical laptop without aggressive quantization. M2.7 is the only variant whose local-deploy steps are formally documented today; the M2 and M2.5 entries are scaffolded against the same HuggingFace naming so they work as soon as their upstream guides land. M2.7 is a reasoning model: its responses split message.content (final answer) from message.reasoning_content (chain of thought), so set generous output token budgets (max_tokens >= ~200) even for short prompts.


Authentication

Cloud profiles require a MiniMax API key in apikey. The key requirement is enforced by base-URL match: if serverbase contains api.minimax (covers both api.minimax.io international and api.minimaxi.com China) and no key is set, the node raises MiniMax API key is required for cloud profiles. at startup.

Local profiles (vLLM / SGLang / MLX / Ollama) have no apikey field; local OpenAI-compatible servers accept any token, so the node passes a dummy key (sk-local-dummy-key).


Upstream docs


Schema

FieldTypeDescriptionDefault
minimax.profilestringModel
MiniMax LLM model
"minimax-m2"
minimax.serverbasestringServer base URL
OpenAI-compatible base URL for the MiniMax endpoint (e.g. https://api.minimax.io/v1 for international, https://api.minimaxi.com/v1 for China).
"https://api.minimax.io/v1"
modelstringModel
MiniMax model
modelTotalTokensnumberTokens
Total Tokens

Dependencies

  • openai
  • langchain-openai
  • langchain-core
  • langchain