
MiniMax M2.5

Input: $0.24/M
Output: $0.96/M
MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained across a diverse range of complex real-world digital working environments, M2.5 builds on the coding expertise of M2.1 and extends into general office work: it reaches fluency in generating and operating Word, Excel, and PowerPoint files, switches context between diverse software environments, and works across different agent and human teams.

Technical specifications of MiniMax‑M2.5

  • Model name: MiniMax-M2.5 (production release, Feb 12, 2026)
  • Architecture: Mixture-of-Experts (MoE) Transformer (M2 family)
  • Total parameters: ~230 billion (total MoE capacity)
  • Active (per-inference) parameters: ~10 billion activated per inference (sparse activation)
  • Input types: text and code (native support for multi-file code contexts), tool-calling / API tool interfaces (agentic workflows)
  • Output types: text, structured outputs (JSON / tool calls), code (multi-file), Office artifacts (PPT/Excel/Word via tool chains)
  • Variants / modes: M2.5 (high accuracy/capability) and M2.5-Lightning (same quality, lower latency / higher TPS)

What is MiniMax‑M2.5?

MiniMax‑M2.5 is the M2.x family’s flagship update focused on real‑world productivity and agentic workflows. The release emphasizes improved task decomposition, tool/search integration, code generation fidelity, and token efficiency for extended, multi‑step problems. The model is offered in a standard and a lower‑latency “lightning” variant intended for different deployment trade‑offs.


Main features of MiniMax‑M2.5

  1. Agentic-first design: Improved planning and tool orchestration for multi‑stage tasks (search, tool calls, code execution harnesses).
  2. Token efficiency: Reported reductions in token consumption per task compared to M2.1, enabling lower end‑to‑end costs for long workflows.
  3. Faster end‑to‑end completion: Provider benchmarking reports average task completion times ~37% faster than M2.1 on agentic coding evaluations.
  4. Strong code understanding: Tuned on multi‑language code corpora for robust cross‑language refactors, multi‑file edits, and repository‑scale reasoning.
  5. High throughput serving: Targeted for production deployments with high token/sec profiles; suitable for continuous agent workloads.
  6. Variants for latency vs. power tradeoffs: M2.5‑lightning offers lower latency at lower compute and footprint for interactive scenarios.
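The agentic, tool-orchestration behavior described above is exposed through an OpenAI-compatible tool-calling interface. Below is a minimal sketch of declaring a single tool in a chat request; the `run_tests` tool name and its schema are illustrative assumptions for this example, not part of any official API:

```python
# Sketch: assemble a chat request that exposes one hypothetical tool
# to minimax-m2.5 via an OpenAI-compatible endpoint.

def build_tool_call_request(user_task: str) -> dict:
    """Return a request body declaring one illustrative tool."""
    run_tests_tool = {
        "type": "function",
        "function": {
            "name": "run_tests",  # hypothetical tool for this sketch
            "description": "Run the project's test suite and return results.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string",
                             "description": "Test file or directory to run."}
                },
                "required": ["path"],
            },
        },
    }
    return {
        "model": "minimax-m2.5",
        "messages": [{"role": "user", "content": user_task}],
        "tools": [run_tests_tool],
        "tool_choice": "auto",  # let the model decide when to call the tool
    }

request_body = build_tool_call_request("Fix the failing test in tests/test_io.py")
print(request_body["tools"][0]["function"]["name"])  # → run_tests
```

In a real agent loop, the response's `tool_calls` would be dispatched to your own tool runner and the results fed back as `tool` messages.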

Benchmark performance (reported)

Provider‑reported highlights — representative metrics (release):

  • SWE‑Bench Verified: 80.2% (reported pass rate on provider benchmark harnesses)
  • BrowseComp (search & tool use): 76.3%
  • Multi‑SWE‑Bench (multi‑language coding): 51.3%
  • Relative speed / efficiency: ~37% faster end‑to‑end completion vs M2.1 on SWE‑Bench Verified in provider tests; ~20% fewer search/tool rounds in some evaluations.

Interpretation: These numbers place M2.5 at or near parity with industry-leading agentic/code models on the cited benchmarks. Benchmarks are reported by the provider and reproduced by several ecosystem outlets; treat them as measured under the provider's harness/configuration unless independently reproduced.


MiniMax‑M2.5 vs peers (concise comparison)

Dimension by dimension (MiniMax-M2.5 vs. MiniMax M2.1 vs. a peer example, Anthropic Opus 4.6):

  • SWE-Bench Verified: M2.5 80.2%; M2.1 ~71–76% (varies by harness); Opus 4.6 comparable (reported near-top results)
  • Agentic task speed: M2.5 ~37% faster than M2.1 (provider tests); M2.1 baseline; Opus 4.6 similar speed on specific harnesses
  • Token efficiency: M2.5 improved vs. M2.1 (fewer tokens per task); M2.1 higher token use; Opus 4.6 competitive
  • Best use: M2.5 for production agentic workflows and coding pipelines; M2.1 is the earlier generation of the same family; Opus 4.6 is strong at multimodal reasoning and safety-tuned tasks

Provider note: comparisons derive from release materials and vendor benchmark reports. Small differences can be sensitive to harness, toolchain, and evaluation protocol.

Representative enterprise use cases

  1. Repository‑scale refactors & migration pipelines — preserve intent across multi‑file edits and automated PR patches.
  2. Agentic orchestration for DevOps — orchestrate test runs, CI steps, package installs, and environment diagnostics with tool integrations.
  3. Automated code review & remediation — triage vulnerabilities, propose minimal fixes, and prepare reproducible test cases.
  4. Search‑driven information retrieval — leverage BrowseComp‑level search competence to perform multi‑round exploration and summarization of technical knowledge bases.
  5. Production agents & assistants — continuous agents that require cost‑efficient, stable long‑running inference.
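For use case 3 (automated code review), one practical pattern is to request strictly structured JSON output so the model's findings can be consumed by a pipeline. A sketch follows; the JSON field names (`severity`, `findings`, `suggested_fix`) are our own illustrative schema, not a documented one:

```python
import json

def build_review_request(diff_text: str) -> dict:
    """Chat request asking minimax-m2.5 for a JSON-formatted code review.
    The field names in the system prompt are illustrative assumptions."""
    system = (
        "You are a code reviewer. Reply ONLY with JSON of the form "
        '{"severity": "low|medium|high", "findings": [...], "suggested_fix": "..."}'
    )
    return {
        "model": "minimax-m2.5",
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": f"Review this diff:\n{diff_text}"},
        ],
        # Ask the endpoint for machine-readable output (OpenAI-compatible flag)
        "response_format": {"type": "json_object"},
    }

req = build_review_request("- x = eval(user_input)\n+ x = int(user_input)")
print(json.dumps(req["response_format"]))  # → {"type": "json_object"}
```

Constraining the output shape this way makes the review step composable with triage and auto-remediation stages downstream.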

How to access and integrate MiniMax‑M2.5

Step 1: Sign Up and Get an API Key

Sign in to your CometAPI console at cometapi.com (register first if you do not have an account yet). In the personal center, open the API token page, click "Add Token", and copy the generated key (it has the form sk-xxxxx).
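The sample code further down reads the key from the `COMETAPI_KEY` environment variable. A small sketch of validating that setup before making any calls (the `sk-` prefix check mirrors the token format described above):

```python
import os

def get_cometapi_key(env=os.environ) -> str:
    """Read the CometAPI token created in Step 1 (keys look like 'sk-...')."""
    key = env.get("COMETAPI_KEY", "")
    if not key.startswith("sk-"):
        raise RuntimeError(
            "Set COMETAPI_KEY to the token from your CometAPI console."
        )
    return key

# Example with an injected environment (the real key comes from your console):
print(get_cometapi_key({"COMETAPI_KEY": "sk-demo"}))  # → sk-demo
```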

Step 2: Send Requests to the minimax-m2.5 API

Select the "minimax-m2.5" endpoint, set the request body, and send the request. The request method and body schema are documented in our website's API doc; the website also provides an Apifox test collection for convenience. Replace <YOUR_API_KEY> with your actual CometAPI key from your account. The model is called through the chat format.

Put your question or request into the content field; this is what the model will respond to.
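The chat-format request body can also be assembled as plain JSON without any SDK. A sketch using only the standard library, against the same base URL as the sample code below; the `/chat/completions` path follows the OpenAI-compatible convention and the network call is only expected to succeed with a valid key:

```python
import json
import os
import urllib.request

BASE_URL = "https://api.cometapi.com/v1"

def build_chat_body(question: str) -> dict:
    """Chat-format request body for the minimax-m2.5 endpoint."""
    return {
        "model": "minimax-m2.5",
        "messages": [{"role": "user", "content": question}],
    }

def ask(question: str) -> str:
    """POST the request and return the generated text (needs COMETAPI_KEY)."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_body(question)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['COMETAPI_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]

body = build_chat_body("What is MiniMax M2.5?")
print(body["messages"][0]["content"])  # → What is MiniMax M2.5?
```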

Step 3: Retrieve and Verify Results

Parse the API response to extract the generated answer. The response includes the task status and the output data.
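Extracting the answer and finish status can be sketched as below, assuming the response follows the OpenAI-compatible chat completion shape (the `sample` dict is fabricated for illustration):

```python
def extract_answer(response: dict) -> tuple:
    """Return (generated text, finish status) from a chat completion response."""
    choice = response["choices"][0]
    return choice["message"]["content"], choice["finish_reason"]

# A fabricated example response, for illustration only:
sample = {
    "choices": [
        {
            "message": {"role": "assistant", "content": "MiniMax M2.5 is ..."},
            "finish_reason": "stop",  # "stop" means normal completion
        }
    ]
}

answer, status = extract_answer(sample)
print(status)  # → stop
```

Checking `finish_reason` is worthwhile in agent loops: a value like "length" indicates truncated output that should not be treated as a finished answer.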

FAQ

What kind of tasks is MiniMax-M2.5 optimized for?

MiniMax-M2.5 is optimized for real-world productivity and agentic workflows — especially complex coding, multi-stage planning, tool invocation, search, and cross-platform system development. Its training emphasizes handling full development lifecycles from architecture planning to code review and testing.

How does MiniMax-M2.5 compare to previous versions like M2.1?

Compared with M2.1, M2.5 shows significant improvements in task decomposition, token efficiency, and speed — for example completing certain agentic benchmarks about 37% faster and with fewer tokens consumed per task.

What benchmarks does MiniMax-M2.5 achieve on coding and agentic tasks?

M2.5 achieves around 80.2% on SWE-Bench Verified, about 51.3% on Multi-SWE-Bench, and roughly 76.3% on BrowseComp in contexts where task planning and search are enabled — results competitive with flagship models from other providers.

Does MiniMax-M2.5 support multiple programming languages?

Yes — M2.5 was trained on over 10 programming languages including Python, Java, Rust, Go, TypeScript, C/C++, Ruby, and Dart, enabling it to handle diverse coding tasks across ecosystems.

Is MiniMax-M2.5 suitable for full-stack and cross-platform development?

Yes — MiniMax positions M2.5 to handle full-stack projects spanning Web, Android, iOS, Windows, and Mac, covering design, implementation, iteration, and testing phases.

What are the main efficiency and cost advantages of MiniMax-M2.5?

M2.5 can run at high token throughput (e.g., ~100 tokens/sec) with output prices roughly 10–20× lower than many frontier models, enabling scalable deployment of agentic workflows.
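The per-request cost at the listed CometAPI prices ($0.24/M input, $0.96/M output tokens) is simple arithmetic; a sketch:

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     in_price: float = 0.24, out_price: float = 0.96) -> float:
    """Cost of one request at per-million-token prices (USD)."""
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# e.g. a 20k-token prompt with a 5k-token answer:
print(round(request_cost_usd(20_000, 5_000), 4))  # → 0.0096
```

At these rates, even a long agentic session with hundreds of thousands of tokens stays in the sub-dollar range.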

How do I integrate MiniMax-M2.5 into my application?

MiniMax-M2.5 is available via API endpoints (e.g., standard and high-throughput variants) by specifying minimax-m2.5 as the model in requests.

What are known limitations or ideal scenarios to avoid?

M2.5 excels at coding and agentic tasks; it may be less specialized for purely creative narrative generation compared with dedicated creative models, so for story writing or creative fiction other models might be preferable.

Features for MiniMax M2.5

Explore the key features of MiniMax M2.5, designed to enhance performance and usability. Discover how these capabilities can benefit your projects and improve user experience.

Pricing for MiniMax M2.5

Explore competitive pricing for MiniMax M2.5, designed to fit various budgets and usage needs. Our flexible plans ensure you only pay for what you use, making it easy to scale as your requirements grow. Discover how MiniMax M2.5 can enhance your projects while keeping costs manageable.
  • CometAPI price: Input $0.24 / M tokens; Output $0.96 / M tokens
  • Official price: Input $0.30 / M tokens; Output $1.20 / M tokens
  • Discount: 20% below the official price

Sample code and API for MiniMax M2.5

Access comprehensive sample code and API resources for MiniMax M2.5 to streamline your integration process. Our detailed documentation provides step-by-step guidance, helping you leverage the full potential of MiniMax M2.5 in your projects.
Python
from openai import OpenAI
import os

# Get your CometAPI key from https://api.cometapi.com/console/token, and paste it here
COMETAPI_KEY = os.environ.get("COMETAPI_KEY") or "<YOUR_COMETAPI_KEY>"
BASE_URL = "https://api.cometapi.com/v1"

client = OpenAI(base_url=BASE_URL, api_key=COMETAPI_KEY)

completion = client.chat.completions.create(
    model="minimax-m2.5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a one-sentence introduction to MiniMax M2.5."},
    ],
)

print(completion.choices[0].message.content)
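For interactive scenarios (for example with the M2.5-Lightning variant), the OpenAI-compatible client also supports streaming via `stream=True`, where the text arrives as per-chunk deltas. A sketch of joining those deltas; the fake chunks below stand in for a real stream so the helper can be shown without a live API call:

```python
from types import SimpleNamespace

def accumulate_stream(chunks) -> str:
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta:  # some chunks (e.g. role headers, final chunk) carry no text
            parts.append(delta)
    return "".join(parts)

# Fake chunks standing in for the iterator returned by
# client.chat.completions.create(..., stream=True):
fake = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ["Mini", "Max ", "M2.5", None]
]
print(accumulate_stream(fake))  # → MiniMax M2.5
```

With a real client, replace `fake` with the return value of `create(..., stream=True)` and print each delta as it arrives for a responsive UI.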
