Technical specifications of MiniMax‑M2.5
| Field | Claim / value |
|---|---|
| Model name | MiniMax-M2.5 (production release, Feb 12, 2026). |
| Architecture | Mixture-of-Experts (MoE) Transformer (M2 family). |
| Total parameters | ~230 billion (total MoE capacity). |
| Active (per-inference) parameters | ~10 billion activated per inference (sparse activation). |
| Input types | Text and code (native support for multi-file code contexts), tool-calling / API tool interfaces (agentic workflows). |
| Output types | Text, structured outputs (JSON/tool calls), code (multi-file), Office artifacts (PPT/Excel/Word via tool chains). |
| Variants / modes | M2.5 (highest accuracy/capability) and M2.5-Lightning (comparable quality at lower latency and higher tokens/sec throughput). |
What is MiniMax‑M2.5?
MiniMax‑M2.5 is the flagship update of the M2.x family, focused on real‑world productivity and agentic workflows. The release emphasizes improved task decomposition, tool/search integration, code‑generation fidelity, and token efficiency on extended multi‑step problems. The model ships in a standard variant and a lower‑latency “Lightning” variant for different deployment trade‑offs.
Main features of MiniMax‑M2.5
- Agentic-first design: Improved planning and tool orchestration for multi‑stage tasks (search, tool calls, code‑execution harnesses); a minimal tool‑calling sketch follows this list.
- Token efficiency: Reported reductions in token consumption per task compared to M2.1, enabling lower end‑to‑end costs for long workflows.
- Faster end‑to‑end completion: Provider benchmarking reports average task completion times ~37% faster than M2.1 on agentic coding evaluations.
- Strong code understanding: Tuned on multi‑language code corpora for robust cross‑language refactors, multi‑file edits, and repository‑scale reasoning.
- High throughput serving: Targeted for production deployments with high token/sec profiles; suitable for continuous agent workloads.
- Variants for latency vs. capability trade‑offs: M2.5‑Lightning offers lower latency with a smaller compute footprint for interactive scenarios.
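To make the agentic/tool‑calling flow concrete, here is a minimal sketch. It assumes CometAPI exposes an OpenAI‑compatible chat completions endpoint with a `tools` parameter; the base URL, the `COMETAPI_KEY` environment variable, and the `search_docs` tool are illustrative assumptions, not confirmed details from the release materials.

```python
import json
import os

import requests

API_URL = "https://api.cometapi.com/v1/chat/completions"  # assumed; see the API doc
API_KEY = os.environ["COMETAPI_KEY"]

# A hypothetical search tool the model may choose to call during a task.
tools = [{
    "type": "function",
    "function": {
        "name": "search_docs",
        "description": "Search the internal knowledge base.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "minimax-m2.5",
        "messages": [{"role": "user", "content": "Find and summarize our retry policy."}],
        "tools": tools,
    },
    timeout=120,
)
resp.raise_for_status()
msg = resp.json()["choices"][0]["message"]

# If the model decided to call a tool, inspect the call; in a real agent loop
# you would execute it and send the result back as a "tool" message.
for call in msg.get("tool_calls", []):
    args = json.loads(call["function"]["arguments"])
    print("Model requested:", call["function"]["name"], args)
```

In a full agent loop, each tool result is appended to `messages` and the model is called again until it produces a final answer rather than another tool call.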
Benchmark performance (reported)
Representative provider‑reported metrics at release:
- SWE‑Bench Verified: 80.2% (reported pass rate on provider benchmark harnesses)
- BrowseComp (search & tool use): 76.3%
- Multi‑SWE‑Bench (multi‑language coding): 51.3%
- Relative speed / efficiency: ~37% faster end‑to‑end completion vs M2.1 on SWE‑Bench Verified in provider tests; ~20% fewer search/tool rounds in some evaluations.
Interpretation: These numbers place M2.5 at or near parity with industry‑leading agentic/code models on the cited benchmarks. The figures are provider‑reported and have been republished by several ecosystem outlets; treat them as measured under the provider’s harness and configuration unless independently reproduced.
MiniMax‑M2.5 vs peers (concise comparison)
| Dimension | MiniMax‑M2.5 | MiniMax M2.1 | Peer example (Anthropic Opus 4.6) |
|---|---|---|---|
| SWE‑Bench Verified | 80.2% | ~71–76% (varies by harness) | Comparable (Opus reported near‑top results) |
| Agentic task speed | ~37% faster vs M2.1 (provider tests) | Baseline | Similar speed on specific harnesses |
| Token efficiency | Improved vs M2.1 (fewer tokens per task) | Higher token use | Competitive |
| Best use | Production agentic workflows, coding pipelines | Earlier generation of same family | Strong at multimodal reasoning and safety‑tuned tasks |
Provider note: comparisons derive from release materials and vendor benchmark reports. Small differences can be sensitive to harness, toolchain, and evaluation protocol.
Representative enterprise use cases
- Repository‑scale refactors & migration pipelines — preserve intent across multi‑file edits and automated PR patches.
- Agentic orchestration for DevOps — orchestrate test runs, CI steps, package installs, and environment diagnostics with tool integrations.
- Automated code review & remediation — triage vulnerabilities, propose minimal fixes, and prepare reproducible test cases.
- Search‑driven information retrieval — leverage BrowseComp‑level search competence to perform multi‑round exploration and summarization of technical knowledge bases.
- Production agents & assistants — continuous agents that require cost‑efficient, stable long‑running inference.
How to access and integrate MiniMax‑M2.5
Step 1: Sign Up for API Key
Log in to cometapi.com; if you do not have an account yet, register first. In the CometAPI console, open the API token section of the personal center, click “Add Token”, and copy the generated key (it has the form sk-xxxxx). This key is the credential for every API request.
Step 2: Send Requests to minimax-m2.5 API
Call the chat endpoint with “minimax-m2.5” as the model and set the request body as described in the API doc on the website (an Apifox collection is also provided for convenience). Replace <YOUR_API_KEY> with the actual CometAPI key from your account. The model is called in chat format: put your question or instruction in the content field of a message; this is what the model will respond to. A minimal request sketch follows.
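A minimal sketch in Python, assuming CometAPI exposes an OpenAI‑compatible chat completions endpoint; the base URL and the `COMETAPI_KEY` environment variable are assumptions, so check the API doc for the exact values.

```python
import os

import requests

# Assumed endpoint; confirm the exact base URL in the CometAPI doc.
API_URL = "https://api.cometapi.com/v1/chat/completions"
API_KEY = os.environ["COMETAPI_KEY"]  # the sk-... token from Step 1

payload = {
    "model": "minimax-m2.5",
    "messages": [
        # The content field carries the question the model will answer.
        {"role": "user", "content": "Explain the trade-offs of MoE inference."}
    ],
}

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,  # requests sets Content-Type: application/json
    timeout=120,
)
resp.raise_for_status()  # fail fast on HTTP errors
data = resp.json()
```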
Step 3: Retrieve and Verify Results
Parse the API response to extract the generated answer. The response also carries the completion status; verify it before using the output downstream.
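Continuing the sketch above, and still assuming an OpenAI‑style response shape (field names may differ in the actual CometAPI schema; consult the API doc):

```python
# Pull the first completion from the assumed OpenAI-style response.
choice = data["choices"][0]
answer = choice["message"]["content"]   # the generated text
finish = choice.get("finish_reason")    # e.g. "stop" when generation completed

# A finish_reason other than "stop" (e.g. "length") signals truncation.
if finish != "stop":
    print(f"Warning: generation ended with finish_reason={finish!r}")
print(answer)
```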