MiniMax‑M2.5 的技術規格

欄位	說明 / 值
模型名稱	MiniMax-M2.5（正式版，2026 年 2 月 12 日）。
架構	Mixture-of-Experts（MoE）Transformer（M2 系列）。
總參數	約 2300 億（MoE 總容量）。
啟用參數（每次推理）	每次推理約啟用 100 億（稀疏激活）。
輸入類型	文字與程式碼（原生支援多檔案程式碼上下文）、工具呼叫 / API 工具介面（代理式工作流程）。
輸出類型	文字、結構化輸出（JSON/工具呼叫）、程式碼（多檔案）、Office 成果物（透過工具鏈生成 PPT/Excel/Word）。
變體 / 模式	M2.5（高準確度/能力）與 M2.5-Lightning（同等品質，更低延遲 / 更高 TPS）。

什麼是 MiniMax‑M2.5？

MiniMax‑M2.5 是 M2.x 系列的旗艦更新，聚焦於「真實世界的生產力與代理式工作流程」。本次發佈重點強化任務分解、工具/搜尋整合、程式碼生成忠實度，以及在延展的多步驟問題上的 Token 效率。該模型提供標準版與較低延遲的「lightning」變體，以滿足不同部署取捨。

MiniMax‑M2.5 的主要特性

代理優先設計：針對多階段任務（搜尋、工具呼叫、程式碼執行框架）改進規劃與工具協調。
Token 效率：據報告相較於 M2.1 每項任務的 Token 消耗降低，讓長流程的端到端成本更低。
更快的端到端完成：供應商基準測試顯示，在代理式編碼評估上平均任務完成時間比 M2.1 快約 37%。
強大的程式碼理解能力：以多語言程式碼語料調校，支援穩健的跨語言重構、多檔案編輯與倉庫級推理。
高吞吐量服務：面向高 Token/秒配置的生產部署；適用於持續的代理工作負載。
針對延遲與性能權衡的變體：M2.5‑lightning 在互動場景下以較低的計算與佔用提供更低延遲。

基準測試表現（報告）

供應商報告亮點——代表性指標（版本）：

SWE‑Bench Verified：80.2%（供應商基準測試框架上的通過率）
BrowseComp（搜尋與工具使用）：76.3%
Multi‑SWE‑Bench（多語言編碼）：51.3%
相對速度 / 效率：在供應商測試中，相較 M2.1，於 SWE‑Bench Verified 的端到端完成提升約 37%；在部分評估中搜尋/工具輪次減少約 20%。

解讀：上述數據顯示 M2.5 在所述基準上與業界領先的代理/程式碼模型相當或接近。基準由供應商報告，並由多個生態渠道轉載——除非獨立復現，應視為在供應商的測試架構/設定下測得。

MiniMax‑M2.5 與同類（簡要比較）

維度	MiniMax‑M2.5	MiniMax M2.1	同業範例（Anthropic Opus 4.6）
SWE‑Bench Verified	80.2%	約 71–76%（依測試框架而異）	可比（Opus 報告為近頂尖結果）
代理式任務速度	相較 M2.1 快 37%（供應商測試）	基線	在特定測試框架上速度相近
Token 效率	相較 M2.1 改善（每任務較少 Token）	Token 使用較高	具競爭力
最佳用途	生產級代理式工作流程、編碼管線	同系列的早期世代	擅長多模態推理與安全調校任務

供應商註：比較來源於發佈資料與廠商基準報告。細微差異可能受測試框架、工具鏈與評估協議影響。

代表性的企業用例

倉庫級重構與遷移管線——在多檔案編輯與自動化 PR 修補中保留意圖。
面向 DevOps 的代理式協調——透過工具整合協調測試執行、CI 步驟、套件安裝與環境診斷。
自動化程式碼審查與修復——分級處理漏洞、提出最小修補，並準備可複現的測試案例。
以搜尋驅動的資訊檢索——運用 BrowseComp 等級的搜尋能力，對技術知識庫進行多輪探索與摘要。
生產級代理與助理——適用於需要具成本效率且穩定長時間推理的持續代理。

如何取得並整合 MiniMax‑M2.5

步驟 1：註冊取得 API 金鑰

登入 cometapi.com。若您尚未成為我們的使用者，請先註冊。登入您的 CometAPI 控制台。取得介面的存取憑證 API 金鑰。在個人中心的 API token 處點選「Add Token」，取得 token key：sk-xxxxx 並提交。

步驟 2：向 `minimax-m2.5` API 發送請求

選擇「minimax-m2.5」端點發送 API 請求並設定請求體。請求方法與請求體可從我們網站的 API 文件獲取。我們的網站亦提供 Apifox 測試以供方便。將 <YOUR_API_KEY> 替換為您帳號中的實際 CometAPI 金鑰。調用位置：Chat 格式。

將您的問題或請求插入 content 欄位——模型將對此作出回應。處理 API 響應以獲取生成的答案。

步驟 3：擷取並驗證結果

處理 API 響應以獲取生成的答案。處理後，API 會回傳任務狀態與輸出資料。

MiniMax-M2.5 is optimized for real-world productivity and agentic workflows — especially complex coding, multi-stage planning, tool invocation, search, and cross-platform system development. Its training emphasizes handling full development lifecycles from architecture planning to code review and testing.

Compared with M2.1, M2.5 shows significant improvements in task decomposition, token efficiency, and speed — for example completing certain agentic benchmarks about 37% faster and with fewer tokens consumed per task.

M2.5 achieves around 80.2% on SWE-Bench Verified, about 51.3% on Multi-SWE-Bench, and roughly 76.3% on BrowseComp in contexts where task planning and search are enabled — results competitive with flagship models from other providers.

Yes — M2.5 was trained on over 10 programming languages including Python, Java, Rust, Go, TypeScript, C/C++, Ruby, and Dart, enabling it to handle diverse coding tasks across ecosystems.

Yes — MiniMax positions M2.5 to handle full-stack projects spanning Web, Android, iOS, Windows, and Mac, covering design, implementation, iteration, and testing phases.

M2.5 can run at high token throughput (e.g., ~100 tokens/sec) with cost efficiencies about 10–20× lower than many frontier models on an output price basis, enabling scalable deployment of agentic workflows.

MiniMax-M2.5 is available via API endpoints (e.g., standard and high-throughput variants) by specifying minimax-m2.5 as the model in requests.

M2.5 excels at coding and agentic tasks; it may be less specialized for purely creative narrative generation compared with dedicated creative models, so for story writing or creative fiction other models might be preferable.