Xiaomi MiMo-V2-Pro の技術仕様

項目	Xiaomi MiMo-V2-Pro
提供元	Xiaomi
モデルID	mimo-v2-pro
モデルファミリー	MiMo-V2
モデル種別	エージェント型基盤モデル／推論モデル
主入力	テキスト
主出力	テキスト
コンテキストウィンドウ	最大 1,000,000 トークン
総パラメータ数	1兆超
アクティブパラメータ数	420億
アーキテクチャ	ハイブリッド・アテンション MoE
リリース時期	2026年3月
ベンチマーク指標	Artificial Analysis Intelligence Index: 世界第8位; PinchBench: 世界第3位

Xiaomi MiMo-V2-Pro とは？

Xiaomi MiMo-V2-Pro は、実世界のエージェント業務向けに設計された Xiaomi のフラッグシップ MiMo モデルです。複雑なワークフローをオーケストレーションし、プロダクションエンジニアリングのタスクを処理し、長く多段のジョブにおいても信頼性高く動作し続けるエージェントシステムの中核となるモデルだと Xiaomi は説明しています。

Xiaomi MiMo-V2-Pro の主な特長

エージェント優先の設計: チャット型の回答だけでなく、ワークフロー、ツール使用、タスク実行に向けて構築。
超長コンテキスト: 最大 1,000,000 トークンをサポートし、巨大なコードベース、長大なドキュメント、長期のタスクトレースに実用的。
大規模 MoE: 総パラメータは 1T 超、アクティブパラメータは 42B。ハイブリッド・アテンションと組み合わせて効率化。
強力なコーディング能力: 社内評価で Claude 4.6 Sonnet を上回ると Xiaomi は述べています。
信頼性の高いツール呼び出し: エージェントのスキャフォールドにおけるツール呼び出しの安定性と精度が向上。
フレームワーク親和性: Xiaomi は、OpenClaw、OpenCode、KiloCode、Blackbox、Cline などのエージェントフレームワークと組み合わせているとしています。

Xiaomi MiMo-V2-Pro のベンチマーク性能

Xiaomi の 2026年3月の資料では、MiMo-V2-Pro は Artificial Analysis Intelligence Index で世界第8位、PinchBench の平均タスク完了率で世界第3位と位置付けられています。Xiaomi はまた ClawEval スコア 61.5 を報告しており、同ベンチマークにおいて Claude Opus 4.6 に近く、GPT-5.2 を上回ると説明しています。

Xiaomi MiMo-V2-Pro と MiMo-V2-Flash と MiMo-V2-Omni の比較

モデル	最適用途	主な違い
MiMo-V2-Flash	高速かつ効率的なテキスト推論	効率重視に調整された小型 MoE モデル; 総 309B / アクティブ 15B
MiMo-V2-Pro	深いエージェント型推論と長尺ワークフロー	1M トークンのコンテキストと 1T+ パラメータを備えたフラッグシップのテキストエージェントモデル
MiMo-V2-Omni	マルチモーダルの理解と実行	テキスト、ビジョン、音声を統合し、マルチモーダルなエージェントタスクに対応

Xiaomi MiMo-V2-Pro を使うべきとき

長いコンテキストが必要な推論、多段のエージェントオーケストレーション、コード中心のワークフロー、プロダクション水準のタスク実行が求められる場合に MiMo-V2-Pro を使用してください。速度よりも深さが重要なときは MiMo-V2-Flash より適しており、ワークロードがマルチモーダルではなくテキスト中心の場合は MiMo-V2-Omni より適しています。

制限事項

MiMo-V2-Pro はテキスト優先のエージェントモデルとして位置付けられているため、ネイティブなマルチモーダル作業は MiMo-V2-Omni がより適しています。あらゆるベンチマーク主導のモデルと同様に、実際の結果はプロンプト設計、ツール品質、およびエージェントをスタックにどう組み込むかに左右されます。

MiMo-V2-Pro is Xiaomi’s flagship agentic model for deeper workflows, while MiMo-V2-Flash is the efficiency-focused sibling. Xiaomi says Pro is built for real-world agent tasks, with over 1 trillion total parameters, 42 billion active parameters, and a 1 million-token context window.

Xiaomi says MiMo-V2-Pro supports up to 1 million tokens of context. That is the key spec to know if you need to keep huge codebases, long documents, or extended task histories in one run.

Yes. Xiaomi positions MiMo-V2-Pro as a model for production engineering tasks, complex workflows, and agent scaffolds. The company also says its coding ability surpasses Claude 4.6 Sonnet in internal evaluations.

Use MiMo-V2-Pro when your workload is text-first and centered on reasoning, code, or tool orchestration. Use MiMo-V2-Omni when you need native multimodal understanding across text, vision, and speech.

Xiaomi reports MiMo-V2-Pro at 61.5 on ClawEval, compared with 66.3 for Claude Opus 4.6 and 50.0 for GPT-5.2 on the same chart. Xiaomi also says Pro is close to Opus 4.6 on general agent performance and ranks #8 globally on the Artificial Analysis Intelligence Index.

MiMo-V2-Pro is optimized for agentic text workflows, so it is not the family member to choose for native multimodal input. For image, video, or speech-heavy jobs, Xiaomi’s MiMo-V2-Omni is the better match.

OpenClaw documents the Xiaomi provider as OpenAI-compatible, which means you can use an OpenAI-style client with Xiaomi’s base URL and model ID. In practice, that makes it straightforward to swap in mimo-v2-pro as the model name while keeping your existing chat-completions flow.

Yes. The 1 million-token context window makes MiMo-V2-Pro a strong fit for very long source documents, support tickets, policy packs, or repository-scale analysis where smaller-context models would truncate too early.