© 2026 CometAPI · All rights reserved

mimo-v2-pro

Input:$0.8/M
Output:$2.4/M
MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and a 1M context length, deeply optimized for agentic scenarios. It is highly adaptable to general agent frameworks like OpenClaw. It ranks among the global top tier in the standard PinchBench and ClawBench benchmarks, with perceived performance approaching that of Opus 4.6. MiMo-V2-Pro is designed to serve as the brain of agent systems, orchestrating complex workflows, driving production engineering tasks, and delivering results reliably.

Technical specifications of Xiaomi MiMo-V2-Pro

Provider: Xiaomi
Model ID: mimo-v2-pro
Model family: MiMo-V2
Model type: Agentic foundation model / reasoning model
Primary input: Text
Primary output: Text
Context window: Up to 1,000,000 tokens
Total parameters: Over 1 trillion
Active parameters: 42 billion
Architecture: Hybrid-attention MoE
Release window: March 2026
Benchmark signal: Artificial Analysis Intelligence Index #8 globally; PinchBench #3 globally

What is Xiaomi MiMo-V2-Pro?

Xiaomi MiMo-V2-Pro is Xiaomi’s flagship MiMo model for real-world agentic work. Xiaomi describes it as the model behind agent systems that orchestrate complex workflows, handle production engineering tasks, and keep operating reliably across long, multi-step jobs.

Main features of Xiaomi MiMo-V2-Pro

  • Agent-first design: built for workflows, tool use, and task execution rather than only chat-style answers.
  • Ultra-long context: supports up to 1 million tokens, which makes it practical for huge codebases, long documents, and extended task traces.
  • Large MoE scale: more than 1T total parameters with 42B active parameters, paired with hybrid attention for efficiency.
  • Strong coding ability: Xiaomi says its coding performance surpasses Claude 4.6 Sonnet in internal evaluations.
  • Reliable tool calling: Xiaomi highlights improved tool-call stability and accuracy for agent scaffolds.
  • Framework-friendly: Xiaomi says the model is being paired with agent frameworks such as OpenClaw, OpenCode, KiloCode, Blackbox, and Cline.
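The tool-calling claim above is easiest to picture with a concrete schema. Below is a minimal sketch of a tool definition in the OpenAI function-calling format that OpenAI-compatible agent scaffolds typically send alongside a chat request; the `run_shell` tool and its fields are hypothetical illustrations, not part of Xiaomi's API.

```python
# Hypothetical tool definition in the OpenAI function-calling format.
# An agent scaffold passes a list like this via the `tools` parameter;
# the model then emits structured tool calls instead of free text.
tools = [
    {
        "type": "function",
        "function": {
            "name": "run_shell",  # hypothetical tool name
            "description": "Run a shell command and return its output.",
            "parameters": {
                "type": "object",
                "properties": {
                    "command": {
                        "type": "string",
                        "description": "The command to execute.",
                    },
                },
                "required": ["command"],
            },
        },
    }
]
```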

Benchmark performance of Xiaomi MiMo-V2-Pro

Xiaomi’s March 2026 materials place MiMo-V2-Pro at #8 worldwide on the Artificial Analysis Intelligence Index and #3 globally on PinchBench average task completion rate. Xiaomi also reports a ClawEval score of 61.5, which it describes as close to Claude Opus 4.6 and ahead of GPT-5.2 on that benchmark.

Xiaomi MiMo-V2-Pro vs MiMo-V2-Flash vs MiMo-V2-Omni

  • MiMo-V2-Flash: best for fast, efficient text reasoning. A smaller MoE model tuned for efficiency (309B total / 15B active parameters).
  • MiMo-V2-Pro: best for deep agentic reasoning and long workflows. The flagship text agent model with a 1M-token context and 1T+ total parameters.
  • MiMo-V2-Omni: best for multimodal understanding and execution. Unifies text, vision, and speech for multimodal agent tasks.

When to use Xiaomi MiMo-V2-Pro

Use MiMo-V2-Pro when you need long-context reasoning, multi-step agent orchestration, code-heavy workflows, or production-style task execution. Choose it over MiMo-V2-Flash when depth matters more than speed, and over MiMo-V2-Omni when your workload is text-first rather than multimodal.

Limitations

MiMo-V2-Pro is positioned as a text-first agent model, so native multimodal work is better handled by MiMo-V2-Omni. As with any benchmark-led model, real results will still depend on prompt design, tool quality, and how the agent is wired into your stack.

FAQ

What makes Xiaomi MiMo-V2-Pro API different from MiMo-V2-Flash?

MiMo-V2-Pro is Xiaomi’s flagship agentic model for deeper workflows, while MiMo-V2-Flash is the efficiency-focused sibling. Xiaomi says Pro is built for real-world agent tasks, with over 1 trillion total parameters, 42 billion active parameters, and a 1 million-token context window.

How large is the Xiaomi MiMo-V2-Pro API context window?

Xiaomi says MiMo-V2-Pro supports up to 1 million tokens of context. That is the key spec to know if you need to keep huge codebases, long documents, or extended task histories in one run.
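Before sending a very large document, a back-of-the-envelope check helps. The sketch below uses a rough heuristic of about 4 characters per token for English text; that ratio is an assumption, and an accurate count requires the model's own tokenizer.

```python
# Context window reported for MiMo-V2-Pro.
CONTEXT_WINDOW = 1_000_000

def fits_in_context(text, reserved_for_output=8_000, chars_per_token=4):
    """Rough heuristic check (~4 chars/token for English, an assumption).

    Returns True when the estimated prompt tokens plus the budget
    reserved for the model's answer fit inside the context window.
    """
    estimated_tokens = len(text) / chars_per_token
    return estimated_tokens + reserved_for_output <= CONTEXT_WINDOW
```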

Can Xiaomi MiMo-V2-Pro API handle coding and multi-step agent workflows?

Yes. Xiaomi positions MiMo-V2-Pro as a model for production engineering tasks, complex workflows, and agent scaffolds. The company also says its coding ability surpasses Claude 4.6 Sonnet in internal evaluations.

When should I use Xiaomi MiMo-V2-Pro API instead of MiMo-V2-Omni?

Use MiMo-V2-Pro when your workload is text-first and centered on reasoning, code, or tool orchestration. Use MiMo-V2-Omni when you need native multimodal understanding across text, vision, and speech.

How does Xiaomi MiMo-V2-Pro API compare with Claude Opus 4.6 and GPT-5.2?

Xiaomi reports MiMo-V2-Pro at 61.5 on ClawEval, compared with 66.3 for Claude Opus 4.6 and 50.0 for GPT-5.2 on the same chart. Xiaomi also says Pro is close to Opus 4.6 on general agent performance and ranks #8 globally on the Artificial Analysis Intelligence Index.

What are the known limitations of Xiaomi MiMo-V2-Pro API?

MiMo-V2-Pro is optimized for agentic text workflows, so it is not the family member to choose for native multimodal input. For image, video, or speech-heavy jobs, Xiaomi’s MiMo-V2-Omni is the better match.

How do I integrate Xiaomi MiMo-V2-Pro API with an OpenAI-compatible client?

OpenClaw documents the Xiaomi provider as OpenAI-compatible, which means you can use an OpenAI-style client with Xiaomi’s base URL and model ID. In practice, that makes it straightforward to swap in mimo-v2-pro as the model name while keeping your existing chat-completions flow.
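Concretely, "OpenAI-compatible" means the only provider-specific pieces are the base URL and the model ID; the request shape itself is the standard chat-completions format. A minimal sketch (the `chat_request` helper is hypothetical; the base URL and model ID are the ones used elsewhere on this page):

```python
def chat_request(model, messages, base_url="https://api.cometapi.com/v1"):
    """Build the endpoint URL and JSON-serializable body for an
    OpenAI-style chat completion. Swapping providers only changes
    base_url and model; everything else stays the same."""
    url = f"{base_url}/chat/completions"
    body = {"model": model, "messages": messages}
    return url, body

url, body = chat_request(
    "mimo-v2-pro",
    [{"role": "user", "content": "Hello"}],
)
```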

Is Xiaomi MiMo-V2-Pro API suitable for long document analysis?

Yes. The 1 million-token context window makes MiMo-V2-Pro a strong fit for very long source documents, support tickets, policy packs, or repository-scale analysis where smaller-context models would truncate too early.

Features for mimo-v2-pro

Explore the key features of mimo-v2-pro, designed to enhance performance and usability. Discover how these capabilities can benefit your projects and improve user experience.

Pricing for mimo-v2-pro

Explore competitive pricing for mimo-v2-pro, designed to fit various budgets and usage needs. Our flexible plans ensure you only pay for what you use, making it easy to scale as your requirements grow. Discover how mimo-v2-pro can enhance your projects while keeping costs manageable.
Comet Price (USD / M Tokens): Input $0.8 · Output $2.4
Official Price (USD / M Tokens): Input $1 · Output $3
Discount: -20%
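To turn the per-million-token rates into a per-call figure, multiply each token count by its rate. A quick sketch using the Comet rates for mimo-v2-pro (the 200k-input / 4k-output token counts are an illustrative assumption):

```python
# CometAPI rates for mimo-v2-pro, converted to USD per token.
INPUT_RATE = 0.8 / 1_000_000
OUTPUT_RATE = 2.4 / 1_000_000

def estimate_cost(input_tokens, output_tokens):
    """Estimated USD cost of one call at the Comet rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a 200k-token prompt with a 4k-token answer.
cost = estimate_cost(200_000, 4_000)  # about $0.17
```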

Sample code and API for mimo-v2-pro

Access comprehensive sample code and API resources for mimo-v2-pro to streamline your integration process. Our detailed documentation provides step-by-step guidance, helping you leverage the full potential of mimo-v2-pro in your projects.
POST /v1/chat/completions
POST /v1/messages

Python Code Example

from openai import OpenAI
import os

# Get your CometAPI key from https://api.cometapi.com/console/token, and paste it here
COMETAPI_KEY = os.environ.get("COMETAPI_KEY") or "<YOUR_COMETAPI_KEY>"

client = OpenAI(api_key=COMETAPI_KEY, base_url="https://api.cometapi.com/v1")

stream = client.chat.completions.create(
    model="mimo-v2-pro",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain the Monty Hall problem step by step."},
    ],
    stream=True,
    extra_body={"thinking": {"type": "enabled"}},
)

thinking = False
for chunk in stream:
    delta = chunk.choices[0].delta
    reasoning = (delta.model_extra or {}).get("reasoning_content")
    if reasoning:
        if not thinking:
            print("<thinking>")
            thinking = True
        print(reasoning, end="", flush=True)
    elif thinking and delta.content:
        print("\n</thinking>\n")
        thinking = False
        print(delta.content, end="", flush=True)
    elif delta.content:
        print(delta.content, end="", flush=True)

JavaScript Code Example

import OpenAI from "openai";

// Get your CometAPI key from https://api.cometapi.com/console/token, and paste it here
const api_key = process.env.COMETAPI_KEY || "<YOUR_COMETAPI_KEY>";

const client = new OpenAI({ apiKey: api_key, baseURL: "https://api.cometapi.com/v1" });

const stream = await client.chat.completions.create({
  model: "mimo-v2-pro",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain the Monty Hall problem step by step." },
  ],
  stream: true,
  thinking: { type: "enabled" },
});

let thinking = false;
for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta ?? {};
  const reasoning = delta.reasoning_content;
  if (reasoning) {
    if (!thinking) { process.stdout.write("<thinking>\n"); thinking = true; }
    process.stdout.write(reasoning);
  } else if (thinking && delta.content) {
    process.stdout.write("\n</thinking>\n\n");
    thinking = false;
    process.stdout.write(delta.content);
  } else if (delta.content) {
    process.stdout.write(delta.content);
  }
}

Curl Code Example

# Get your CometAPI key from https://api.cometapi.com/console/token
# Export it as: export COMETAPI_KEY="your-key-here"

curl https://api.cometapi.com/v1/chat/completions \
  -H "Authorization: Bearer $COMETAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mimo-v2-pro",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain the Monty Hall problem step by step."}
    ],
    "stream": true,
    "thinking": {"type": "enabled"}
  }'
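The streaming response from the curl call above arrives as server-sent events, one `data: {...}` line per chunk. A minimal sketch of pulling the text out of each line (the `parse_sse_line` helper is hypothetical; it assumes the standard OpenAI streaming chunk shape):

```python
import json

def parse_sse_line(line):
    """Extract the delta text from one SSE line of an OpenAI-style
    streaming response; returns None for keep-alives and [DONE]."""
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return None
    chunk = json.loads(payload)
    return chunk["choices"][0]["delta"].get("content")
```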

More Models

Claude Opus 4.7

Input:$4/M
Output:$20/M
Claude Opus 4.7 is a hybrid reasoning model designed specifically for frontier-level coding, AI agents, and complex multi-step professional work. Unlike lighter models (e.g., Sonnet or Haiku variants), Opus 4.7 prioritizes depth, consistency, and autonomy on the hardest tasks.
Claude Sonnet 4.6

Input:$2.4/M
Output:$12/M
Claude Sonnet 4.6 is our most capable Sonnet model yet. It’s a full upgrade of the model’s skills across coding, computer use, long-context reasoning, agent planning, knowledge work, and design. Sonnet 4.6 also features a 1M token context window in beta.
Grok 4.3

Input:$1/M
Output:$2/M
Excels at agentic reasoning, knowledge work, and tool use.
GPT 5.5 Pro

Input:$24/M
Output:$144/M
An advanced model engineered for extremely complex logic and professional demands, representing the highest standard of deep reasoning and precise analytical capabilities.
GPT 5.5

Input:$4/M
Output:$24/M
A next-generation multimodal flagship model balancing exceptional performance with efficient response, dedicated to providing comprehensive and stable general-purpose AI services.
GPT Image 2 ALL

Per Request:$0.04
GPT Image 2 is OpenAI's state-of-the-art image generation model for fast, high-quality image generation and editing. It supports flexible image sizes and high-fidelity image inputs.

Related Blog

MiMo V2 Pro vs Omni vs Flash: How should I choose in 2026?
Mar 26, 2026

MiMo V2 Pro is the flagship choice for demanding agentic work, MiMo V2 Omni is the multimodal specialist for image, video, audio, and tool-using agents, and MiMo V2 Flash is the fast, low-cost open-source option for reasoning, coding, and everyday agent workflows.
How to Use MiMo V2 API for Free in 2026: Complete Guide (Pro, Omni & Flash)
Mar 25, 2026

To use MiMo V2 API for free, get free quota via CometAPI or self-host the open-source weights on Hugging Face. For Pro and Omni, leverage OpenRouter routing, CometAPI aggregation, or Puter.js user-pays proxies. All models use a standard OpenAI-compatible endpoint. Official Xiaomi pricing starts at $1/$3 per million tokens for Pro (cheaper than Claude Opus 4.6), but free tiers and aggregators make high-performance agentic AI accessible without upfront costs.
OpenRouter vs CometAPI: A Comprehensive Comparison
Aug 25, 2025

This article provides a comprehensive comparison of OpenRouter and CometAPI across critical dimensions—architecture, model coverage, pricing, performance, security, developer experience, and use cases—to help you determine which platform best aligns with your requirements.