© 2026 CometAPI · All rights reserved

mimo-v2-pro

Input:$0.8/M
Output:$2.4/M
MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and a 1M context length, deeply optimized for agentic scenarios. It is highly adaptable to general agent frameworks like OpenClaw. It ranks among the global top tier in the standard PinchBench and ClawBench benchmarks, with perceived performance approaching that of Opus 4.6. MiMo-V2-Pro is designed to serve as the brain of agent systems, orchestrating complex workflows, driving production engineering tasks, and delivering results reliably.

Technical specifications of Xiaomi MiMo-V2-Pro

Provider: Xiaomi
Model ID: mimo-v2-pro
Model family: MiMo-V2
Model type: Agentic foundation model / reasoning model
Primary input: Text
Primary output: Text
Context window: Up to 1,000,000 tokens
Total parameters: Over 1 trillion
Active parameters: 42 billion
Architecture: Hybrid-attention MoE
Release window: March 2026
Benchmark signal: Artificial Analysis Intelligence Index #8 globally; PinchBench #3 globally

What is Xiaomi MiMo-V2-Pro?

Xiaomi MiMo-V2-Pro is Xiaomi’s flagship MiMo model for real-world agentic work. Xiaomi describes it as the model behind agent systems that orchestrate complex workflows, handle production engineering tasks, and keep operating reliably across long, multi-step jobs.

Main features of Xiaomi MiMo-V2-Pro

  • Agent-first design: built for workflows, tool use, and task execution rather than only chat-style answers.
  • Ultra-long context: supports up to 1 million tokens, which makes it practical for huge codebases, long documents, and extended task traces.
  • Large MoE scale: more than 1T total parameters with 42B active parameters, paired with hybrid attention for efficiency.
  • Strong coding ability: Xiaomi says its coding performance surpasses Claude 4.6 Sonnet in internal evaluations.
  • Reliable tool calling: Xiaomi highlights improved tool-call stability and accuracy for agent scaffolds.
  • Framework-friendly: Xiaomi says the model is being paired with agent frameworks such as OpenClaw, OpenCode, KiloCode, Blackbox, and Cline.
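The tool-calling claim above is easiest to picture with a concrete schema. Below is a minimal sketch of a tool definition in the OpenAI function-calling format that OpenAI-compatible agent scaffolds typically send alongside a chat request; the `run_shell` tool and its fields are hypothetical illustrations, not part of Xiaomi's API.

```python
# Hypothetical tool definition in the OpenAI function-calling format.
# An agent scaffold passes a list like this via the `tools` parameter;
# the model then emits structured tool calls instead of free text.
tools = [
    {
        "type": "function",
        "function": {
            "name": "run_shell",  # hypothetical tool name
            "description": "Run a shell command and return its output.",
            "parameters": {
                "type": "object",
                "properties": {
                    "command": {
                        "type": "string",
                        "description": "The command to execute.",
                    },
                },
                "required": ["command"],
            },
        },
    }
]
```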

Benchmark performance of Xiaomi MiMo-V2-Pro

Xiaomi’s March 2026 materials place MiMo-V2-Pro at #8 worldwide on the Artificial Analysis Intelligence Index and #3 globally on PinchBench average task completion rate. Xiaomi also reports a ClawEval score of 61.5, which it describes as close to Claude Opus 4.6 and ahead of GPT-5.2 on that benchmark.

Xiaomi MiMo-V2-Pro vs MiMo-V2-Flash vs MiMo-V2-Omni

  • MiMo-V2-Flash: best for fast, efficient text reasoning. A smaller MoE model tuned for efficiency (309B total / 15B active parameters).
  • MiMo-V2-Pro: best for deep agentic reasoning and long workflows. The flagship text agent model with a 1M-token context and 1T+ total parameters.
  • MiMo-V2-Omni: best for multimodal understanding and execution. Unifies text, vision, and speech for multimodal agent tasks.

When to use Xiaomi MiMo-V2-Pro

Use MiMo-V2-Pro when you need long-context reasoning, multi-step agent orchestration, code-heavy workflows, or production-style task execution. Choose it over MiMo-V2-Flash when depth matters more than speed, and over MiMo-V2-Omni when your workload is text-first rather than multimodal.

Limitations

MiMo-V2-Pro is positioned as a text-first agent model, so native multimodal work is better handled by MiMo-V2-Omni. As with any benchmark-led model, real results will still depend on prompt design, tool quality, and how the agent is wired into your stack.

FAQ

What makes Xiaomi MiMo-V2-Pro API different from MiMo-V2-Flash?

MiMo-V2-Pro is Xiaomi’s flagship agentic model for deeper workflows, while MiMo-V2-Flash is the efficiency-focused sibling. Xiaomi says Pro is built for real-world agent tasks, with over 1 trillion total parameters, 42 billion active parameters, and a 1 million-token context window.

How large is the Xiaomi MiMo-V2-Pro API context window?

Xiaomi says MiMo-V2-Pro supports up to 1 million tokens of context. That is the key spec to know if you need to keep huge codebases, long documents, or extended task histories in one run.
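Before sending a very large document, a back-of-the-envelope check helps. The sketch below uses a rough heuristic of about 4 characters per token for English text; that ratio is an assumption, and an accurate count requires the model's own tokenizer.

```python
# Context window reported for MiMo-V2-Pro.
CONTEXT_WINDOW = 1_000_000

def fits_in_context(text, reserved_for_output=8_000, chars_per_token=4):
    """Rough heuristic check (~4 chars/token for English, an assumption).

    Returns True when the estimated prompt tokens plus the budget
    reserved for the model's answer fit inside the context window.
    """
    estimated_tokens = len(text) / chars_per_token
    return estimated_tokens + reserved_for_output <= CONTEXT_WINDOW
```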

Can Xiaomi MiMo-V2-Pro API handle coding and multi-step agent workflows?

Yes. Xiaomi positions MiMo-V2-Pro as a model for production engineering tasks, complex workflows, and agent scaffolds. The company also says its coding ability surpasses Claude 4.6 Sonnet in internal evaluations.

When should I use Xiaomi MiMo-V2-Pro API instead of MiMo-V2-Omni?

Use MiMo-V2-Pro when your workload is text-first and centered on reasoning, code, or tool orchestration. Use MiMo-V2-Omni when you need native multimodal understanding across text, vision, and speech.

How does Xiaomi MiMo-V2-Pro API compare with Claude Opus 4.6 and GPT-5.2?

Xiaomi reports MiMo-V2-Pro at 61.5 on ClawEval, compared with 66.3 for Claude Opus 4.6 and 50.0 for GPT-5.2 on the same chart. Xiaomi also says Pro is close to Opus 4.6 on general agent performance and ranks #8 globally on the Artificial Analysis Intelligence Index.

What are the known limitations of Xiaomi MiMo-V2-Pro API?

MiMo-V2-Pro is optimized for agentic text workflows, so it is not the family member to choose for native multimodal input. For image, video, or speech-heavy jobs, Xiaomi’s MiMo-V2-Omni is the better match.

How do I integrate Xiaomi MiMo-V2-Pro API with an OpenAI-compatible client?

OpenClaw documents the Xiaomi provider as OpenAI-compatible, which means you can use an OpenAI-style client with Xiaomi’s base URL and model ID. In practice, that makes it straightforward to swap in mimo-v2-pro as the model name while keeping your existing chat-completions flow.
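Concretely, "OpenAI-compatible" means the only provider-specific pieces are the base URL and the model ID; the request shape itself is the standard chat-completions format. A minimal sketch (the `chat_request` helper is hypothetical; the base URL and model ID are the ones used elsewhere on this page):

```python
def chat_request(model, messages, base_url="https://api.cometapi.com/v1"):
    """Build the endpoint URL and JSON-serializable body for an
    OpenAI-style chat completion. Swapping providers only changes
    base_url and model; everything else stays the same."""
    url = f"{base_url}/chat/completions"
    body = {"model": model, "messages": messages}
    return url, body

url, body = chat_request(
    "mimo-v2-pro",
    [{"role": "user", "content": "Hello"}],
)
```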

Is Xiaomi MiMo-V2-Pro API suitable for long document analysis?

Yes. The 1 million-token context window makes MiMo-V2-Pro a strong fit for very long source documents, support tickets, policy packs, or repository-scale analysis where smaller-context models would truncate too early.

Features for mimo-v2-pro

Explore the key features of mimo-v2-pro, designed to enhance performance and usability. Discover how these capabilities can benefit your projects and improve user experience.

Pricing for mimo-v2-pro

Explore competitive pricing for mimo-v2-pro, designed to fit various budgets and usage needs. Our flexible plans ensure you only pay for what you use, making it easy to scale as your requirements grow. Discover how mimo-v2-pro can enhance your projects while keeping costs manageable.
Comet Price (USD / M Tokens): Input $0.8 · Output $2.4
Official Price (USD / M Tokens): Input $1 · Output $3
Discount: -20%
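To turn the per-million-token rates into a per-call figure, multiply each token count by its rate. A quick sketch using the Comet rates for mimo-v2-pro (the 200k-input / 4k-output token counts are an illustrative assumption):

```python
# CometAPI rates for mimo-v2-pro, converted to USD per token.
INPUT_RATE = 0.8 / 1_000_000
OUTPUT_RATE = 2.4 / 1_000_000

def estimate_cost(input_tokens, output_tokens):
    """Estimated USD cost of one call at the Comet rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a 200k-token prompt with a 4k-token answer.
cost = estimate_cost(200_000, 4_000)  # about $0.17
```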

Sample code and API for mimo-v2-pro

Access comprehensive sample code and API resources for mimo-v2-pro to streamline your integration process. Our detailed documentation provides step-by-step guidance, helping you leverage the full potential of mimo-v2-pro in your projects.
POST /v1/chat/completions
POST /v1/messages

Python Code Example

from openai import OpenAI
import os

# Get your CometAPI key from https://api.cometapi.com/console/token, and paste it here
COMETAPI_KEY = os.environ.get("COMETAPI_KEY") or "<YOUR_COMETAPI_KEY>"

client = OpenAI(api_key=COMETAPI_KEY, base_url="https://api.cometapi.com/v1")

stream = client.chat.completions.create(
    model="mimo-v2-pro",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain the Monty Hall problem step by step."},
    ],
    stream=True,
    extra_body={"thinking": {"type": "enabled"}},
)

thinking = False
for chunk in stream:
    delta = chunk.choices[0].delta
    reasoning = (delta.model_extra or {}).get("reasoning_content")
    if reasoning:
        if not thinking:
            print("<thinking>")
            thinking = True
        print(reasoning, end="", flush=True)
    elif thinking and delta.content:
        print("\n</thinking>\n")
        thinking = False
        print(delta.content, end="", flush=True)
    elif delta.content:
        print(delta.content, end="", flush=True)

JavaScript Code Example

import OpenAI from "openai";

// Get your CometAPI key from https://api.cometapi.com/console/token, and paste it here
const api_key = process.env.COMETAPI_KEY || "<YOUR_COMETAPI_KEY>";

const client = new OpenAI({ apiKey: api_key, baseURL: "https://api.cometapi.com/v1" });

const stream = await client.chat.completions.create({
  model: "mimo-v2-pro",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain the Monty Hall problem step by step." },
  ],
  stream: true,
  thinking: { type: "enabled" },
});

let thinking = false;
for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta ?? {};
  const reasoning = delta.reasoning_content;
  if (reasoning) {
    if (!thinking) { process.stdout.write("<thinking>\n"); thinking = true; }
    process.stdout.write(reasoning);
  } else if (thinking && delta.content) {
    process.stdout.write("\n</thinking>\n\n");
    thinking = false;
    process.stdout.write(delta.content);
  } else if (delta.content) {
    process.stdout.write(delta.content);
  }
}

Curl Code Example

# Get your CometAPI key from https://api.cometapi.com/console/token
# Export it as: export COMETAPI_KEY="your-key-here"

curl https://api.cometapi.com/v1/chat/completions \
  -H "Authorization: Bearer $COMETAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mimo-v2-pro",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain the Monty Hall problem step by step."}
    ],
    "stream": true,
    "thinking": {"type": "enabled"}
  }'
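The streaming response from the curl call above arrives as server-sent events, one `data: {...}` line per chunk. A minimal sketch of pulling the text out of each line (the `parse_sse_line` helper is hypothetical; it assumes the standard OpenAI streaming chunk shape):

```python
import json

def parse_sse_line(line):
    """Extract the delta text from one SSE line of an OpenAI-style
    streaming response; returns None for keep-alives and [DONE]."""
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return None
    chunk = json.loads(payload)
    return chunk["choices"][0]["delta"].get("content")
```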

More Models

Claude Opus 4.7

Input:$4/M
Output:$20/M
Claude Opus 4.7 is a hybrid reasoning model designed specifically for frontier-level coding, AI agents, and complex multi-step professional work. Unlike lighter models (e.g., Sonnet or Haiku variants), Opus 4.7 prioritizes depth, consistency, and autonomy on the hardest tasks.
Claude Sonnet 4.6

Input:$2.4/M
Output:$12/M
Claude Sonnet 4.6 is our most capable Sonnet model yet. It’s a full upgrade of the model’s skills across coding, computer use, long-context reasoning, agent planning, knowledge work, and design. Sonnet 4.6 also features a 1M token context window in beta.
Grok 4.3

Input:$1/M
Output:$2/M
Excels at agentic reasoning, knowledge work, and tool use.
GPT 5.5 Pro

Input:$24/M
Output:$144/M
An advanced model engineered for extremely complex logic and professional demands, representing the highest standard of deep reasoning and precise analytical capabilities.
GPT 5.5

Input:$4/M
Output:$24/M
A next-generation multimodal flagship model balancing exceptional performance with efficient response, dedicated to providing comprehensive and stable general-purpose AI services.
GPT Image 2 ALL

Per Request:$0.04
GPT Image 2 is OpenAI's state-of-the-art image generation model for fast, high-quality image generation and editing. It supports flexible image sizes and high-fidelity image inputs.

Related Blog

MiMo V2 Pro vs Omni vs Flash: How should I choose in 2026?
Mar 26, 2026

MiMo V2 Pro is the flagship choice for demanding agentic work, MiMo V2 Omni is the multimodal specialist for image, video, audio, and tool-using agents, and MiMo V2 Flash is the fast, low-cost open-source option for reasoning, coding, and everyday agent workflows.
How to Use MiMo V2 API for Free in 2026: Complete Guide (Pro, Omni & Flash)
Mar 25, 2026

To use MiMo V2 API for free, get free quota via CometAPI or self-host the open-source weights on Hugging Face. For Pro and Omni, leverage OpenRouter routing, CometAPI aggregation, or Puter.js user-pays proxies. All models use a standard OpenAI-compatible endpoint. Official Xiaomi pricing starts at $1/$3 per million tokens for Pro (cheaper than Claude Opus 4.6), but free tiers and aggregators make high-performance agentic AI accessible without upfront costs.
OpenRouter vs CometAPI: A Comprehensive Comparison
Aug 25, 2025

This article provides a comprehensive comparison of OpenRouter and CometAPI across critical dimensions—architecture, model coverage, pricing, performance, security, developer experience, and use cases—to help you determine which platform best aligns with your requirements.