MiMo-V2-Omni

Xiaomi · Proprietary

Description

MiMo-V2-Omni is Xiaomi's omni foundation model, combining frontier multimodal understanding with strong agentic capability. It fuses dedicated image, video, and audio encoders into a single shared backbone that processes all modalities together. It natively supports structured tool calling, function execution, and UI grounding. It also supports more than 10 hours of continuous audio understanding and a 256K-token context window.

Release Date

2026-03-19

Parameters

—

Context Length

262K

Modalities

audio, image, pdf, text, video

Capability Radar

Radar chart axes: general, coding, reasoning, science (est.), agents, multimodal (scale tick: 100).

Science uses a reasoning proxy when dedicated science benchmarks are unavailable.

Rankings

Domain	#Rank	Score	Source
Agentic Capability	66	54.0	LS
Code Ranking	73	72.0	AA
General Ranking	91	67.0	AA
Science	101	61.0	AA

Benchmark Scores (LLM Stats)

Category	Benchmark	Score
Agents	GDPval-AA	1410.00 / 3000 SR
Agents	PinchBench	81.2% SR
Agents	Claw-Eval	54.8% SR
Agents	MM-BrowserComp	52.0% SR
Agents	OmniGAIA	49.8% SR
Code	SWE-Bench Verified	74.8% SR

AA Evaluation Indices

Index	Score
Intelligence Index	35.0
Tau2	0.9
GPQA	0.8
LCR	0.7
IFBench	0.5
SciCode	0.4
TerminalBench Hard	0.3
HLE	0.2

LLM Stats Category Scores

Category	Score
Legal	100
Finance	100
General	100
Reasoning	100
Agents	100

Listed without a score: Frontend Development, Code, Coding.

Pricing

Input Price: Free

Output Price: Free

Blended Price (3:1): Free

Cache Read Price: $0.08 / 1M tokens
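
The page does not define how the blended price is computed. A minimal sketch, assuming the "3:1" label means three input tokens for every output token and that the blend is a simple weighted average; the function name and the paid example prices are hypothetical and not taken from this page:

```python
def blended_price(input_price: float, output_price: float, ratio: float = 3.0) -> float:
    """Weighted average price per 1M tokens.

    Assumes `ratio` input tokens are consumed for every one output token
    (an assumed reading of the "3:1" label, not confirmed by the page).
    """
    return (ratio * input_price + output_price) / (ratio + 1.0)


# Free input and free output, as listed above, blend to zero.
print(blended_price(0.0, 0.0))  # 0.0

# Hypothetical paid prices, for illustration only (not from this page).
print(blended_price(1.0, 5.0))  # (3 * 1.0 + 5.0) / 4 = 2.0
```

Under this reading, the blended figure is always between the input and output prices and sits closer to the input price, since input tokens carry three quarters of the weight.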

Speed

Tokens/sec: 70.9

Time to First Token: 2.79s

Time to Answer: 31.00s
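
The page does not state how Time to Answer is measured. As a rough sketch, assuming it is simply the time to first token plus the time to stream the remaining tokens at the listed throughput, the three figures can be related as follows; both function names are hypothetical:

```python
def estimated_time_to_answer(ttft_s: float, tokens_per_s: float, n_tokens: float) -> float:
    """Estimate total latency under the assumed additive model:
    time to first token plus generation time at a constant rate."""
    return ttft_s + n_tokens / tokens_per_s


def implied_tokens(ttft_s: float, tokens_per_s: float, total_s: float) -> float:
    """Invert the same model: how many generated tokens the listed
    total time would correspond to."""
    return (total_s - ttft_s) * tokens_per_s


# Using the figures listed above.
print(round(implied_tokens(2.79, 70.9, 31.00)))  # 2000
```

Under that assumption the listed figures would correspond to roughly 2,000 generated tokens per answer, which may include reasoning tokens; treat this as an estimate rather than a documented measurement.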

Provider Price Ranking

6 providers

Cheapest: NanoGPT
Most Expensive: Xiaomi

Rank	Provider	Price
1	NanoGPT (Cheapest)	$0.4
2	OpenCode Go	$0.4
3	ZenMux	$0.4
4	Kilo Gateway	$0.4
5	LLM Gateway	$0.4
6	Xiaomi	$0.4


External Sources

LLM Stats, Artificial Analysis