MiMo-V2-Omni
Xiaomi (Proprietary)
Description
MiMo-V2-Omni is Xiaomi's omni foundation model, combining frontier multimodal understanding with strong agentic capability. It fuses dedicated image, video, and audio encoders into a single shared backbone that processes all modalities simultaneously. It natively supports structured tool calling, function execution, and UI grounding, and it handles over 10 hours of continuous audio understanding with a 256K-token context window.
Release date
2026-03-19
Parameter count
—
Context length
262K
Supported modalities
audio, image, text, video
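Capability radar chart
general: 38
coding: 36
reasoning: 83
science (estimated): 54
agents: 100
multimodal: 85
The science score is estimated from reasoning ability as a proxy when no dedicated science evaluation is available.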
Leaderboard rankings
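Benchmark scores (LLM Stats)
Agents
GDPval-AA: 1410.00 / 3000 (self-reported)
PinchBench: 81.2% (self-reported)
Claw-Eval: 54.8% (self-reported)
MM-BrowserComp: 52.0% (self-reported)
OmniGAIA: 49.8% (self-reported)
Code
SWE-Bench Verified: 74.8% (self-reported)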
AA evaluation indices
Intelligence Index: 43.4
Coding Index: 35.5
Tau2: 0.9
Gpqa: 0.8
Lcr: 0.7
Ifbench: 0.5
Scicode: 0.4
Terminalbench Hard: 0.3
Hle: 0.2
LLM Stats category scores
Finance: 100
General: 100
Legal: 100
Reasoning: 100
Agents: 100
Code: 70
Coding: 70
Frontend Development: 70
Pricing
Input price: Free
Output price: Free
Blended price (3:1): Free
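The 3:1 blended price can be read as a weighted average of the input and output prices. The page does not state the formula, so the sketch below assumes the common convention of three parts input to one part output; the function name `blended_price` and its parameters are hypothetical and only illustrate the arithmetic.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of input and output prices.

    Assumption: "3:1" means three parts input to one part output.
    """
    total_weight = input_weight + output_weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Both listed prices are free, so the blend is zero as well.
free_blend = blended_price(0.0, 0.0)

# Illustration only: the provider-table figures (400K input, 2.0M output),
# which are in LS internal pricing units, not a currency.
provider_blend = blended_price(400_000, 2_000_000)
print(free_blend, provider_blend)
```

Under this assumption any two prices of zero blend to zero, which matches the "free" listing for all three price rows.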
Speed
Tokens per second: 120.9 tokens/s
Time to first token: 1.35 s
Time to first answer: 17.89 s
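The speed figures can be combined into a rough estimate of how long a response takes. This is a minimal sketch that assumes a simple linear model (time to first token plus output tokens divided by throughput); real latency varies with load, prompt length, and any reasoning tokens generated before the answer. The function name `estimated_response_time` is hypothetical.

```python
TOKENS_PER_SECOND = 120.9      # listed throughput
TIME_TO_FIRST_TOKEN_S = 1.35   # listed time to first token, in seconds


def estimated_response_time(output_tokens: int,
                            tokens_per_second: float = TOKENS_PER_SECOND,
                            ttft_s: float = TIME_TO_FIRST_TOKEN_S) -> float:
    """Estimate total seconds to stream `output_tokens` tokens.

    Assumption: constant throughput after the first token arrives.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_s + output_tokens / tokens_per_second


estimate_1000 = estimated_response_time(1000)
print(f"~{estimate_1000:.1f} s for 1000 output tokens")
```

With the listed figures, a 1000-token response comes out at roughly 9.6 seconds under this model.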
Available providers
(Prices in LS internal pricing units)

| Provider | Input price | Output price |
|---|---|---|
| Xiaomi | 400K | 2.0M |