Nova 2 Omni
Amazon · Proprietary
Description
Amazon Nova 2 Omni is Amazon's first unified multimodal reasoning model. It accepts text, document, image, video, and audio inputs and generates both text and images from a single model, which removes the need to coordinate multiple models. It delivers strong multimodal perception, core reasoning, agentic tool use, and high-quality image generation and editing, with configurable extended thinking. It supports a 1M-token context window, 200+ languages for text, and 10 languages for speech input.
Release Date
2025-12-02
Parameters
—
Context Length
—
Modalities
—
Capability Radar
| Capability | Score |
|---|---|
| General | 70 |
| Coding | 0 |
| Reasoning | 90 |
| Science (est.) | 68 |
| Agents | 70 |
| Multimodal | 80 |
Science uses a reasoning proxy when dedicated science benchmarks are unavailable.
Rankings
| Domain | #Rank | Score | Source |
|---|---|---|---|
| Agentic Capability | 52 | 58.0 | LS |
| Multimodal Ranking | 58 | 73.0 | LS |
Benchmark Scores (LLM Stats)
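| Category | Benchmark | Score |
|---|---|---|
| Agents | BFCL-V4 | 58.3% SR |
| Audio | MMAU | 75.3% SR |
| Audio | MAVERIX | 66.6% SR |
| Audio | CoVoST2 | 40.7% SR |
| Communication | Tau2 Telecom | 80.0% SR |
| Communication | Tau2 Retail | 78.3% SR |
| Communication | Multi-Challenge | 75.5% SR |
| Communication | Tau2 Airline | 68.8% SR |
| Document Understanding | RealKIE-FCC | 59.8% SR |
| Finance | MMLU-Pro | 80.7% SR |
| General | IFBench | 68.7% SR |
| General | MMMU-Pro | 61.4% SR |
| Grounding | RefCOCOg | 86.3% SR |
| Grounding | ScreenSpot | 85.4% SR |
| Image To Text | OCRBench_V2 | 58.2% SR |
| Math | AIME 2025 | 92.1% SR |
| Multimodal | Video-MME | 77.9% SR |
| Multimodal | QVHighlights | 76.7% SR |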
AA Evaluation Indices
No AA evaluation data available
LLM Stats Category Scores
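| Category | Score |
|---|---|
| Spatial Reasoning | 90 |
| Grounding | 90 |
| Math | 90 |
| Video | 80 |
| Finance | 80 |
| Healthcare | 80 |
| Legal | 80 |
| Reasoning | 80 |
| Communication | 80 |
| Tool Calling | 70 |
| Vision | 70 |
| General | 70 |
| Instruction Following | 70 |
| Multimodal | 70 |
| Document Understanding | 60 |
| Image To Text | 60 |
| Language | 60 |
| Agents | 60 |
| Speech To Text | 40 |
| Audio | 40 |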
Pricing
No pricing data available
Speed
No speed data available
Provider Price Ranking
No provider data available