Nova 2 Omni

Amazon · Proprietary

Description

Amazon Nova 2 Omni is Amazon's first unified multimodal reasoning model that processes text, documents, images, video, and audio inputs and generates both text and images from a single model, eliminating multi-model coordination complexity. It delivers strong multimodal perception, core reasoning, agentic tool use, and high-quality image generation and editing, with configurable extended thinking. It supports a 1M token context window, 200+ languages for text, and 10 languages for speech input.

Release Date

2025-12-02

Parameters

—

Context Length

1M tokens (per the description above)

Modalities

Input: text, documents, images, video, audio. Output: text, images (per the description above).

Capability Radar

Radar chart axes: general, coding, reasoning, science (est.), agents, multimodal

Science uses a reasoning proxy when dedicated science benchmarks are unavailable.
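
The fallback rule in the note above can be sketched as follows. This is a minimal illustration, not the site's actual scoring code: the function name `science_axis`, the use of a plain mean over science benchmarks, and the example inputs are all assumptions made for this sketch.

```python
from statistics import mean
from typing import Optional, Sequence, Tuple


def science_axis(
    science_scores: Sequence[float],
    reasoning_score: Optional[float],
) -> Tuple[Optional[float], bool]:
    """Return (score, is_estimated) for the science axis.

    Assumption: dedicated science benchmarks are averaged with a plain mean.
    When none exist, the reasoning score stands in as a proxy and the result
    is flagged as estimated (the "est." marker on the chart).
    """
    if science_scores:
        # Dedicated science benchmarks exist: use them directly.
        return mean(science_scores), False
    # No science benchmarks: fall back to the reasoning proxy, if there is one.
    return reasoning_score, reasoning_score is not None


# Hypothetical inputs, not taken from this page's data.
with_benchmarks = science_axis([50.0, 70.0], 90.0)
proxy_only = science_axis([], 90.0)
```

The boolean in the result lets a chart label the axis as an estimate whenever the proxy was used.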

Rankings

Domain	Rank	Score	Source
Agentic Capability	50	58.0	LLM Stats
Multimodal Ranking	61	73.0	LLM Stats

Benchmark Scores (LLM Stats)

Category	Benchmark	Score
Agents	BFCL-V4	58.3% (SR)
Audio	MMAU	75.3% (SR)
Audio	MAVERIX	66.6% (SR)
Audio	CoVoST2	40.7% (SR)
Communication	Tau2 Telecom	80.0% (SR)
Communication	Tau2 Retail	78.3% (SR)
Communication	Multi-Challenge	75.5% (SR)
Communication	Tau2 Airline	68.8% (SR)
Document Understanding	RealKIE-FCC	59.8% (SR)
Finance	MMLU-Pro	80.7% (SR)
General	IFBench	68.7% (SR)
General	MMMU-Pro	61.4% (SR)
Grounding	RefCOCOg	86.3% (SR)
Grounding	ScreenSpot	85.4% (SR)
Image To Text	OCRBench_V2	58.2% (SR)
Math	AIME 2025	92.1% (SR)
Multimodal	Video-MME	77.9% (SR)
Multimodal	QVHighlights	76.7% (SR)

AA Evaluation Indices

No AA evaluation data available

LLM Stats Category Scores

Chart categories: Math, Spatial Reasoning, Grounding, Reasoning, Legal, Finance, Healthcare, Communication, Video, Multimodal, Instruction Following, General, Tool Calling, Vision, Image To Text, Language, Document Understanding, Agents, Speech To Text, Audio

Pricing

No pricing data available

Speed

No speed data available

Provider Price Ranking

No provider data available

External Sources

LLM Stats, Artificial Analysis