
Qwen2.5-Omni-7B

Alibaba Cloud / Qwen Team · Qwen · Open Weight · Apache 2.0 · Commercial OK

Description

Qwen2.5-Omni is the flagship end-to-end multimodal model in the Qwen series. It processes diverse inputs including text, images, audio, and video, delivering real-time streaming responses through text generation and natural speech synthesis using a novel Thinker-Talker architecture.

Release date
2025-03-27
Parameters
7.0B
Context length
Modalities

Capability radar

general: 50
coding: 80
reasoning: 60
science (est.): 26
agents: 0
multimodal: 90

Science uses a reasoning proxy when dedicated science benchmarks are unavailable.

Rankings

Domain · Rank · Score · Source
Multimodal Ranking · #52 · 74.0 · LS

Benchmark scores (LLM Stats)

Audio

VocalSound · 93.9% · Aut.
GiantSteps Tempo · 88.0% · Aut.
MMAU Music · 69.2% · Aut.
MMAU Sound · 67.9% · Aut.
MMAU · 65.6% · Aut.
MMAU Speech · 59.8% · Aut.
OmniBench Music · 52.8% · Aut.
CoVoST2 en-zh · 0.41 / 100 · Aut.
MusicCaps · 32.8% · Aut.
Common Voice 15 · 0.08 / 100 · Aut.

Biology

GPQA · 30.8% · Aut.

Code

HumanEval · 78.7% · Aut.

Communication

VoiceBench Avg · 74.1% · Aut.
MM-MT-Bench · 0.06 / 100 · Aut.

Creativity

Meld · 57.0% · Aut.

Finance

MMLU-Pro · 47.0% · Aut.

General

MBPP · 0.73 / 100 · Aut.
MMLU-Redux · 71.0% · Aut.
MultiPL-E · 65.8% · Aut.
MMStar · 64.0% · Aut.
MME-RealWorld · 61.6% · Aut.
MMMU · 59.2% · Aut.
MMMU-Pro · 36.6% · Aut.
LiveBench · 29.6% · Aut.
NMOS · 0.05 / 100 · Aut.

Grounding

PointGrounding · 66.5% · Aut.

Healthcare

CRPErelation · 76.5% · Aut.

Image To Text

DocVQA · 95.2% · Aut.
TextVQA · 84.4% · Aut.
OCRBench_V2 · 57.8% · Aut.

Language

FLEURS · 0.04 / 100 · Aut.

Long Context

EgoSchema · 68.6% · Aut.

Math

GSM8k · 88.7% · Aut.
MATH · 71.5% · Aut.
MathVista · 67.9% · Aut.
MathVision · 25.0% · Aut.

Multimodal

ChartQA · 85.3% · Aut.
AI2D · 83.2% · Aut.
MMBench-V1.1 · 81.8% · Aut.
VideoMME w sub. · 72.4% · Aut.
MVBench · 70.3% · Aut.
MuirBench · 59.2% · Aut.
OmniBench · 56.1% · Aut.

Spatial Reasoning

RealWorldQA · 70.3% · Aut.

Vision

ODinW · 42.4% · Aut.

AA evaluation indices

No AA evaluation data available

LLM Stats category scores

Image To Text: 90
Code: 80
Spatial Reasoning: 70
Video: 70
Vision: 70
Long Context: 70
Math: 60
Multimodal: 60
Reasoning: 60
Finance: 50
General: 50
Healthcare: 50
Language: 50
Legal: 50
Biology: 30
Chemistry: 30
Physics: 30
Communication: 10
Speech To Text: 0

Pricing

No pricing data available

Speed

No speed data available

Available providers

(LS internal units)

No provider data available

External sources