Qwen2.5 VL 72B Instruct

Alibaba Cloud / Qwen Team · Qwen · Open Weight · tongyi-qianwen

Description

Qwen2.5-VL is Qwen's new flagship vision-language model and a significant improvement over Qwen2-VL. It excels at recognizing objects, analyzing text, charts, and layouts in images, acting as a visual agent, understanding long videos (over 1 hour) with event pinpointing, performing visual localization (bounding boxes or points), and generating structured outputs from documents.

Release Date: 2025-01-26
Parameters: 72.0B
Context Length: 131K
Modalities: image, text

Capability Radar

general: 50
coding: 0
reasoning: 60
science (est.): 60
agents: 40
multimodal: 80

Science uses a reasoning proxy when dedicated science benchmarks are unavailable.

Rankings

Domain              Rank  Score  Source
Agentic Capability  #98   45.0   LS
Multimodal Ranking  #59   73.0   LS
Reasoning           #79   55.0   LS

Benchmark Scores (LLM Stats)

Agents

AITZ_EM: 83.2% (SR)
MobileMiniWob++_SR: 68.0% (SR)
AndroidWorld_SR: 35.0% (SR)
OSWorld: 8.8% (SR)

General

MMVet: 76.2% (SR)
MLVU-M: 74.6% (SR)
MMStar: 70.8% (SR)
MMMU: 70.2% (SR)
MMMU-Pro: 51.1% (SR)

Grounding

ScreenSpot: 87.1% (SR)
ScreenSpot Pro: 43.6% (SR)

Image To Text

DocVQA: 96.4% (SR)
OCRBench: 88.5% (SR)
OCRBench-V2 (en): 61.5% (SR)

Long Context

EgoSchema: 76.2% (SR)
LVBench: 47.3% (SR)

Math

MathVista-Mini: 74.8% (SR)
MathVision: 38.1% (SR)

Multimodal

Android Control Low_EM: 93.7% (SR)
ChartQA: 89.5% (SR)
AI2D: 88.4% (SR)
MMBench: 88.0% (SR)
CC-OCR: 79.8% (SR)
TempCompass: 74.8% (SR)
VideoMME w/o sub.: 73.3% (SR)
PerceptionTest: 73.2% (SR)
MVBench: 70.4% (SR)
Android Control High_EM: 67.4% (SR)
MMBench-Video: 2.0% (SR)

Reasoning

Hallusion Bench: 55.2% (SR)

AA Evaluation Indices

No AA evaluation data available

LLM Stats Category Scores

Image To Text: 80
Structured Output: 80
Text-to-image: 80
Reasoning: 70
Spatial Reasoning: 70
Grounding: 70
Healthcare: 70
Long Context: 60
Math: 60
Multimodal: 60
Vision: 60
General: 50
Video: 50
Agents: 40

Pricing

Input Price: $2.8 / 1M tokens
Output Price: $8.4 / 1M tokens
Blended Price (3:1): $4.2 / 1M tokens
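
The blended figure can be reproduced with a short calculation. The sketch below assumes that "3:1" means three input tokens for every output token, so the blended price is a weighted average of the two rates; the page does not state the formula, and the function name `blended_price` is hypothetical. Under that assumption the listed input and output prices give the listed $4.2.

```python
def blended_price(input_price: float, output_price: float, ratio: float = 3.0) -> float:
    """Weighted average price per 1M tokens.

    Assumption (not stated on the page): `ratio` input tokens are
    consumed for every 1 output token.
    """
    return (ratio * input_price + output_price) / (ratio + 1.0)


if __name__ == "__main__":
    # Listed prices in USD per 1M tokens: input 2.8, output 8.4.
    price = blended_price(2.8, 8.4)
    print(f"Blended price (3:1): ${price:.2f} / 1M tokens")
```

Changing `ratio` shows how sensitive the blended figure is to the workload mix: output-heavy workloads pull it toward the output rate.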

Speed

No speed data available

Provider Price Ranking

12 providers

Cheapest: Nebius Token Factory
Most Expensive: LLM Gateway

Rank  Provider                             Input / 1M tokens  Output / 1M tokens
1     Nebius Token Factory (Cheapest)      $0.25              $0.75
2     SiliconFlow (China)                  $0.59              $0.59
3     SiliconFlow                          $0.59              $0.59
4     NanoGPT                              $0.69989           $0.69989
5     OpenRouter                           $0.8               $1
6     NovitaAI                             $0.8               $0.8
7     Kilo Gateway                         $0.8               $0.8
8     OVHcloud AI Endpoints                $1.01              $1.01
9     Alibaba (China)                      $2.294             $6.881
10    Alibaba Cloud / Qwen Team (PRIMARY)  $2.8               $8.4
11    Alibaba                              $2.8               $8.4
12    LLM Gateway                          $2.8               $8.4

Compare pricing across different API providers for this model.
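
To compare providers for a concrete workload rather than by list price alone, the table can be turned into a small cost estimate. This is a minimal sketch: it assumes the listed prices are in USD per 1M tokens (the primary provider's row matches the Pricing section, which uses that unit), and the names `ProviderPrice`, `request_cost`, and `rank_by_cost` are hypothetical helpers, not part of any provider's API.

```python
from typing import List, NamedTuple


class ProviderPrice(NamedTuple):
    name: str
    input_per_m: float   # USD per 1M input tokens (assumed unit)
    output_per_m: float  # USD per 1M output tokens (assumed unit)


# Prices copied from the provider table, in the order listed there.
PROVIDERS: List[ProviderPrice] = [
    ProviderPrice("Nebius Token Factory", 0.25, 0.75),
    ProviderPrice("SiliconFlow (China)", 0.59, 0.59),
    ProviderPrice("SiliconFlow", 0.59, 0.59),
    ProviderPrice("NanoGPT", 0.69989, 0.69989),
    ProviderPrice("OpenRouter", 0.8, 1.0),
    ProviderPrice("NovitaAI", 0.8, 0.8),
    ProviderPrice("Kilo Gateway", 0.8, 0.8),
    ProviderPrice("OVHcloud AI Endpoints", 1.01, 1.01),
    ProviderPrice("Alibaba (China)", 2.294, 6.881),
    ProviderPrice("Alibaba Cloud / Qwen Team", 2.8, 8.4),
    ProviderPrice("Alibaba", 2.8, 8.4),
    ProviderPrice("LLM Gateway", 2.8, 8.4),
]


def request_cost(p: ProviderPrice, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a workload with the given token counts."""
    return (input_tokens * p.input_per_m + output_tokens * p.output_per_m) / 1_000_000


def rank_by_cost(providers: List[ProviderPrice], input_tokens: int,
                 output_tokens: int) -> List[ProviderPrice]:
    """Providers sorted from cheapest to most expensive for this workload."""
    return sorted(providers, key=lambda p: request_cost(p, input_tokens, output_tokens))


if __name__ == "__main__":
    # Example workload: 3M input tokens and 1M output tokens.
    for p in rank_by_cost(PROVIDERS, 3_000_000, 1_000_000)[:3]:
        print(f"{p.name}: ${request_cost(p, 3_000_000, 1_000_000):.2f}")
```

The ordering depends on the token mix: a provider with a flat rate can undercut one with a low input price once the workload is dominated by output tokens.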

External Sources