Qwen2.5 VL 32B Instruct
Alibaba Cloud / Qwen Team · Qwen · Open Weight · Apache 2.0 · Commercial OK
Description
Qwen2.5-VL is a vision-language model from the Qwen family. Key enhancements include visual understanding (objects, text, charts, layouts), visual agent capabilities (tool use, computer/phone control), long video comprehension with event pinpointing, visual localization (bounding boxes/points), and structured output generation.
Release date: 2025-02-28
Parameters: 33.5B
Context length: not listed
Supported modalities: not listed
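Capability radar chart
general: 50
coding: 90
reasoning: 70
science (estimated): 43
agents: 40
multimodal: 70
Note: when no dedicated science benchmark is available, the science score is estimated using reasoning ability as a proxy.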
Leaderboard ranking
Benchmark scores (LLM Stats)
All scores below are self-reported.

Agents
AITZ_EM: 83.1%
AndroidWorld_SR: 22.0%
OSWorld: 5.9%

Biology
GPQA: 46.0%

Code
HumanEval: 91.5%

Finance
MMLU: 78.4%
MMLU-Pro: 68.8%

General
MBPP: 0.84 / 100
MMMU: 70.0%
MMStar: 69.5%
MMMU-Pro: 49.5%

Grounding
ScreenSpot: 88.5%
ScreenSpot Pro: 39.4%

Image To Text
DocVQA: 94.8%
OCRBench-V2 (zh): 59.1%
OCRBench-V2 (en): 57.2%

Language
CharadesSTA: 54.2%

Long Context
LVBench: 49.0%

Math
MATH: 82.2%
MathVista-Mini: 74.7%
MathVision: 38.4%

Multimodal
Android Control Low_EM: 93.3%
InfoVQA: 83.4%
VideoMME w sub.: 77.9%
CC-OCR: 77.1%
VideoMME w/o sub.: 70.5%
Android Control High_EM: 69.6%
MMBench-Video: 1.9%
AA evaluation index
No AA evaluation data available.
LLM Stats category scores
Code: 90
Structured Output: 80
Text-to-image: 80
Finance: 70
Healthcare: 70
Image To Text: 70
Language: 70
Legal: 70
Math: 70
Spatial Reasoning: 60
Vision: 60
Grounding: 60
Multimodal: 60
Reasoning: 60
Video: 50
Biology: 50
Chemistry: 50
General: 50
Long Context: 50
Physics: 50
Agents: 40
Pricing
No pricing data available.
Speed
No speed data available.
Available providers
(LS internal pricing unit) No provider data available.