Qwen2.5 VL 32B Instruct

Alibaba Cloud / Qwen Team · Qwen · Open weights · Apache 2.0 · Commercial use permitted

Description

Qwen2.5-VL is a vision-language model from the Qwen family. Key enhancements include visual understanding (objects, text, charts, layouts), visual agent capabilities (tool use, computer/phone control), long video comprehension with event pinpointing, visual localization (bounding boxes/points), and structured output generation.

Release date

2025-02-28

Parameters

33.5B

Context length

—

Supported modalities

—

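Capability radar chart

Axes: general, coding, reasoning, science (estimated), agents, multimodal

Science is estimated from reasoning ability as a proxy when no dedicated science evaluation is available.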
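The fallback described for the science axis can be sketched as follows. This is a minimal illustration, not the site's actual method: the function name `science_axis` is hypothetical, and using the reasoning score unchanged as the proxy is an assumption, since the page does not say how the estimate is derived.

```python
from typing import Optional, Tuple


def science_axis(science: Optional[float], reasoning: float) -> Tuple[float, bool]:
    """Return (score, is_estimated) for the science axis of the radar chart.

    Assumption: when no dedicated science score exists, the reasoning
    score is used as-is and the result is flagged as estimated.
    """
    if science is not None:
        return science, False
    return reasoning, True
```

The boolean flag corresponds to the "estimated" marker shown next to the science axis.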

Leaderboard rankings

Domain	Rank	Score	Source
Agent capability leaderboard	115	33.0	LS
Multimodal leaderboard	74	66.0	LS

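Benchmark scores (LLM Stats)

Category	Benchmark	Score	Source
Agents	AITZ_EM	83.1%	Self-reported
Agents	AndroidWorld_SR	22.0%	Self-reported
Agents	OSWorld	5.9%	Self-reported
Biology	GPQA	46.0%	Self-reported
Code	HumanEval	91.5%	Self-reported
Finance	MMLU	78.4%	Self-reported
Finance	MMLU-Pro	68.8%	Self-reported
General	MBPP	0.84 / 100	Self-reported
General	MMMU	70.0%	Self-reported
General	MMStar	69.5%	Self-reported
General	MMMU-Pro	49.5%	Self-reported
Grounding	ScreenSpot	88.5%	Self-reported
Grounding	ScreenSpot Pro	39.4%	Self-reported
Image To Text	DocVQA	94.8%	Self-reported
Image To Text	OCRBench-V2 (zh)	59.1%	Self-reported
Image To Text	OCRBench-V2 (en)	57.2%	Self-reported
Language	CharadesSTA	54.2%	Self-reported
Long Context	LVBench	49.0%	Self-reported
Math	MATH	82.2%	Self-reported
Math	MathVista-Mini	74.7%	Self-reported
Math	MathVision	38.4%	Self-reported
Multimodal	Android Control Low_EM	93.3%	Self-reported
Multimodal	InfoVQA	83.4%	Self-reported
Multimodal	VideoMME w sub.	77.9%	Self-reported
Multimodal	CC-OCR	77.1%	Self-reported
Multimodal	VideoMME w/o sub.	70.5%	Self-reported
Multimodal	Android Control High_EM	69.6%	Self-reported
Multimodal	MMBench-Video	1.9%	Self-reported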

AA evaluation index

No AA evaluation data available

LLM Stats category scores

Categories: Code, Structured Output, Text-to-image, Image To Text, Language, Legal, Math, Finance, Healthcare, Multimodal, Reasoning, Spatial Reasoning, Grounding, Vision, Long Context, Physics, General, Biology, Chemistry, Video, Agents

Pricing

No pricing data available

Speed

No speed data available

Provider price ranking

6 providers

Cheapest: IO.NET · Most expensive: LLM Gateway

Rank	Provider	Input	Output
1	IO.NET	$0.05	$0.22
2	Chutes	$0.0543	$0.2174
3	Meganova	$0.2	$0.6
4	SiliconFlow (China)	$0.27	n/a
5	SiliconFlow	$0.27	n/a
6	LLM Gateway	$1.4	$4.2

Compares this model's pricing across API providers.
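As a rough illustration of how the listed prices translate into request cost, the sketch below compares providers for a given token mix. It rests on an assumption the page does not state: that prices are in USD per 1 million tokens. The names `PRICES`, `request_cost`, and `cheapest` are hypothetical, and the two SiliconFlow entries are left out because only one price is listed for them.

```python
# (input price, output price) per provider, copied from the provider listing.
# Assumption: prices are USD per 1,000,000 tokens; the page does not give the unit.
PRICES = {
    "IO.NET": (0.05, 0.22),
    "Chutes": (0.0543, 0.2174),
    "Meganova": (0.2, 0.6),
    "LLM Gateway": (1.4, 4.2),
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one workload on the given provider."""
    input_price, output_price = PRICES[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest(input_tokens: int, output_tokens: int) -> str:
    """Provider with the lowest estimated cost for this token mix."""
    return min(PRICES, key=lambda p: request_cost(p, input_tokens, output_tokens))
```

Under these assumptions the ranking depends on the workload: IO.NET has the lowest input price, while Chutes has a slightly lower output price, so an output-heavy workload can come out cheaper on Chutes.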

External links

LLM Stats · Artificial Analysis