Qwen2.5 VL 32B Instruct

Alibaba Cloud / Qwen Team · Qwen · Open weights · Apache 2.0 · Commercial use permitted

Description

Qwen2.5-VL is a vision-language model from the Qwen family. Key enhancements include visual understanding (objects, text, charts, layouts), visual agent capabilities (tool use, computer/phone control), long video comprehension with event pinpointing, visual localization (bounding boxes/points), and structured output generation.

Release date

2025-02-28

Parameters

33.5B

Context length

—

Supported modalities

—

Capability radar chart

Axes: general, coding, reasoning, science (estimated), agents, multimodal.

Science is estimated from reasoning ability as a proxy when no dedicated science evaluation is available.
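
The fallback for the science axis could be sketched as follows. This is only an illustration: the function name `science_score` and the dictionary keys are hypothetical, and it assumes the reasoning score is reused unchanged, which the page does not confirm.

```python
from typing import Dict, Optional, Tuple


def science_score(scores: Dict[str, Optional[float]]) -> Tuple[Optional[float], bool]:
    """Return (value, estimated) for the science axis of the radar chart.

    ASSUMPTION: when no dedicated science score exists, the reasoning
    score is reused unchanged and flagged as an estimate.
    """
    science = scores.get("science")
    if science is not None:
        # A dedicated science evaluation exists, so use it directly.
        return science, False
    # No science evaluation: fall back to reasoning and mark it as estimated.
    return scores.get("reasoning"), True
```

Returning the flag alongside the value lets a chart label the axis as estimated, as the page does.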

Leaderboard rankings

Category	Rank	Score	Source
Agent capability leaderboard	115	33.0	LS
Multimodal leaderboard	74	66.0	LS

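Benchmark scores (LLM Stats)

Category	Benchmark	Score	Source
Agents	AITZ_EM	83.1%	Self-reported
Agents	AndroidWorld_SR	22.0%	Self-reported
Agents	OSWorld	5.9%	Self-reported
Biology	GPQA	46.0%	Self-reported
Code	HumanEval	91.5%	Self-reported
Finance	MMLU	78.4%	Self-reported
Finance	MMLU-Pro	68.8%	Self-reported
General	MBPP	0.84 / 100	Self-reported
General	MMMU	70.0%	Self-reported
General	MMStar	69.5%	Self-reported
General	MMMU-Pro	49.5%	Self-reported
Grounding	ScreenSpot	88.5%	Self-reported
Grounding	ScreenSpot Pro	39.4%	Self-reported
Image To Text	DocVQA	94.8%	Self-reported
Image To Text	OCRBench-V2 (zh)	59.1%	Self-reported
Image To Text	OCRBench-V2 (en)	57.2%	Self-reported
Language	CharadesSTA	54.2%	Self-reported
Long Context	LVBench	49.0%	Self-reported
Math	MATH	82.2%	Self-reported
Math	MathVista-Mini	74.7%	Self-reported
Math	MathVision	38.4%	Self-reported
Multimodal	Android Control Low_EM	93.3%	Self-reported
Multimodal	InfoVQA	83.4%	Self-reported
Multimodal	VideoMME w sub.	77.9%	Self-reported
Multimodal	CC-OCR	77.1%	Self-reported
Multimodal	VideoMME w/o sub.	70.5%	Self-reported
Multimodal	Android Control High_EM	69.6%	Self-reported
Multimodal	MMBench-Video	1.9%	Self-reported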

AA evaluation index

No AA evaluation data available.

LLM Stats category scores

Categories: Code, Structured Output, Text-to-image, Image To Text, Language, Legal, Math, Finance, Healthcare, Multimodal, Reasoning, Spatial Reasoning, Grounding, Vision, Long Context, Physics, General, Biology, Chemistry, Video, Agents.

Pricing

No pricing data available.

Speed

No speed data available.

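Provider price ranking

6 providers

Cheapest: IO.NET · Most expensive: LLM Gateway

Rank	Provider	Input	Output
1	IO.NET (cheapest)	$0.05	$0.22
2	Chutes	$0.0543	$0.2174
3	Meganova	$0.2	$0.6
4	SiliconFlow (China)	$0.27	not listed
5	SiliconFlow	$0.27	not listed
6	LLM Gateway	$1.4	$4.2

Compares this model's pricing across API providers.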
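
As a rough illustration of how the listed prices might be compared for a given workload, the sketch below estimates the cost of one request for the providers that list both an input and an output price (the two SiliconFlow entries show only one figure and are left out). It assumes the prices are US dollars per one million tokens, which this page does not state; the helper names `estimate_cost` and `cheapest` are hypothetical.

```python
# Input/output prices as listed on this page for providers that show both.
# ASSUMPTION: prices are USD per 1,000,000 tokens (the page does not state the unit).
PRICES = {
    "IO.NET": (0.05, 0.22),
    "Chutes": (0.0543, 0.2174),
    "Meganova": (0.2, 0.6),
    "LLM Gateway": (1.4, 4.2),
}

TOKENS_PER_UNIT = 1_000_000  # assumed pricing unit


def estimate_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD of one request (hypothetical helper)."""
    input_price, output_price = PRICES[provider]
    total = input_tokens * input_price + output_tokens * output_price
    return total / TOKENS_PER_UNIT


def cheapest(input_tokens: int, output_tokens: int) -> str:
    """Return the provider with the lowest estimated cost for this workload."""
    return min(PRICES, key=lambda p: estimate_cost(p, input_tokens, output_tokens))
```

Because Chutes lists a lower output price than IO.NET but a higher input price, the cheapest provider under these assumptions depends on the ratio of input to output tokens, so a per-workload estimate can differ from the page's overall ordering.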

External links

LLM Stats · Artificial Analysis