Qwen2.5 VL 32B Instruct

Alibaba Cloud / Qwen Team · Qwen · Open Weight · Apache 2.0 · Commercial Use

Description

Qwen2.5-VL is a vision-language model from the Qwen family. Key enhancements include visual understanding (objects, text, charts, layouts), visual agent capabilities (tool use, computer/phone control), long video comprehension with event pinpointing, visual localization (bounding boxes/points), and structured output generation.

Release date

2025-02-28

Parameters

33.5B

Context length

—

Modalities

—

Capability radar

Axes: general, coding, reasoning, science (est.), agents, multimodal

Science uses a reasoning proxy when dedicated science benchmarks are not available.

Rankings

Domain	Rank	Score	Source
Agentic capability	115	33.0	LS
Multimodal ranking	74	66.0	LS

Benchmark scores (LLM Stats)

Category	Benchmark	Score
Agents	AITZ_EM	83.1% Aut.
Agents	AndroidWorld_SR	22.0% Aut.
Agents	OSWorld	5.9% Aut.
Biology	GPQA	46.0% Aut.
Code	HumanEval	91.5% Aut.
Finance	MMLU	78.4% Aut.
Finance	MMLU-Pro	68.8% Aut.
General	MBPP	0.84 / 100 Aut.
General	MMMU	70.0% Aut.
General	MMStar	69.5% Aut.
General	MMMU-Pro	49.5% Aut.
Grounding	ScreenSpot	88.5% Aut.
Grounding	ScreenSpot Pro	39.4% Aut.
Image To Text	DocVQA	94.8% Aut.
Image To Text	OCRBench-V2 (zh)	59.1% Aut.
Image To Text	OCRBench-V2 (en)	57.2% Aut.
Language	CharadesSTA	54.2% Aut.
Long Context	LVBench	49.0% Aut.
Math	MATH	82.2% Aut.
Math	MathVista-Mini	74.7% Aut.
Math	MathVision	38.4% Aut.
Multimodal	Android Control Low_EM	93.3% Aut.
Multimodal	InfoVQA	83.4% Aut.
Multimodal	VideoMME w sub.	77.9% Aut.
Multimodal	CC-OCR	77.1% Aut.
Multimodal	VideoMME w/o sub.	70.5% Aut.
Multimodal	Android Control High_EM	69.6% Aut.
Multimodal	MMBench-Video	1.9% Aut.

AA evaluation indices

No AA evaluation data available

LLM Stats category scores

Chart categories: Code, Structured Output, Text-to-image, Image To Text, Language, Legal, Math, Finance, Healthcare, Multimodal, Reasoning, Spatial Reasoning, Grounding, Vision, Long Context, Physics, General, Biology, Chemistry, Video, Agents

Pricing

No pricing data available

Speed

No speed data available

Provider price ranking

6 providers

Cheapest: IO.NET · Most expensive: LLM Gateway

Rank	Provider	Input	Output
1	IO.NET	$0.05	$0.22
2	Chutes	$0.0543	$0.2174
3	Meganova	$0.2	$0.6
4	SiliconFlow (China)	$0.27	not listed
5	SiliconFlow	$0.27	not listed
6	LLM Gateway	$1.4	$4.2

Compare prices across different API providers for this model.
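Because input and output are priced separately, the cheapest provider can depend on the shape of the workload. The following is a minimal sketch of that arithmetic. It assumes the listed prices are USD per 1 million tokens, which the page does not state. The names `ProviderPrice`, `request_cost`, and `cheapest` are hypothetical helpers, and the two SiliconFlow rows are left out because only one price is listed for them.

```python
from dataclasses import dataclass

# Assumption: prices are quoted in USD per 1,000,000 tokens (not stated in the source).
TOKENS_PER_PRICE_UNIT = 1_000_000


@dataclass(frozen=True)
class ProviderPrice:
    """Hypothetical container for one row of the provider price table."""
    name: str
    input_usd: float
    output_usd: float


# Rows copied from the provider table; rows without both prices are omitted.
PRICES = [
    ProviderPrice("IO.NET", 0.05, 0.22),
    ProviderPrice("Chutes", 0.0543, 0.2174),
    ProviderPrice("Meganova", 0.2, 0.6),
    ProviderPrice("LLM Gateway", 1.4, 4.2),
]


def request_cost(price: ProviderPrice, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token counts."""
    total = input_tokens * price.input_usd + output_tokens * price.output_usd
    return total / TOKENS_PER_PRICE_UNIT


def cheapest(prices: list[ProviderPrice], input_tokens: int, output_tokens: int) -> ProviderPrice:
    """Return the provider with the lowest estimated cost for this workload."""
    return min(prices, key=lambda p: request_cost(p, input_tokens, output_tokens))


if __name__ == "__main__":
    # Input-heavy workload (e.g. long documents, short answers).
    print(cheapest(PRICES, 1_000_000, 100_000).name)
    # Output-heavy workload (short prompts, long generations).
    print(cheapest(PRICES, 1_000, 1_000_000).name)
```

Under these assumptions the ranking is not fixed: IO.NET has the lower input price, but Chutes has a slightly lower output price, so an output-heavy workload can come out cheaper on Chutes.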

External sources

LLM Stats · Artificial Analysis