Qwen3 VL 32B Instruct

AlibabaQwenОткрытые весаApache 2.0 · Коммерческое использование

Описание

Qwen3-VL is a large multimodal model that unifies vision, language, and reasoning to achieve human-level perception and cognition across text, images, and video. Built on a 235B-parameter architecture, it integrates early joint training of visual and textual modalities for strong language grounding. The model supports up to a 1 million-token context window and excels at visual understanding, spatial reasoning, long video comprehension, and tool-based interaction. It can generate code from images, perform precise 2D/3D object grounding, and operate digital interfaces like a visual agent. The “Instruct” version rivals Gemini 2.5 Pro in perception benchmarks, while the “Thinking” version leads in multimodal reasoning and STEM tasks. With multilingual OCR, creative writing, and fine-grained scene interpretation, Qwen3-VL establishes a new open-source frontier for integrated vision-language intelligence.

Дата выхода

2025-10-21

Параметры

33.0B

Длина контекста

131K

Модальности

image, text

Радар способностей

general

coding

reasoning

scienceоцен.

agents

multimodal

Science использует прокси на основе рассуждений, когда специализированные научные бенчмарки недоступны.

Рейтинги

Домен	#Место	Оценка	Источник
Агентные возможности	61	55.0	LS
Рейтинг кодинга	253	37.0	AA
Общий рейтинг	286	38.0	AA
Математическое мышление	124	69.0	AA
Мультимодальный рейтинг	39	78.0	LS
Рассуждения	78	55.0	LS
Наука	261	43.0	AA

Оценки бенчмарков (LLM Stats)

3d

BLINK

67.3%Сам.

Agents

BFCL-v3

70.2%Сам.

OSWorld

32.6%Сам.

Biology

GPQA

68.9%Сам.

Chemistry

SuperGPQA

54.6%Сам.

Communication

MM-MT-Bench

8.40 / 100Сам.

WritingBench

82.9%Сам.

Multi-IF

72.0%Сам.

Creativity

Creative Writing v3

85.6%Сам.

Arena-Hard v2

64.7%Сам.

Finance

MMLU

86.4%Сам.

MMLU-Pro

78.6%Сам.

MMLU-ProX

73.4%Сам.

General

MMLU-Redux

89.8%Сам.

IFEval

84.7%Сам.

MLVU-M

82.1%Сам.

MMStar

77.7%Сам.

MMMU (val)

76.0%Сам.

Include

74.0%Сам.

LiveBench 20241125

72.2%Сам.

MMMU-Pro

65.3%Сам.

LiveCodeBench v6

43.8%Сам.

Grounding

ScreenSpot

95.8%Сам.

ScreenSpot Pro

57.9%Сам.

Image To Text

OCRBench

89.5%Сам.

OCRBench-V2 (en)

67.4%Сам.

OCRBench-V2 (zh)

59.2%Сам.

Language

CharadesSTA

61.2%Сам.

Long Context

LVBench

63.8%Сам.

Math

MathVista-Mini

83.8%Сам.

AIME 2025

66.2%Сам.

MathVision

63.4%Сам.

PolyMATH

40.5%Сам.

Multimodal

DocVQAtest

96.9%Сам.

CharXiv-D

90.5%Сам.

AI2D

89.5%Сам.

InfoVQAtest

87.0%Сам.

CC-OCR

80.3%Сам.

MVBench

72.8%Сам.

MuirBench

72.8%Сам.

CharXiv-R

62.8%Сам.

Reasoning

Hallusion Bench

63.8%Сам.

ERQA

48.8%Сам.

Spatial Reasoning

RealWorldQA

79.0%Сам.

Vision

ODinW

46.6%Сам.

Индексы оценки AA

Math Index

68.3

Intelligence Index

11.1

Mmlu Pro

0.8

Aime 25

0.7

Gpqa

0.7

Livecodebench

0.5

Ifbench

0.4

Lcr

0.3

Scicode

0.3

Tau2

0.3

Terminalbench Hard

0.1

Hle

0.1

Оценки категорий LLM Stats

Communication

Multimodal

Instruction Following

Language

Legal

Structured Output

Grounding

Creativity

Text-to-image

Writing

Image To Text

Math

Reasoning

Spatial Reasoning

Finance

General

Healthcare

Biology

Tool Calling

Video

Vision

Long Context

Physics

Chemistry

Agents

Economics

Цены

Цена ввода$0.7 / 1M токенов

Цена вывода$2.8 / 1M токенов

Смешанная цена (3:1)$1.225 / 1M токенов

Скорость

Токенов/сек70.9

Задержка первого токена1.15s

Время до первого ответа1.15s

Рейтинг цен провайдеров

4 провайдеров

Самый дешевый: SiliconFlow (China)Самый дорогой: Alibaba

ПровайдерВводВывод

1SiliconFlow (China)Самый дешевый

$0.2

$0.6

2SiliconFlow

$0.2

$0.6

3Vercel AI Gateway

$0.4

4AlibabaОсновной

$0.7

$2.8

Сравнение цен разных API-провайдеров для этой модели.

Внешние ссылки

LLM Stats Artificial Analysis