Qwen3 VL 4B (Reasoning)

Alibaba · Qwen · Open weights · Apache 2.0 · Commercial use

Description

Qwen3-VL is a family of large multimodal models that unifies vision, language, and reasoning across text, images, and video, aiming for human-level perception and cognition. The flagship model is built on a 235B-parameter architecture (the variant on this page has 4.0B parameters) and integrates early joint training of visual and textual modalities for strong language grounding. The family supports a context window of up to 1 million tokens and excels at visual understanding, spatial reasoning, long video comprehension, and tool-based interaction. It can generate code from images, perform precise 2D/3D object grounding, and operate digital interfaces as a visual agent. The “Instruct” version rivals Gemini 2.5 Pro in perception benchmarks, while the “Thinking” version leads in multimodal reasoning and STEM tasks. With multilingual OCR, creative writing, and fine-grained scene interpretation, Qwen3-VL establishes a new open-source frontier for integrated vision-language intelligence.

Release date

2025-10-14

Parameters

4.0B

Context length

Not listed

Modalities

image, text

Capability radar

[Radar chart. Axes: general, coding, reasoning, science (estimated), agents, multimodal. Scale: up to 100.]

Science uses a reasoning-based proxy when dedicated science benchmarks are unavailable.

Rankings

Domain	Rank	Score	Source
Agentic capabilities	74	53.0	LS
Coding ranking	351	22.0	AA
Overall ranking	375	30.0	AA
Mathematical reasoning	285	26.0	AA
Multimodal ranking	56	74.0	LS
Reasoning	82	54.0	LS
Science	402	27.0	AA

Sources: LS = LLM Stats, AA = Artificial Analysis.

Benchmark scores (LLM Stats)

All scores in this table are marked as self-reported.

Category	Benchmark	Score
3D	BLINK	63.4%
Agents	BFCL-v3	67.3%
Agents	OSWorld	31.4%
Biology	GPQA	64.1%
Chemistry	SuperGPQA	46.8%
Communication	MM-MT-Bench	7.70 / 100
Communication	WritingBench	84.0%
Communication	Multi-IF	73.6%
Creativity	Creative Writing v3	76.1%
Creativity	Arena-Hard v2	36.8%
Finance	MMLU	81.5%
Finance	MMLU-Pro	73.6%
Finance	MMLU-ProX	65.0%
General	MMLU-Redux	86.0%
General	IFEval	82.6%
General	MLVU-M	75.7%
General	MMStar	73.2%
General	MMMU (val)	70.8%
General	LiveBench 20241125	68.4%
General	Include	64.6%
General	MMMU-Pro	57.0%
General	LiveCodeBench v6	51.3%
Grounding	ScreenSpot	92.9%
Grounding	ScreenSpot Pro	49.2%
Healthcare	VideoMMMU	69.4%
Image To Text	OCRBench	80.8%
Image To Text	OCRBench-V2 (en)	61.8%
Image To Text	OCRBench-V2 (zh)	55.8%
Language	CharadesSTA	59.0%
Long Context	LVBench	53.5%
Math	MathVista-Mini	79.5%
Math	AIME 2025	74.5%
Math	MathVision	60.0%
Math	HMMT25	53.1%
Math	PolyMATH	44.6%
Multimodal	DocVQA (test)	94.2%
Multimodal	MMBench-V1.1	86.7%
Multimodal	AI2D	84.9%
Multimodal	CharXiv-D	83.9%
Multimodal	InfoVQA (test)	83.0%
Multimodal	MuirBench	75.0%
Multimodal	CC-OCR	73.8%
Multimodal	MVBench	69.3%
Multimodal	CharXiv-R	50.3%
Reasoning	Hallusion Bench	64.1%
Reasoning	ERQA	47.3%
Spatial Reasoning	RealWorldQA	73.2%
Vision	ODinW	39.4%

AA evaluation indices

Metric	Value
Math Index	25.7
Intelligence Index	7.9
MMLU-Pro	0.7
GPQA	0.5
IFBench	0.4
LiveCodeBench	0.3
AIME 25	0.3
LCR	0.2
SciCode	0.2
Tau2	0.2
HLE	0.0
TerminalBench Hard	0.0

LLM Stats category scores

[Chart; per-category values not recoverable. Scale: up to 100. Categories: Communication, Multimodal, Instruction Following, Structured Output, Image To Text, Language, Legal, Math, Reasoning, Finance, Grounding, Healthcare, Creativity, Text-to-image, Tool Calling, Vision, Writing, Physics, Spatial Reasoning, General, Biology, Chemistry, Video, Long Context, Agents, Economics.]

Pricing

Input price: Free

Output price: Free

Blended price (3:1): Free
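
The blended price (3:1) figure can be illustrated with a short sketch. It assumes, and the page does not confirm, that "3:1" means a weighted average with three parts input price to one part output price. The function name `blended_price` and the non-zero prices in the usage lines are hypothetical and serve only as illustration.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of input and output prices.

    Assumption: "3:1" weights input three times as heavily as output.
    Both prices must use the same unit (for example, USD per 1M tokens).
    """
    total_weight = input_weight + output_weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return (input_weight * input_price + output_weight * output_price) / total_weight


# The model on this page is listed as free for both input and output.
print(blended_price(0.0, 0.0))

# Hypothetical prices, not taken from this page.
print(blended_price(1.0, 3.0))
```

Under this assumption a model that is free for both input and output has a blended price of zero, which matches the "Free" entry.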

Speed

Tokens/sec: 0.0

Time to first token: 0.00 s

Time to first answer: 0.00 s

Provider price ranking

1 provider

#	Provider	Input	Output
1	DeepInfra

Comparison of prices across API providers for this model.

External links

LLM Stats · Artificial Analysis