Qwen3 VL 235B A22B (Reasoning)

Alibaba · Qwen · Open weights · Apache 2.0 · Commercial use

Description

Qwen3-VL-235B-A22B-Thinking is the most powerful vision-language model in the Qwen series, featuring 236B parameters with MoE architecture for reasoning-enhanced multimodal understanding. Key capabilities include: Visual Agent (operates PC/mobile GUIs, recognizes elements, invokes tools), Visual Coding (generates Draw.io/HTML/CSS/JS from images/videos), Advanced Spatial Perception (2D grounding and 3D grounding for spatial reasoning and embodied AI), Long Context & Video Understanding (native 256K context expandable to 1M, handles hours-long video with second-level indexing), Enhanced Multimodal Reasoning (excels in STEM/Math with causal analysis), Upgraded Visual Recognition (celebrities, anime, products, landmarks, flora/fauna), and Expanded OCR (32 languages, robust in low light/blur/tilt). Architecture innovations include Interleaved-MRoPE for positional embeddings, DeepStack for multi-level ViT feature fusion, and Text-Timestamp Alignment for precise video temporal modeling.

Release date

2025-09-23

Parameters

236.0B

Context length

131K

Modalities

image, text, video

Capability radar

[Radar chart; axes: general, coding, reasoning, science (estimated), agents, multimodal; scale up to 100]

Science uses a reasoning-based proxy when dedicated science benchmarks are unavailable.

Rankings

Domain	Rank	Score	Source
Agentic capabilities	19	66.0	LS
Coding	158	55.0	AA
Overall	165	55.0	AA
Mathematical reasoning	49	89.0	AA
Multimodal	73	67.0	LS
Reasoning	40	75.0	LS
Science	155	54.0	AA

Sources: LS = LLM Stats, AA = Artificial Analysis.

Benchmark scores (LLM Stats)

All scores in this table are marked as self-reported.

Category	Benchmark	Score
3D	Objectron	0.71 / 100
3D	BLINK	67.1%
3D	ARKitScenes	0.54 / 100
3D	SUNRGBD	0.35 / 100
3D	Hypersim	0.11 / 100
Agents	SIFO	0.77 / 100
Agents	BFCL-v3	71.9%
Agents	SIFO-Multiturn	0.71 / 100
Agents	OSWorld-G	0.68 / 100
Agents	OSWorld	38.1%
Chemistry	SuperGPQA	64.3%
Code	Design2Code	0.93 / 100
Communication	MM-MT-Bench	8.50 / 100
Communication	WritingBench	86.7%
Communication	Multi-IF	79.1%
Creativity	Creative Writing v3	85.7%
Embodied	EmbSpatialBench	0.84 / 100
Embodied	RoboSpatialHome	0.74 / 100
Factuality	SimpleQA	44.4%
Finance	MMLU	90.6%
Finance	MMLU-Pro	83.8%
Finance	MMLU-ProX	80.6%
General	MMLU-Redux	93.7%
General	IFEval	88.2%
General	MMMU (val)	80.6%
General	Include	80.0%
General	LiveBench 20241125	79.6%
General	MMStar	78.7%
General	LiveCodeBench v6	70.1%
General	MMMU-Pro	69.3%
General	SimpleVQA	0.61 / 100
Grounding	ScreenSpot	95.4%
Grounding	RefCOCO-avg	0.92 / 100
Grounding	RefSpatialBench	0.70 / 100
Grounding	ScreenSpot Pro	61.8%
Healthcare	VideoMMMU	80.0%
Image To Text	OCRBench	87.5%
Image To Text	OCRBench-V2 (en)	66.8%
Image To Text	OCRBench-V2 (zh)	63.5%
Instruction Following	MIABench	0.93 / 100
Language	CharadesSTA	63.5%
Long Context	MLVU	83.8%
Long Context	LVBench	63.6%
Long Context	MMLongBench-Doc	0.56 / 100
Math	AIME 2025	89.7%
Math	MathVista-Mini	85.8%
Math	MathVerse-Mini	0.85 / 100
Math	HMMT25	77.4%
Math	MathVision	74.6%
Math	Humanity's Last Exam	13.6%
Multimodal	DocVQA (test)	96.5%
Multimodal	MMBench-V1.1	90.6%
Multimodal	InfoVQA (test)	89.5%
Multimodal	AI2D	89.2%
Multimodal	CC-OCR	81.5%
Multimodal	MuirBench	80.1%
Multimodal	VideoMME w/o sub.	79.0%
Multimodal	CharXiv-R	66.1%
Multimodal	VisuLogic	0.34 / 100
Multimodal	ZEROBench-Sub	0.28 / 100
Multimodal	ZEROBench	0.04 / 100
Reasoning	ZebraLogic	97.3%
Reasoning	CountBench	0.94 / 100
Reasoning	Hallusion Bench	66.7%
Reasoning	ERQA	52.5%
Spatial Reasoning	RealWorldQA	81.3%
Vision	ODinW	43.2%

AA evaluation indices

Index	Value
Math Index	88.3
Intelligence Index	20.6
AIME 25	0.9
MMLU-Pro	0.8
GPQA	0.8
LiveCodeBench	0.6
LCR	0.6
IFBench	0.6
Tau2	0.5
SciCode	0.4
Terminal-Bench Hard	0.1
HLE	0.1

LLM Stats category scores

[Chart; scale up to 100; categories: Communication, Multimodal, Creativity, Writing, Instruction Following, Language, Legal, Math, Structured Output, Embodied, Finance, Grounding, Healthcare, Text-to-image, Video, Image To Text, Long Context, Reasoning, Spatial Reasoning, General, Tool Calling, Vision, Physics, Agents, Chemistry, Economics, Factuality]

Pricing

Input price: $0.84 / 1M tokens

Output price: $6.175 / 1M tokens

Blended price (3:1): $2.174 / 1M tokens
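
The blended figure is consistent with a 3:1 weighted average of the listed input and output prices: (3 × 0.84 + 6.175) / 4 ≈ 2.174. A minimal sketch of that calculation, assuming the ratio means three input tokens for every output token; the function name `blended_price` is illustrative and not part of any provider API:

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens for a given input:output token ratio."""
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


# Listed prices in USD per 1M tokens.
price = blended_price(0.84, 6.175)
print(f"Blended price (3:1): ${price:.4f} / 1M tokens")
```

A workload with a different input-to-output mix can pass its own ratio, for example `blended_price(0.84, 6.175, input_parts=1, output_parts=1)` for an even split.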

Speed

Tokens/sec: 57.2

Time to first token: 1.16s

Time to first answer: 36.11s

Provider price ranking

10 providers

Cheapest: Venice AI · Most expensive: NovitaAI

#	Provider	Input	Output
1	Venice AI (cheapest)	$0.25	$1.5
2	OpenRouter	$0.26	$2.6
3	Kilo Gateway	$0.26	$2.6
4	Alibaba (China)	$0.28671	$1.14682
5	SiliconFlow (China)	$0.45	$3.5
6	SiliconFlow	$0.45	$3.5
7	NanoGPT	$0.5	not listed
8	LLM Gateway	$0.5	not listed
9	Alibaba	$0.7	$2.8
10	NovitaAI	$0.98	$3.95

Price comparison across API providers for this model.
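
The listing orders providers by input price. For output-heavy reasoning workloads a blended price can give a different order. The sketch below re-ranks the providers with the same 3:1 weighting used for the blended price on this page. It assumes all provider prices are in USD per 1M tokens and skips the two providers whose output price does not appear in the listing; the names `PROVIDERS`, `blended`, and `rank_by_blended` are illustrative.

```python
from typing import Optional

# (provider, input price, output price); None marks an output price
# that does not appear in the listing.
PROVIDERS: list[tuple[str, float, Optional[float]]] = [
    ("Venice AI", 0.25, 1.5),
    ("OpenRouter", 0.26, 2.6),
    ("Kilo Gateway", 0.26, 2.6),
    ("Alibaba (China)", 0.28671, 1.14682),
    ("SiliconFlow (China)", 0.45, 3.5),
    ("SiliconFlow", 0.45, 3.5),
    ("NanoGPT", 0.5, None),
    ("LLM Gateway", 0.5, None),
    ("Alibaba", 0.7, 2.8),
    ("NovitaAI", 0.98, 3.95),
]


def blended(input_price: float, output_price: float,
            input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price for a given input:output token ratio."""
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


def rank_by_blended(providers):
    """Return (provider, blended price) pairs, cheapest first.

    Providers without a listed output price are left out rather than guessed.
    """
    complete = [(name, blended(inp, out))
                for name, inp, out in providers if out is not None]
    return sorted(complete, key=lambda item: item[1])


ranking = rank_by_blended(PROVIDERS)
for name, price in ranking:
    print(f"{name}: ${price:.3f}")
```

Under this weighting Alibaba (China) comes out lowest, because its output price is the lowest in the listing, even though Venice AI has the lowest input price.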

External links

LLM Stats · Artificial Analysis