Qwen3 VL 235B A22B (Reasoning)

Alibaba · Qwen · Open Weight · Apache 2.0 · Commercial OK

Description

Qwen3-VL-235B-A22B-Thinking is the most powerful vision-language model in the Qwen series: a 236B-parameter Mixture-of-Experts (MoE) model built for reasoning-enhanced multimodal understanding.

Key capabilities: Visual Agent (operates PC and mobile GUIs, recognizes elements, invokes tools); Visual Coding (generates Draw.io, HTML, CSS, and JS from images and videos); Advanced Spatial Perception (2D and 3D grounding for spatial reasoning and embodied AI); Long Context & Video Understanding (native 256K context, expandable to 1M; handles hours-long video with second-level indexing); Enhanced Multimodal Reasoning (excels in STEM and math, with causal analysis); Upgraded Visual Recognition (celebrities, anime, products, landmarks, flora and fauna); and Expanded OCR (32 languages; robust to low light, blur, and tilt).

Architecture innovations: Interleaved-MRoPE for positional embeddings, DeepStack for multi-level ViT feature fusion, and Text-Timestamp Alignment for precise video temporal modeling.

Release Date

2025-09-23

Parameters

236.0B

Context Length

131K

Modalities

image, text, video

Capability Radar

Radar chart axes: general, coding, reasoning, science (est.), agents, multimodal (scale up to 100).

Science uses a reasoning proxy when dedicated science benchmarks are unavailable.

Rankings

Domain	#Rank	Score	Source
Agents & Tools	24	66.0	LS
Code Ranking	171	47.0	AA
General Ranking	146	59.0	AA
Math Reasoning	49	89.0	AA
Multimodal Ranking	64	67.0	LS
Reasoning	37	75.0	LS
Science	137	56.0	AA

Benchmark Scores (LLM Stats)

Category	Benchmark	Score	Source
3D	Objectron	0.71 / 100	SR
3D	BLINK	67.1%	SR
3D	ARKitScenes	0.54 / 100	SR
3D	SUNRGBD	0.35 / 100	SR
3D	Hypersim	0.11 / 100	SR
Agents	SIFO	0.77 / 100	SR
Agents	BFCL-v3	71.9%	SR
Agents	SIFO-Multiturn	0.71 / 100	SR
Agents	OSWorld-G	0.68 / 100	SR
Agents	OSWorld	38.1%	SR
Chemistry	SuperGPQA	64.3%	SR
Code	Design2Code	0.93 / 100	SR
Communication	MM-MT-Bench	8.50 / 100	SR
Communication	WritingBench	86.7%	SR
Communication	Multi-IF	79.1%	SR
Creativity	Creative Writing v3	85.7%	SR
Embodied	EmbSpatialBench	0.84 / 100	SR
Embodied	RoboSpatialHome	0.74 / 100	SR
Factuality	SimpleQA	44.4%	SR
Finance	MMLU	90.6%	SR
Finance	MMLU-Pro	83.8%	SR
Finance	MMLU-ProX	80.6%	SR
General	MMLU-Redux	93.7%	SR
General	IFEval	88.2%	SR
General	MMMU (val)	80.6%	SR
General	Include	80.0%	SR
General	LiveBench 20241125	79.6%	SR
General	MMStar	78.7%	SR
General	LiveCodeBench v6	70.1%	SR
General	MMMU-Pro	69.3%	SR
General	SimpleVQA	0.61 / 100	SR
Grounding	ScreenSpot	95.4%	SR
Grounding	RefCOCO-avg	0.92 / 100	SR
Grounding	RefSpatialBench	0.70 / 100	SR
Grounding	ScreenSpot Pro	61.8%	SR
Healthcare	VideoMMMU	80.0%	SR
Image To Text	OCRBench	87.5%	SR
Image To Text	OCRBench-V2 (en)	66.8%	SR
Image To Text	OCRBench-V2 (zh)	63.5%	SR
Instruction Following	MIABench	0.93 / 100	SR
Language	CharadesSTA	63.5%	SR
Long Context	MLVU	83.8%	SR
Long Context	LVBench	63.6%	SR
Long Context	MMLongBench-Doc	0.56 / 100	SR
Math	AIME 2025	89.7%	SR
Math	MathVista-Mini	85.8%	SR
Math	MathVerse-Mini	0.85 / 100	SR
Math	HMMT25	77.4%	SR
Math	MathVision	74.6%	SR
Math	Humanity's Last Exam	13.6%	SR
Multimodal	DocVQA (test)	96.5%	SR
Multimodal	MMBench-V1.1	90.6%	SR
Multimodal	InfoVQA (test)	89.5%	SR
Multimodal	AI2D	89.2%	SR
Multimodal	CC-OCR	81.5%	SR
Multimodal	MuirBench	80.1%	SR
Multimodal	VideoMME w/o sub.	79.0%	SR
Multimodal	CharXiv-R	66.1%	SR
Multimodal	VisuLogic	0.34 / 100	SR
Multimodal	ZEROBench-Sub	0.28 / 100	SR
Multimodal	ZEROBench	0.04 / 100	SR
Reasoning	ZebraLogic	97.3%	SR
Reasoning	CountBench	0.94 / 100	SR
Reasoning	Hallusion Bench	66.7%	SR
Reasoning	ERQA	52.5%	SR
Spatial Reasoning	RealWorldQA	81.3%	SR
Vision	ODinW	43.2%	SR

AA Evaluation Indices

Metric	Value
Math Index	88.3
Intelligence Index	27.6
Coding Index	20.9
AIME 25	0.9
MMLU-Pro	0.8
GPQA	0.8
LiveCodeBench	0.6
LCR	0.6
IFBench	0.6
Tau2	0.5
SciCode	0.4
TerminalBench Hard	0.1
HLE	0.1

LLM Stats Category Scores

Chart categories (scale up to 100): Communication, Multimodal, Writing, Creativity, Structured Output, Text-to-image, Video, Embodied, Finance, Grounding, Healthcare, Instruction Following, Language, Legal, Math, Spatial Reasoning, Tool Calling, Vision, General, Image To Text, Long Context, Reasoning, Agents, Chemistry, Economics, Physics, Factuality.

Pricing

Input Price: $0.84 / 1M tokens

Output Price: $6.175 / 1M tokens

Blended Price (3:1): $2.174 / 1M tokens
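
The blended figure can be checked by hand. The sketch below assumes "3:1" means a weighted average of three parts input price to one part output price; the page does not define the ratio, and the function name `blended_price` is hypothetical. `Decimal` is used so the rounding to three places is exact.

```python
from decimal import Decimal

def blended_price(input_price: Decimal, output_price: Decimal,
                  input_weight: int = 3, output_weight: int = 1) -> Decimal:
    """Weighted average price per 1M tokens.

    Assumption: the 3:1 ratio weights input tokens three times as
    heavily as output tokens. This is not stated on the page.
    """
    total_weight = input_weight + output_weight
    weighted_sum = input_weight * input_price + output_weight * output_price
    return weighted_sum / total_weight

# Prices listed above, in USD per 1M tokens.
blended = blended_price(Decimal("0.84"), Decimal("6.175"))
rounded = blended.quantize(Decimal("0.001"))

print(f"${rounded} / 1M tokens")
```

Under that assumption, (3 × 0.84 + 6.175) / 4 = 2.17375, which rounds to the listed $2.174.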

Speed

Tokens/sec: 30.0 tokens/s

Time to First Token: 1.35s

Time to Answer: 68.10s

Available Providers

(LS internal units)

Provider	Input Price	Output Price
DeepInfra	450K	3.5M
Novita	980K	4.0M

External Sources

LLM Stats