Qwen3 VL 235B A22B (Reasoning)

Alibaba · Qwen · Open weights · Apache 2.0 · Commercial use permitted

Description

Qwen3-VL-235B-A22B-Thinking is the most powerful vision-language model in the Qwen series, a 236B-parameter Mixture-of-Experts (MoE) model built for reasoning-enhanced multimodal understanding.

Key capabilities:
Visual Agent: operates PC and mobile GUIs, recognizes interface elements, and invokes tools.
Visual Coding: generates Draw.io, HTML, CSS, and JS from images and videos.
Advanced Spatial Perception: 2D and 3D grounding for spatial reasoning and embodied AI.
Long Context & Video Understanding: native 256K context, expandable to 1M; handles hours-long video with second-level indexing.
Enhanced Multimodal Reasoning: excels in STEM and math, with causal analysis.
Upgraded Visual Recognition: celebrities, anime, products, landmarks, flora and fauna.
Expanded OCR: 32 languages; robust to low light, blur, and tilt.

Architecture innovations include Interleaved-MRoPE for positional embeddings, DeepStack for multi-level ViT feature fusion, and Text-Timestamp Alignment for precise video temporal modeling.

Release date

2025-09-23

Parameters

236.0B

Context length

131K

Supported modalities

image, text, video

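Capability radar chart

[Radar chart, scale to 100. Axes: general, coding, reasoning, science (estimated), agents, multimodal.]

Science is estimated from reasoning ability as a proxy when no dedicated science evaluation is available.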

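Leaderboard rankings

Category	Rank	Score	Source
Agent capability	19	66.0	LS
Coding	158	55.0	AA
General capability	165	55.0	AA
Math reasoning	49	89.0	AA
Multimodal	73	67.0	LS
Reasoning	40	75.0	LS
Science	155	54.0	AA

Sources: LS = LLM Stats, AA = Artificial Analysis.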

Benchmark scores (LLM Stats)

All scores below are self-reported.

Category	Benchmark	Score
3D	Objectron	0.71 / 100
3D	BLINK	67.1%
3D	ARKitScenes	0.54 / 100
3D	SUNRGBD	0.35 / 100
3D	Hypersim	0.11 / 100
Agents	SIFO	0.77 / 100
Agents	BFCL-v3	71.9%
Agents	SIFO-Multiturn	0.71 / 100
Agents	OSWorld-G	0.68 / 100
Agents	OSWorld	38.1%
Chemistry	SuperGPQA	64.3%
Code	Design2Code	0.93 / 100
Communication	MM-MT-Bench	8.50 / 100
Communication	WritingBench	86.7%
Communication	Multi-IF	79.1%
Creativity	Creative Writing v3	85.7%
Embodied	EmbSpatialBench	0.84 / 100
Embodied	RoboSpatialHome	0.74 / 100
Factuality	SimpleQA	44.4%
Finance	MMLU	90.6%
Finance	MMLU-Pro	83.8%
Finance	MMLU-ProX	80.6%
General	MMLU-Redux	93.7%
General	IFEval	88.2%
General	MMMU (val)	80.6%
General	Include	80.0%
General	LiveBench 20241125	79.6%
General	MMStar	78.7%
General	LiveCodeBench v6	70.1%
General	MMMU-Pro	69.3%
General	SimpleVQA	0.61 / 100
Grounding	ScreenSpot	95.4%
Grounding	RefCOCO-avg	0.92 / 100
Grounding	RefSpatialBench	0.70 / 100
Grounding	ScreenSpot Pro	61.8%
Healthcare	VideoMMMU	80.0%
Image To Text	OCRBench	87.5%
Image To Text	OCRBench-V2 (en)	66.8%
Image To Text	OCRBench-V2 (zh)	63.5%
Instruction Following	MIABench	0.93 / 100
Language	CharadesSTA	63.5%
Long Context	MLVU	83.8%
Long Context	LVBench	63.6%
Long Context	MMLongBench-Doc	0.56 / 100
Math	AIME 2025	89.7%
Math	MathVista-Mini	85.8%
Math	MathVerse-Mini	0.85 / 100
Math	HMMT25	77.4%
Math	MathVision	74.6%
Math	Humanity's Last Exam	13.6%
Multimodal	DocVQA (test)	96.5%
Multimodal	MMBench-V1.1	90.6%
Multimodal	InfoVQA (test)	89.5%
Multimodal	AI2D	89.2%
Multimodal	CC-OCR	81.5%
Multimodal	MuirBench	80.1%
Multimodal	VideoMME w/o sub.	79.0%
Multimodal	CharXiv-R	66.1%
Multimodal	VisuLogic	0.34 / 100
Multimodal	ZEROBench-Sub	0.28 / 100
Multimodal	ZEROBench	0.04 / 100
Reasoning	ZebraLogic	97.3%
Reasoning	CountBench	0.94 / 100
Reasoning	Hallusion Bench	66.7%
Reasoning	ERQA	52.5%
Spatial Reasoning	RealWorldQA	81.3%
Vision	ODinW	43.2%

AA evaluation indices

Metric	Value
Math Index	88.3
Intelligence Index	20.6
AIME 25	0.9
MMLU Pro	0.8
GPQA	0.8
LiveCodeBench	0.6
LCR	0.6
IFBench	0.6
Tau2	0.5
SciCode	0.4
TerminalBench Hard	0.1
HLE	0.1

LLM Stats category scores

[Radar chart, scale to 100. Axes: Communication, Multimodal, Creativity, Writing, Instruction Following, Language, Legal, Math, Structured Output, Embodied, Finance, Grounding, Healthcare, Text-to-image, Video, Image To Text, Long Context, Reasoning, Spatial Reasoning, General, Tool Calling, Vision, Physics, Agents, Chemistry, Economics, Factuality.]

Pricing

Input price: $0.84 / 1M tokens

Output price: $6.175 / 1M tokens

Blended price (3:1 input:output): $2.174 / 1M tokens
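
The blended figure is consistent with a weighted average of the input and output prices at three input tokens per output token. The sketch below shows that arithmetic; the function name `blended_price` and its parameters are illustrative and do not come from the listing.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens for a given input:output token ratio."""
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


# Listed prices in USD per 1M tokens.
price = blended_price(0.84, 6.175)
print(f"blended: ${price:.5f} / 1M tokens")
```

The result, (3 × 0.84 + 6.175) / 4 = 2.17375, rounds to the listed $2.174. Workloads with a different input:output mix can pass other values for `input_parts` and `output_parts`.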

Speed

Tokens per second: 57.2

Time to first token: 1.16 s

Time to first answer: 36.11 s
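
The listing does not define these three metrics, so the sketch below rests on assumptions: that time to first answer is the delay until the first token of the final answer (after the model's reasoning output), and that tokens are generated at a constant rate. Under those assumptions it gives a rough estimate of how many tokens precede the answer and how long a full response might take. The function names are illustrative.

```python
# Figures from the listing above.
TOKENS_PER_SECOND = 57.2
TIME_TO_FIRST_TOKEN_S = 1.16
TIME_TO_FIRST_ANSWER_S = 36.11


def implied_pre_answer_tokens() -> float:
    """Tokens generated between the first token and the first answer token.

    Assumes a constant generation rate, which is a simplification.
    """
    return (TIME_TO_FIRST_ANSWER_S - TIME_TO_FIRST_TOKEN_S) * TOKENS_PER_SECOND


def estimated_response_seconds(answer_tokens: int) -> float:
    """Rough end-to-end time for a response with the given answer length."""
    return TIME_TO_FIRST_ANSWER_S + answer_tokens / TOKENS_PER_SECOND


print(f"implied pre-answer tokens: {implied_pre_answer_tokens():.0f}")
print(f"estimated time for a 500-token answer: {estimated_response_seconds(500):.1f} s")
```

Treat both numbers as order-of-magnitude estimates only: actual reasoning length varies with the prompt, and throughput differs by provider.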

Provider price ranking

10 providers

Cheapest: Venice AI · Most expensive: NovitaAI

Rank	Provider	Input	Output
1	Venice AI (cheapest)	$0.25	$1.5
2	OpenRouter	$0.26	$2.6
3	Kilo Gateway	$0.26	$2.6
4	Alibaba (China)	$0.28671	$1.14682
5	SiliconFlow (China)	$0.45	$3.5
6	SiliconFlow	$0.45	$3.5
7	NanoGPT	$0.5	not listed
8	LLM Gateway	$0.5	not listed
9	Alibaba	$0.7	$2.8
10	NovitaAI	$0.98	$3.95

Compare this model's pricing across API providers.
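
The ranking above appears to be ordered by input price, so the cheapest provider for a given workload can differ once output tokens are counted. The sketch below compares the cost of one hypothetical workload across the providers that list both prices. It assumes the figures are USD per 1M tokens, as in the Pricing section; NanoGPT and LLM Gateway are left out because only one price is listed for them. All names in the code are illustrative.

```python
# (input, output) prices as listed above; assumed to be USD per 1M tokens.
PROVIDERS = {
    "Venice AI": (0.25, 1.5),
    "OpenRouter": (0.26, 2.6),
    "Kilo Gateway": (0.26, 2.6),
    "Alibaba (China)": (0.28671, 1.14682),
    "SiliconFlow (China)": (0.45, 3.5),
    "SiliconFlow": (0.45, 3.5),
    "Alibaba": (0.7, 2.8),
    "NovitaAI": (0.98, 3.95),
}


def workload_cost(input_price: float, output_price: float,
                  input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a workload, given per-1M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rank_providers(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Return (provider, cost) pairs sorted from cheapest to most expensive."""
    costs = [
        (name, workload_cost(inp, out, input_tokens, output_tokens))
        for name, (inp, out) in PROVIDERS.items()
    ]
    return sorted(costs, key=lambda item: item[1])


# Hypothetical workload: 3M input tokens and 1M output tokens.
ranked = rank_providers(3_000_000, 1_000_000)
for name, cost in ranked:
    print(f"{name}: ${cost:.2f}")
```

For this 3:1 mix, Alibaba (China) comes out lowest at about $2.01, below Venice AI at $2.25, while NovitaAI remains the most expensive.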

External links

LLM Stats · Artificial Analysis