Qwen3 VL 235B A22B (Reasoning)

Alibaba · Qwen · Open weights · Apache 2.0 · Commercial use permitted

Description

Qwen3-VL-235B-A22B-Thinking is the most powerful vision-language model in the Qwen series, featuring 236B parameters with MoE architecture for reasoning-enhanced multimodal understanding. Key capabilities include: Visual Agent (operates PC/mobile GUIs, recognizes elements, invokes tools), Visual Coding (generates Draw.io/HTML/CSS/JS from images/videos), Advanced Spatial Perception (2D grounding and 3D grounding for spatial reasoning and embodied AI), Long Context & Video Understanding (native 256K context expandable to 1M, handles hours-long video with second-level indexing), Enhanced Multimodal Reasoning (excels in STEM/Math with causal analysis), Upgraded Visual Recognition (celebrities, anime, products, landmarks, flora/fauna), and Expanded OCR (32 languages, robust in low light/blur/tilt). Architecture innovations include Interleaved-MRoPE for positional embeddings, DeepStack for multi-level ViT feature fusion, and Text-Timestamp Alignment for precise video temporal modeling.

Release date

2025-09-23

Parameter count

236.0B

Context length

131K

Supported modalities

image, text, video

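Capability radar chart

[Radar chart. Axes: general, coding, reasoning, science (estimated), agents, multimodal. Axis tick: 100.]

When no dedicated science evaluation is available, the Science score is estimated using reasoning ability as a proxy.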

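Leaderboard rankings

Category	Rank	Score	Source
Agent capability	19	66.0	LS
Coding	158	55.0	AA
General capability	165	55.0	AA
Math reasoning	49	89.0	AA
Multimodal	73	67.0	LS
Reasoning	40	75.0	LS
Science	155	54.0	AA

Sources: LS = LLM Stats, AA = Artificial Analysis.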

Benchmark scores (LLM Stats)

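3D

Objectron	0.71 / 100 (self-reported)
BLINK	67.1% (self-reported)
ARKitScenes	0.54 / 100 (self-reported)
SUNRGBD	0.35 / 100 (self-reported)
Hypersim	0.11 / 100 (self-reported)

Agents

SIFO	0.77 / 100 (self-reported)
BFCL-v3	71.9% (self-reported)
SIFO-Multiturn	0.71 / 100 (self-reported)
OSWorld-G	0.68 / 100 (self-reported)
OSWorld	38.1% (self-reported)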

Chemistry

SuperGPQA	64.3% (self-reported)

Code

Design2Code	0.93 / 100 (self-reported)

Communication

MM-MT-Bench	8.50 / 100 (self-reported)
WritingBench	86.7% (self-reported)
Multi-IF	79.1% (self-reported)

Creativity

Creative Writing v3	85.7% (self-reported)

Embodied

EmbSpatialBench	0.84 / 100 (self-reported)
RoboSpatialHome	0.74 / 100 (self-reported)

Factuality

SimpleQA	44.4% (self-reported)

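Finance

MMLU	90.6% (self-reported)
MMLU-Pro	83.8% (self-reported)
MMLU-ProX	80.6% (self-reported)

General

MMLU-Redux	93.7% (self-reported)
IFEval	88.2% (self-reported)
MMMU (val)	80.6% (self-reported)
Include	80.0% (self-reported)
LiveBench 20241125	79.6% (self-reported)
MMStar	78.7% (self-reported)
LiveCodeBench v6	70.1% (self-reported)
MMMU-Pro	69.3% (self-reported)
SimpleVQA	0.61 / 100 (self-reported)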

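Grounding

ScreenSpot	95.4% (self-reported)
RefCOCO-avg	0.92 / 100 (self-reported)
RefSpatialBench	0.70 / 100 (self-reported)
ScreenSpot Pro	61.8% (self-reported)

Healthcare

VideoMMMU	80.0% (self-reported)

Image To Text

OCRBench	87.5% (self-reported)
OCRBench-V2 (en)	66.8% (self-reported)
OCRBench-V2 (zh)	63.5% (self-reported)

Instruction Following

MIABench	0.93 / 100 (self-reported)

Language

CharadesSTA	63.5% (self-reported)

Long Context

MLVU	83.8% (self-reported)
LVBench	63.6% (self-reported)
MMLongBench-Doc	0.56 / 100 (self-reported)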

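Math

AIME 2025	89.7% (self-reported)
MathVista-Mini	85.8% (self-reported)
MathVerse-Mini	0.85 / 100 (self-reported)
HMMT25	77.4% (self-reported)
MathVision	74.6% (self-reported)
Humanity's Last Exam	13.6% (self-reported)

Multimodal

DocVQA (test)	96.5% (self-reported)
MMBench-V1.1	90.6% (self-reported)
InfoVQA (test)	89.5% (self-reported)
AI2D	89.2% (self-reported)
CC-OCR	81.5% (self-reported)
MuirBench	80.1% (self-reported)
VideoMME w/o sub.	79.0% (self-reported)
CharXiv-R	66.1% (self-reported)
VisuLogic	0.34 / 100 (self-reported)
ZEROBench-Sub	0.28 / 100 (self-reported)
ZEROBench	0.04 / 100 (self-reported)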

Reasoning

ZebraLogic	97.3% (self-reported)
CountBench	0.94 / 100 (self-reported)
Hallusion Bench	66.7% (self-reported)
ERQA	52.5% (self-reported)

Spatial Reasoning

RealWorldQA	81.3% (self-reported)

Vision

ODinW	43.2% (self-reported)

Artificial Analysis (AA) evaluation indices

Metric	Score
Math Index	88.3
Intelligence Index	20.6
AIME 25	0.9
MMLU-Pro	0.8
GPQA	0.8
LiveCodeBench	0.6
LCR	0.6
IFBench	0.6
Tau2	0.5
SciCode	0.4
Terminal-Bench Hard	0.1
HLE	0.1

LLM Stats category scores

[Chart of per-category scores. Axis tick: 100. Categories: Communication, Multimodal, Creativity, Writing, Instruction Following, Language, Legal, Math, Structured Output, Embodied, Finance, Grounding, Healthcare, Text-to-image, Video, Image To Text, Long Context, Reasoning, Spatial Reasoning, General, Tool Calling, Vision, Physics, Agents, Chemistry, Economics, Factuality.]

Pricing

Input price: $0.84 / 1M tokens

Output price: $6.175 / 1M tokens

Blended price (3:1): $2.174 / 1M tokens
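
The blended figure appears to be a weighted average of the input and output prices. The sketch below assumes that "3:1" means three input tokens for every output token; the page does not define the ratio, so treat the weighting as an assumption. The function name `blended_price` is illustrative.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of input and output prices (USD per 1M tokens).

    Assumption: the 3:1 ratio weights input tokens 3x and output tokens 1x.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Prices as listed in the pricing section, in USD per 1M tokens.
INPUT_PRICE = 0.84
OUTPUT_PRICE = 6.175

blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
print(f"Blended (3:1): ${blended:.3f} / 1M tokens")
```

Under this assumption the result is (3 × 0.84 + 6.175) / 4 = 2.17375, which rounds to the listed $2.174.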

Speed

Tokens per second: 57.2

Time to first token: 1.16 s

Time to first answer: 36.11 s
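
The gap between the first token and the first answer is typical of a reasoning model, which emits thinking tokens before its answer. The sketch below is a rough back-of-the-envelope estimate, assuming the gap is spent generating reasoning tokens at the reported throughput; the page does not say how these metrics were measured. The 500-token answer length and both function names are hypothetical.

```python
def implied_reasoning_tokens(ttft_s: float, first_answer_s: float,
                             tokens_per_s: float) -> float:
    """Tokens that could be generated between the first token and the first answer token.

    Assumption: the model streams at a constant `tokens_per_s` during that gap.
    """
    if first_answer_s < ttft_s:
        raise ValueError("first-answer latency cannot be smaller than first-token latency")
    return (first_answer_s - ttft_s) * tokens_per_s


def estimated_response_time(first_answer_s: float, answer_tokens: int,
                            tokens_per_s: float) -> float:
    """Rough wall-clock time to receive a complete answer of `answer_tokens` tokens."""
    return first_answer_s + answer_tokens / tokens_per_s


# Values as listed in the speed section.
TOKENS_PER_S = 57.2
TTFT_S = 1.16
FIRST_ANSWER_S = 36.11

thinking = implied_reasoning_tokens(TTFT_S, FIRST_ANSWER_S, TOKENS_PER_S)
# Hypothetical answer length of 500 tokens, chosen only for illustration.
total = estimated_response_time(FIRST_ANSWER_S, answer_tokens=500, tokens_per_s=TOKENS_PER_S)

print(f"Implied reasoning tokens: {thinking:.0f}")
print(f"Estimated time for a 500-token answer: {total:.1f} s")
```

With the listed numbers, the gap of about 35 seconds corresponds to roughly 2,000 tokens at 57.2 tokens per second, and a 500-token answer would take about 45 seconds end to end.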

Provider price ranking

10 providers

Cheapest: Venice AI · Most expensive: NovitaAI

Rank	Provider	Input	Output
1	Venice AI	$0.25	$1.5
2	OpenRouter	$0.26	$2.6
3	Kilo Gateway	$0.26	$2.6
4	Alibaba (China)	$0.28671	$1.14682
5	SiliconFlow (China)	$0.45	$3.5
6	SiliconFlow	$0.45	$3.5
7	NanoGPT	$0.5	(not listed)
8	LLM Gateway	$0.5	(not listed)
9	Alibaba	$0.7	$2.8
10	NovitaAI	$0.98	$3.95

Compares this model's pricing across API providers.
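
The ranking above is ordered by input price, but the cheapest provider for a given job also depends on how many output tokens it produces. The sketch below estimates the cost of a hypothetical workload per provider, assuming the listed prices are in USD per 1M tokens, as in the pricing section. NanoGPT and LLM Gateway are left out because only one price is listed for each. The `Provider` type, the function names, and the token counts are illustrative, not part of the source.

```python
from typing import List, NamedTuple


class Provider(NamedTuple):
    name: str
    input_price: float   # USD per 1M input tokens (unit assumed)
    output_price: float  # USD per 1M output tokens (unit assumed)


# Providers from the ranking that list both an input and an output price.
PROVIDERS = [
    Provider("Venice AI", 0.25, 1.5),
    Provider("OpenRouter", 0.26, 2.6),
    Provider("Kilo Gateway", 0.26, 2.6),
    Provider("Alibaba (China)", 0.28671, 1.14682),
    Provider("SiliconFlow (China)", 0.45, 3.5),
    Provider("SiliconFlow", 0.45, 3.5),
    Provider("Alibaba", 0.7, 2.8),
    Provider("NovitaAI", 0.98, 3.95),
]


def job_cost(p: Provider, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one job with the given token counts."""
    return (p.input_price * input_tokens + p.output_price * output_tokens) / 1_000_000


def rank_by_cost(providers: List[Provider], input_tokens: int,
                 output_tokens: int) -> List[Provider]:
    """Providers sorted from cheapest to most expensive for this workload."""
    return sorted(providers, key=lambda p: job_cost(p, input_tokens, output_tokens))


# Hypothetical workload: 10,000 input tokens and 5,000 output tokens per request.
ranked = rank_by_cost(PROVIDERS, input_tokens=10_000, output_tokens=5_000)
for p in ranked:
    print(f"{p.name}: ${job_cost(p, 10_000, 5_000):.5f}")
```

For output-heavy workloads such as this one, a provider with a slightly higher input price but a lower output price can end up cheaper overall than the provider ranked first by input price.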

External links

LLM Stats · Artificial Analysis