Qwen3.6 Plus
Alibaba · Qwen · Proprietary
Description
Qwen3.6 Plus is Alibaba's next-generation flagship model featuring a 1 million token native context window, up to 65,536 output tokens, and always-on chain-of-thought reasoning. It uses a next-generation hybrid architecture optimized for efficiency and scalability. It leads on Terminal-Bench 2.0 agentic coding (61.6), surpassing Claude 4.5 Opus, and achieves strong results on document understanding (OmniDocBench 1.5: 91.2) and multimodal reasoning (MMMU: 86.0). Compared to Qwen 3.5, it reasons more decisively, spends fewer tokens on straightforward tasks, and is more stable in agent workflows.
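The two limits in the description (context window and maximum output) can be turned into a simple token-budget check. The sketch below is illustrative only: the function name `plan_output_budget` is hypothetical, the window is assumed to be exactly 1,000,000 tokens, and it assumes that prompt and output tokens share the same window, which the page does not confirm.

```python
# Limits taken from the description; the exact values are assumptions.
CONTEXT_WINDOW = 1_000_000  # "1 million token" window, assumed to be exactly 1,000,000
MAX_OUTPUT_TOKENS = 65_536  # maximum output tokens stated in the description


def plan_output_budget(prompt_tokens: int, requested_output: int) -> int:
    """Return how many output tokens can be requested for a given prompt.

    Assumption (not confirmed by the source): prompt and output tokens
    are counted against the same context window.
    """
    if prompt_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens >= CONTEXT_WINDOW:
        raise ValueError("prompt does not fit in the context window")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # The request is capped by both the output limit and the remaining window.
    return min(requested_output, MAX_OUTPUT_TOKENS, remaining)
```

For example, a 990,000-token prompt would leave room for at most 10,000 output tokens under these assumptions, even though the model's output cap is higher.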
Release date: 2026-04-02
Parameters: not listed
Context length: 1.0M
Modalities: image, text, video
Capability radar
| Capability | Score |
|---|---|
| general | 45 |
| coding | 43 |
| reasoning | 88 |
| science (est.) | 59 |
| agents | 60 |
| multimodal | 90 |

Science uses a reasoning-based proxy when specialized science benchmarks are unavailable.
Rankings
| Domain | Rank | Score | Source |
|---|---|---|---|
| Agents & Tools | 44 | 58.0 | LS |
| Code Ranking | 31 | 78.0 | AA |
| General Ranking | 15 | 88.0 | AA |
| Multimodal Ranking | 14 | 87.0 | LS |
| Reasoning | 28 | 82.0 | LS |
| Science | 47 | 73.0 | AA |
Benchmark scores (LLM Stats)
All scores in this table are self-reported.

| Category | Benchmark | Score |
|---|---|---|
| Agents | WideSearch | 74.3% |
| Agents | MCP Atlas | 74.1% |
| Agents | TAU3-Bench | 70.7% |
| Agents | OSWorld-Verified | 62.5% |
| Agents | TIR-Bench | 61.6% |
| Agents | Terminal-Bench 2.0 | 61.6% |
| Agents | Claw-Eval | 58.7% |
| Agents | SWE-Bench Pro | 56.6% |
| Agents | MCP-Mark | 48.2% |
| Agents | SkillsBench | 45.7% |
| Agents | VITA-Bench | 44.3% |
| Agents | DeepPlanning | 41.5% |
| Agents | Toolathlon | 39.8% |
| Agents | NL2Repo | 37.9% |
| Biology | GPQA | 90.4% |
| Chemistry | SuperGPQA | 71.6% |
| Code | SWE-Bench Verified | 78.8% |
| Code | SWE-bench Multilingual | 73.8% |
| Finance | MMLU-Pro | 88.5% |
| Finance | MMLU-ProX | 84.7% |
| General | MMLU-Redux | 94.5% |
| General | IFEval | 94.3% |
| General | C-Eval | 93.3% |
| General | Global PIQA | 89.8% |
| General | MMMLU | 89.5% |
| General | MAXIFE | 88.2% |
| General | LiveCodeBench v6 | 87.1% |
| General | MMMU | 86.0% |
| General | Include | 85.1% |
| General | MMStar | 83.3% |
| General | MMMU-Pro | 78.8% |
| General | IFBench | 74.2% |
| General | SimpleVQA | 0.67 / 100 |
| General | LongBench v2 | 62.0% |
| General | NOVA-63 | 57.9% |
| Grounding | RefCOCO-avg | 0.94 / 100 |
| Grounding | ScreenSpot Pro | 68.2% |
| Healthcare | VideoMMMU | 84.0% |
| Language | WMT24++ | 84.3% |
| Long Context | MLVU | 86.7% |
| Long Context | AA-LCR | 68.3% |
| Long Context | MMLongBench-Doc | 0.62 / 100 |
| Math | HMMT 2025 | 96.7% |
| Math | AIME 2026 | 95.3% |
| Math | HMMT25 | 94.6% |
| Math | We-Math | 89.0% |
| Math | DynaMath | 88.0% |
| Math | MathVision | 88.0% |
| Math | HMMT Feb 26 | 87.8% |
| Math | IMO-AnswerBench | 83.8% |
| Math | PolyMATH | 77.4% |
| Math | Humanity's Last Exam | 28.8% |
| Multimodal | V* | 96.9% |
| Multimodal | AI2D | 94.4% |
| Multimodal | OmniDocBench 1.5 | 91.2% |
| Multimodal | Video-MME | 84.2% |
| Multimodal | CC-OCR | 83.4% |
| Multimodal | CharXiv-R | 81.5% |
| Reasoning | CountBench | 0.98 / 100 |
| Reasoning | ERQA | 65.7% |
| Spatial Reasoning | RealWorldQA | 85.4% |
| Vision | ODinW | 51.8% |
AA evaluation indices
| Index | Value |
|---|---|
| Intelligence Index | 50.0 |
| Coding Index | 42.9 |
| Tau2 | 1.0 |
| GPQA | 0.9 |
| IFBench | 0.8 |
| LCR | 0.7 |
| Terminal-Bench Hard | 0.4 |
| SciCode | 0.4 |
| HLE | 0.3 |
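LLM Stats category scores
| Category | Score |
|---|---|
| Video | 90 |
| Biology | 90 |
| Language | 90 |
| Spatial Reasoning | 80 |
| Structured Output | 80 |
| Text-to-image | 80 |
| Vision | 80 |
| Chemistry | 80 |
| Finance | 80 |
| Frontend Development | 80 |
| General | 80 |
| Grounding | 80 |
| Healthcare | 80 |
| Instruction Following | 80 |
| Legal | 80 |
| Math | 80 |
| Multimodal | 80 |
| Physics | 80 |
| Reasoning | 80 |
| Code | 70 |
| Economics | 70 |
| Image To Text | 70 |
| Long Context | 70 |
| Search | 70 |
| Tool Calling | 60 |
| Agents | 60 |
| Coding | 50 |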
Pricing
Input price: $0.5 / 1M tokens
Output price: $3 / 1M tokens
Blended price (3:1): $1.125 / 1M tokens
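
The blended figure matches a weighted average of the input and output prices with three input tokens per output token: (3 × 0.5 + 1 × 3) / 4 = 1.125. A minimal sketch of that arithmetic follows. The function names `blended_price` and `request_cost` are hypothetical, and the reading of "3:1" as an input-to-output ratio is an assumption that happens to reproduce the listed number.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: float = 3.0, output_parts: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    Assumes "3:1" means three input tokens for every output token.
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float = 0.5, output_price: float = 3.0) -> float:
    """Cost in USD of one request, with prices quoted per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


print(blended_price(0.5, 3.0))  # 1.125
```

Under the same prices, a request with 300,000 input tokens and 100,000 output tokens would cost about $0.45. Because the model reasons before answering, any reasoning tokens billed as output would add to that figure.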
Speed
Tokens per second: 52.7 tokens/s
Time to first token: 1.69s
Time to first answer: 107.01s
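
These three numbers support a rough latency model. The sketch below assumes a constant throughput after the first token and assumes the gap between first token and first answer is spent generating reasoning tokens. Neither assumption is stated on the page, and the function names are hypothetical.

```python
# Values copied from the speed figures above.
TOKENS_PER_SECOND = 52.7
TIME_TO_FIRST_TOKEN_S = 1.69
TIME_TO_FIRST_ANSWER_S = 107.01


def estimated_generation_time(output_tokens: int) -> float:
    """Approximate seconds to stream `output_tokens`, assuming constant throughput."""
    return TIME_TO_FIRST_TOKEN_S + output_tokens / TOKENS_PER_SECOND


def implied_reasoning_tokens() -> float:
    """Tokens that could be generated between the first token and the first answer.

    Assumption: that whole interval is spent emitting reasoning tokens
    at the listed throughput.
    """
    return (TIME_TO_FIRST_ANSWER_S - TIME_TO_FIRST_TOKEN_S) * TOKENS_PER_SECOND
```

Under these assumptions, the roughly 105-second gap corresponds to about 5,550 reasoning tokens before the visible answer starts.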
Available providers
(LS internal units)

| Provider | Input price | Output price |
|---|---|---|
| Together | 500K | 3.0M |
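
The provider table is in LS internal units, which the page does not define. Comparing the Together row (500K and 3.0M) with the listed prices ($0.5 and $3 per 1M tokens) suggests that one unit is one millionth of a dollar per 1M tokens. The sketch below converts under that inferred assumption. The helper names are hypothetical, and the unit definition should be checked against LLM Stats before relying on it.

```python
SUFFIXES = {"K": 1_000, "M": 1_000_000}


def parse_ls_units(text: str) -> float:
    """Parse a value such as '500K' or '3.0M' into a plain number."""
    text = text.strip()
    if not text:
        raise ValueError("empty value")
    suffix = text[-1].upper()
    if suffix in SUFFIXES:
        return float(text[:-1]) * SUFFIXES[suffix]
    return float(text)


def ls_units_to_usd_per_million(text: str) -> float:
    """Convert LS internal units to USD per 1M tokens.

    Assumption inferred from the pricing figures: 1,000,000 units equal $1.
    """
    return parse_ls_units(text) / 1_000_000


print(ls_units_to_usd_per_million("500K"))  # 0.5
print(ls_units_to_usd_per_million("3.0M"))  # 3.0
```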