Qwen Chat 72B
Alibaba / Qwen
Release date
2023-11-30
Parameters
—
Context length
262K
Modalities
audio, image, text, video
Capability radar
general: 3
coding: 60
reasoning: 80
science (est.): 77
agents: 60
multimodal: 80
Science uses a reasoning-based proxy when specialized science benchmarks are unavailable.
Rankings
| Domain | Rank | Score | Source |
|---|---|---|---|
| Overall ranking | 528 | 4.0 | AA |
Benchmark scores (LLM Stats)
3d
SUNRGBD
0.36 / 100 (self-reported)
Hypersim
0.13 / 100 (self-reported)
Agents
GDPval-AA
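985.00 / 3000 (self-reported)
t2-bench
79.5% (self-reported)
BFCL-V4
72.2% (self-reported)
AndroidWorld_SR
66.4% (self-reported)
BrowseComp
63.8% (self-reported)
FullStackBench en
62.6% (self-reported)
WideSearch
60.5% (self-reported)
FullStackBench zh
58.7% (self-reported)
OSWorld-Verified
58.0% (self-reported)
TIR-Bench
53.2% (self-reported)
Terminal-Bench 2.0
49.4% (self-reported)
VITA-Bench
33.6% (self-reported)
DeepPlanning
24.1% (self-reported)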
Biology
GPQA
86.6% (self-reported)
Chemistry
SuperGPQA
67.1% (self-reported)
Code
SWE-Bench Verified
72.0% (self-reported)
Communication
Multi-Challenge
61.5% (self-reported)
Embodied
EmbSpatialBench
0.84 / 100 (self-reported)
Finance
MMLU-Pro
86.7% (self-reported)
MMLU-ProX
82.2% (self-reported)
General
MMLU-Redux
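94.0% (self-reported)
IFEval
93.4% (self-reported)
C-Eval
91.9% (self-reported)
Global PIQA
88.4% (self-reported)
MAXIFE
87.9% (self-reported)
MMMLU
86.7% (self-reported)
MMMU
83.9% (self-reported)
MMStar
82.9% (self-reported)
Include
82.8% (self-reported)
LiveCodeBench v6
78.9% (self-reported)
MMMU-Pro
76.9% (self-reported)
IFBench
76.1% (self-reported)
SimpleVQA
0.62 / 100 (self-reported)
LongBench v2
60.2% (self-reported)
NOVA-63
58.6% (self-reported)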
Grounding
RefCOCO-avg
0.91 / 100 (self-reported)
ScreenSpot Pro
70.4% (self-reported)
RefSpatialBench
0.69 / 100 (self-reported)
Healthcare
VideoMMMU
82.0% (self-reported)
SlakeVQA
81.6% (self-reported)
MedXpertQA
67.3% (self-reported)
PMC-VQA
63.3% (self-reported)
Image To Text
OCRBench
92.1% (self-reported)
Language
LingoQA
80.8% (self-reported)
WMT24++
78.3% (self-reported)
Long Context
MLVU
87.3% (self-reported)
LVBench
74.4% (self-reported)
AA-LCR
66.9% (self-reported)
MMLongBench-Doc
0.59 / 100 (self-reported)
Math
HMMT 2025
91.4% (self-reported)
HMMT25
90.3% (self-reported)
MathVista-Mini
87.4% (self-reported)
MathVision
86.2% (self-reported)
DynaMath
85.9% (self-reported)
CodeForces
0.85 / 3000 (self-reported)
PolyMATH
68.9% (self-reported)
Humanity's Last Exam
47.5% (self-reported)
Multimodal
VLMsAreBlind
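96.7% (self-reported)
AI2D
93.3% (self-reported)
V*
93.2% (self-reported)
MMBench-V1.1
92.8% (self-reported)
OmniDocBench 1.5
89.8% (self-reported)
VideoMME w sub.
87.3% (self-reported)
VideoMME w/o sub.
83.9% (self-reported)
CC-OCR
81.8% (self-reported)
CharXiv-R
77.2% (self-reported)
MVBench
76.6% (self-reported)
MMVU
74.7% (self-reported)
BabyVision
40.2% (self-reported)
ZEROBench-Sub
0.36 / 100 (self-reported)
Nuscene
15.4% (self-reported)
ZEROBench
0.09 / 100 (self-reported)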
Reasoning
CountBench
0.97 / 100 (self-reported)
BrowseComp-zh
69.9% (self-reported)
Hallusion Bench
67.6% (self-reported)
ERQA
62.0% (self-reported)
Seal-0
44.1% (self-reported)
OJBench
39.5% (self-reported)
Spatial Reasoning
RealWorldQA
85.1% (self-reported)
Vision
ODinW
44.5% (self-reported)
AA evaluation indices
Intelligence Index: 3.4
LLM Stats category scores
Legal: 100
Finance: 100
Agents: 76
General: 46
Reasoning: 19
Biology: 90
Image To Text: 80
Instruction Following: 80
Language: 80
Math: 80
Physics: 80
Structured Output: 80
Embodied: 80
Grounding: 80
Healthcare: 80
Chemistry: 80
Text-to-image: 80
Video: 80
Long Context: 70
Multimodal: 70
Spatial Reasoning: 70
Frontend Development: 70
Economics: 70
Vision: 70
Search: 60
Code: 60
Communication: 60
Tool Calling: 60
Spatial: 20
3d: 20
Pricing
Input price: Free
Output price: Free
Blended price (3:1): Free
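A 3:1 blended price is usually read as a weighted average of three parts input price to one part output price. The page does not state its formula, so the sketch below is an assumption: the function name `blended_price`, the weights' interpretation, and the non-zero example prices are hypothetical and not taken from this page.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of input and output prices.

    Assumes the "3:1" label means 3 parts input to 1 part output;
    the page itself does not define the ratio.
    """
    total = input_weight + output_weight
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return (input_weight * input_price + output_weight * output_price) / total


# Free model, as listed here: both prices are zero, so the blend is zero.
print(blended_price(0.0, 0.0))  # 0.0

# Hypothetical prices (e.g. USD per 1M tokens), for illustration only.
print(blended_price(1.0, 3.0))  # 1.5
```

With both prices listed as free, any weighting gives a blended price of zero, which matches the "Free" entry.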
Speed
Tokens/sec: 0.0
First-token latency: 0.00s
Time to first answer: 0.00s
Provider price ranking
No provider data