Qwen3 8B (Reasoning)
Alibaba / Qwen
Release date: 2025-04-28
Parameters: not listed
Context length: 262K
Modalities: audio, image, text, video
Capability radar
general: 27
coding: 37
reasoning: 48
science (estimated): 35
agents: 60
multimodal: 80
The science score uses a reasoning-based proxy when dedicated science benchmarks are unavailable.
Rankings
| Domain | Rank | Score | Source |
|---|---|---|---|
| Coding | 406 | 16.0 | AA |
| Overall | 351 | 32.0 | AA |
| Mathematical reasoning | 203 | 45.0 | AA |
| Science | 341 | 34.0 | AA |
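Benchmark scores (LLM Stats)
| Category | Benchmark | Score | Source |
|---|---|---|---|
| 3d | SUNRGBD | 0.33 / 100 | Self-reported |
| 3d | Hypersim | 0.13 / 100 | Self-reported |
| Agents | t2-bench | 81.2% | Self-reported |
| Agents | AndroidWorld_SR | 71.1% | Self-reported |
| Agents | BFCL-V4 | 67.3% | Self-reported |
| Agents | BrowseComp | 61.0% | Self-reported |
| Agents | FullStackBench en | 58.1% | Self-reported |
| Agents | WideSearch | 57.1% | Self-reported |
| Agents | TIR-Bench | 55.5% | Self-reported |
| Agents | FullStackBench zh | 55.0% | Self-reported |
| Agents | OSWorld-Verified | 54.5% | Self-reported |
| Agents | Terminal-Bench 2.0 | 40.5% | Self-reported |
| Agents | VITA-Bench | 31.9% | Self-reported |
| Agents | DeepPlanning | 22.8% | Self-reported |
| Biology | GPQA | 84.2% | Self-reported |
| Chemistry | SuperGPQA | 63.4% | Self-reported |
| Code | SWE-Bench Verified | 69.2% | Self-reported |
| Communication | Multi-Challenge | 60.0% | Self-reported |
| Embodied | EmbSpatialBench | 0.83 / 100 | Self-reported |
| Finance | MMLU-Pro | 85.3% | Self-reported |
| Finance | MMLU-ProX | 81.0% | Self-reported |
| General | MMLU-Redux | 93.3% | Self-reported |
| General | IFEval | 91.9% | Self-reported |
| General | C-Eval | 90.2% | Self-reported |
| General | MAXIFE | 86.6% | Self-reported |
| General | Global PIQA | 86.6% | Self-reported |
| General | MMMLU | 85.2% | Self-reported |
| General | MMStar | 81.9% | Self-reported |
| General | MMMU | 81.4% | Self-reported |
| General | Include | 79.7% | Self-reported |
| General | MMMU-Pro | 75.1% | Self-reported |
| General | LiveCodeBench v6 | 74.6% | Self-reported |
| General | IFBench | 70.2% | Self-reported |
| General | LongBench v2 | 59.0% | Self-reported |
| General | SimpleVQA | 0.58 / 100 | Self-reported |
| General | NOVA-63 | 57.1% | Self-reported |
| Grounding | RefCOCO-avg | 0.89 / 100 | Self-reported |
| Grounding | ScreenSpot Pro | 68.6% | Self-reported |
| Grounding | RefSpatialBench | 0.64 / 100 | Self-reported |
| Healthcare | VideoMMMU | 80.4% | Self-reported |
| Healthcare | SlakeVQA | 78.7% | Self-reported |
| Healthcare | PMC-VQA | 62.0% | Self-reported |
| Healthcare | MedXpertQA | 61.4% | Self-reported |
| Image To Text | OCRBench | 91.0% | Self-reported |
| Language | LingoQA | 79.2% | Self-reported |
| Language | WMT24++ | 76.3% | Self-reported |
| Long Context | MLVU | 85.6% | Self-reported |
| Long Context | LVBench | 71.4% | Self-reported |
| Long Context | MMLongBench-Doc | 0.59 / 100 | Self-reported |
| Long Context | AA-LCR | 58.5% | Self-reported |
| Math | HMMT25 | 89.2% | Self-reported |
| Math | HMMT 2025 | 89.0% | Self-reported |
| Math | MathVista-Mini | 86.2% | Self-reported |
| Math | DynaMath | 85.0% | Self-reported |
| Math | MathVision | 83.9% | Self-reported |
| Math | CodeForces | 0.82 / 3000 | Self-reported |
| Math | PolyMATH | 64.4% | Self-reported |
| Math | Humanity's Last Exam | 47.4% | Self-reported |
| Multimodal | VLMsAreBlind | 97.0% | Self-reported |
| Multimodal | V* | 92.7% | Self-reported |
| Multimodal | AI2D | 92.6% | Self-reported |
| Multimodal | MMBench-V1.1 | 91.5% | Self-reported |
| Multimodal | OmniDocBench 1.5 | 89.3% | Self-reported |
| Multimodal | VideoMME w sub. | 86.6% | Self-reported |
| Multimodal | VideoMME w/o sub. | 82.5% | Self-reported |
| Multimodal | CC-OCR | 80.7% | Self-reported |
| Multimodal | CharXiv-R | 77.5% | Self-reported |
| Multimodal | MVBench | 74.8% | Self-reported |
| Multimodal | MMVU | 72.3% | Self-reported |
| Multimodal | BabyVision | 38.4% | Self-reported |
| Multimodal | ZEROBench-Sub | 0.34 / 100 | Self-reported |
| Multimodal | Nuscene | 14.6% | Self-reported |
| Multimodal | ZEROBench | 0.08 / 100 | Self-reported |
| Reasoning | CountBench | 0.98 / 100 | Self-reported |
| Reasoning | BrowseComp-zh | 69.5% | Self-reported |
| Reasoning | Hallusion Bench | 67.9% | Self-reported |
| Reasoning | ERQA | 64.8% | Self-reported |
| Reasoning | Seal-0 | 41.4% | Self-reported |
| Reasoning | OJBench | 36.0% | Self-reported |
| Spatial Reasoning | RealWorldQA | 84.1% | Self-reported |
| Vision | ODinW | 42.6% | Self-reported |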
AA evaluation indices
Math Index: 19.0
Intelligence Index: 7.4
Math 500: 0.9
Aime: 0.7
Mmlu Pro: 0.7
Gpqa: 0.6
Livecodebench: 0.4
Ifbench: 0.3
Tau2: 0.3
Scicode: 0.2
Aime 25: 0.2
Hle: 0.0
Terminalbench Hard: 0.0
Lcr: 0.0
LLM Stats category scores
Math: 80
Physics: 80
Structured Output: 80
Image To Text: 80
Instruction Following: 80
Language: 80
Legal: 80
Embodied: 80
Finance: 80
General: 80
Biology: 80
Text-to-image: 80
Video: 80
Multimodal: 70
Reasoning: 70
Spatial Reasoning: 70
Long Context: 70
Frontend Development: 70
Grounding: 70
Healthcare: 70
Chemistry: 70
Vision: 70
Search: 60
Code: 60
Communication: 60
Economics: 60
Tool Calling: 60
Agents: 50
3d: 20
Spatial: 10
Pricing
Input price: $0.11 / 1M tokens
Output price: $1.15 / 1M tokens
Blended price (3:1): $0.37 / 1M tokens
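The blended figure above can be reproduced from the listed input and output prices. The sketch below assumes that "3:1" means three input tokens for every output token, which is consistent with the published numbers; the function names `blended_price` and `request_cost` are illustrative and not part of any provider API.

```python
# Prices per 1M tokens, taken from the pricing figures above.
INPUT_PRICE = 0.11
OUTPUT_PRICE = 1.15


def blended_price(input_price: float, output_price: float,
                  input_parts: float = 3.0, output_parts: float = 1.0) -> float:
    """Weighted average price per 1M tokens for an input:output token ratio.

    Assumption: the ratio is input:output (3:1 by default).
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in dollars of one request, given per-1M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    print(round(blended_price(INPUT_PRICE, OUTPUT_PRICE), 2))  # 0.37
    # Hypothetical request: 3,000 prompt tokens and 1,000 generated tokens.
    print(request_cost(3_000, 1_000, INPUT_PRICE, OUTPUT_PRICE))
```

For a reasoning model, generated tokens usually include the thinking tokens, so the real output share may be well above one in four and the effective price closer to the output price.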
Speed
Tokens/sec: 61.4
First-token latency: 1.35s
Time to first answer: 33.91s
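The gap between the first-token latency and the time to first answer is large because a reasoning model emits thinking tokens before its answer. As a rough sketch, and assuming that the whole gap is spent generating reasoning tokens at the listed steady throughput (an assumption, not something the page states), the three speed figures can be related as follows. The function names are illustrative.

```python
# Speed figures from the listing above.
TOKENS_PER_SEC = 61.4
FIRST_TOKEN_LATENCY_S = 1.35
TIME_TO_FIRST_ANSWER_S = 33.91


def implied_reasoning_tokens(first_token_latency_s: float,
                             time_to_first_answer_s: float,
                             tokens_per_sec: float) -> float:
    """Tokens generated between the first token and the first answer token.

    Assumption: generation runs at a constant tokens_per_sec during that gap.
    """
    return (time_to_first_answer_s - first_token_latency_s) * tokens_per_sec


def estimated_total_time(answer_tokens: int,
                         time_to_first_answer_s: float,
                         tokens_per_sec: float) -> float:
    """Estimated seconds until an answer of answer_tokens tokens is complete."""
    return time_to_first_answer_s + answer_tokens / tokens_per_sec


if __name__ == "__main__":
    tokens = implied_reasoning_tokens(
        FIRST_TOKEN_LATENCY_S, TIME_TO_FIRST_ANSWER_S, TOKENS_PER_SEC)
    print(f"implied reasoning tokens: {tokens:.0f}")
    total = estimated_total_time(500, TIME_TO_FIRST_ANSWER_S, TOKENS_PER_SEC)
    print(f"estimated time for a 500-token answer: {total:.1f}s")
```

Under these assumptions the gap corresponds to roughly two thousand reasoning tokens; actual workloads will vary with prompt difficulty and provider load.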
Provider price ranking
1 provider
| # | Provider | Input | Output |
|---|---|---|---|
| 1 | Alibaba (primary) | $0.11 | $1.15 |

Comparison of prices across API providers for this model.