Qwen3 235B A22B 2507 (Reasoning)
Alibaba · Qwen · Open Weight · Apache 2.0 · Commercial OK
Description
Qwen3-235B-A22B-Thinking-2507 is a state-of-the-art thinking-enabled Mixture-of-Experts (MoE) model with 235B total parameters (22B activated). It features 94 layers, 128 experts (8 activated), and supports 262K native context length. This version delivers significantly improved reasoning performance, achieving state-of-the-art results among open-source thinking models on logical reasoning, mathematics, science, coding, and academic benchmarks. Key enhancements include markedly better general capabilities (instruction following, tool usage, text generation), enhanced 256K long-context understanding, and increased thinking depth. The model supports only thinking mode with automatic <think> tag inclusion.
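Because the model runs only in thinking mode, downstream code usually needs to separate the reasoning trace from the final answer. The following is a minimal sketch, not the vendor's parser. It assumes the raw output contains a closing `</think>` tag and that the opening `<think>` tag may be absent (for example, if it is injected by the prompt template rather than generated). The function name `split_thinking` is hypothetical.

```python
from typing import Tuple


def split_thinking(text: str,
                   open_tag: str = "<think>",
                   close_tag: str = "</think>") -> Tuple[str, str]:
    """Split raw model output into (reasoning, answer).

    Assumption: the reasoning ends at the first closing tag. The opening
    tag is optional, since it may have been added on the prompt side.
    """
    head, sep, tail = text.partition(close_tag)
    if not sep:
        # No closing tag found: treat the whole output as the answer.
        return "", text.strip()
    reasoning = head.strip()
    if reasoning.startswith(open_tag):
        reasoning = reasoning[len(open_tag):].strip()
    return reasoning, tail.strip()


if __name__ == "__main__":
    raw = "<think>2 + 2 is 4</think>\n\nThe answer is 4."
    reasoning, answer = split_thinking(raw)
    print("reasoning:", reasoning)
    print("answer:", answer)
```

Splitting on the first closing tag keeps the sketch simple. If the final answer itself may legitimately contain the literal tag text, a stricter parser would be needed.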
Release date: 2025-07-25
Parameters: 235.0B
Context length: 262K
Supported modalities: text
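Capability radar chart

| Capability | Score |
|---|---|
| general | 44 |
| coding | 45 |
| reasoning | 92 |
| science (estimated) | 53 |
| agents | 60 |
| multimodal | 0 |

Science is estimated using reasoning ability as a proxy when no dedicated science evaluation is available.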
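Leaderboard ranking
Benchmark scores (LLM Stats)

| Category | Benchmark | Score (self-reported) |
|---|---|---|
| Agents | BFCL-v3 | 71.9% |
| Biology | GPQA | 81.1% |
| Chemistry | SuperGPQA | 64.9% |
| Code | CFEval | 2134.00 / 10000 |
| Communication | WritingBench | 88.3% |
| Communication | Multi-IF | 80.6% |
| Communication | Tau2 Retail | 71.9% |
| Communication | TAU-bench Retail | 67.8% |
| Communication | Tau2 Airline | 58.0% |
| Communication | TAU-bench Airline | 46.0% |
| Communication | Tau2 Telecom | 45.6% |
| Creativity | Creative Writing v3 | 86.1% |
| Creativity | Arena-Hard v2 | 79.7% |
| Finance | MMLU-Pro | 84.4% |
| Finance | MMLU-ProX | 81.0% |
| General | MMLU-Redux | 93.8% |
| General | IFEval | 87.8% |
| General | Include | 81.0% |
| General | LiveBench 20241125 | 78.4% |
| General | LiveCodeBench v6 | 74.1% |
| Math | AIME 2025 | 92.3% |
| Math | HMMT25 | 83.9% |
| Math | PolyMATH | 60.1% |
| Math | Humanity's Last Exam | 18.2% |
| Reasoning | OJBench | 32.5% |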
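AA evaluation indices

| Metric | Value |
|---|---|
| Math Index | 91.0 |
| Intelligence Index | 29.5 |
| Coding Index | 23.2 |
| Math 500 | 1.0 |
| Aime | 0.9 |
| Aime 25 | 0.9 |
| Mmlu Pro | 0.8 |
| Gpqa | 0.8 |
| Livecodebench | 0.8 |
| Lcr | 0.7 |
| Tau2 | 0.5 |
| Ifbench | 0.5 |
| Scicode | 0.4 |
| Hle | 0.1 |
| Terminalbench Hard | 0.1 |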
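LLM Stats category scores

| Category | Score |
|---|---|
| Structured Output | 80 |
| Writing | 80 |
| Biology | 80 |
| Creativity | 80 |
| Finance | 80 |
| General | 80 |
| Healthcare | 80 |
| Instruction Following | 80 |
| Language | 80 |
| Legal | 80 |
| Agents | 70 |
| Chemistry | 70 |
| Communication | 70 |
| Math | 70 |
| Physics | 70 |
| Reasoning | 70 |
| Spatial Reasoning | 60 |
| Tool Calling | 60 |
| Economics | 60 |
| Multimodal | 60 |
| Vision | 40 |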
Pricing
Input price: $0.4 / 1M tokens
Output price: $2.15 / 1M tokens
Blended price (3:1): $0.838 / 1M tokens
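The blended figure is consistent with a weighted average of the input and output prices at a 3:1 input-to-output token ratio: (3 × 0.40 + 1 × 2.15) / 4 = 0.8375, which rounds to the listed $0.838. The sketch below reproduces that arithmetic and estimates the cost of a single request. The function names are hypothetical, and the weighting formula is an assumption inferred from the "(3:1)" label rather than a documented definition.

```python
INPUT_PRICE_PER_M = 0.40   # USD per 1M input tokens, from the listing
OUTPUT_PRICE_PER_M = 2.15  # USD per 1M output tokens, from the listing


def blended_price(input_price: float, output_price: float,
                  input_weight: int = 3, output_weight: int = 1) -> float:
    """Weighted average price per 1M tokens (assumed 3:1 input:output)."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price
            + output_weight * output_price) / total_weight


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed per-token prices."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


if __name__ == "__main__":
    print(f"blended: ${blended_price(INPUT_PRICE_PER_M, OUTPUT_PRICE_PER_M):.4f} / 1M tokens")
    # Thinking tokens are generated by the model, so this sketch counts
    # them as output tokens (an assumption about how providers bill).
    print(f"request: ${request_cost(3_000, 1_000):.6f}")
```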
Speed
Throughput: 58.5 tokens/s
Time to first token: 1.22s
Time to first answer: 35.39s
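The gap between the first token and the first answer presumably reflects time spent generating the reasoning trace. As a rough sketch, assuming the time to first answer is measured up to the first non-thinking token and that throughput stays constant at the listed rate (the page defines neither), the figures imply roughly 2,000 thinking tokens before the answer begins. The function names are hypothetical.

```python
TOKENS_PER_SECOND = 58.5        # listed throughput
TIME_TO_FIRST_TOKEN_S = 1.22    # listed time to first token
TIME_TO_FIRST_ANSWER_S = 35.39  # listed time to first answer


def estimated_thinking_tokens(tokens_per_second: float,
                              time_to_first_token: float,
                              time_to_first_answer: float) -> float:
    """Tokens generated between the first token and the first answer token.

    Assumes constant throughput over that interval.
    """
    return (time_to_first_answer - time_to_first_token) * tokens_per_second


def estimated_total_latency(answer_tokens: int,
                            tokens_per_second: float,
                            time_to_first_answer: float) -> float:
    """Seconds until an answer of the given length has fully streamed."""
    return time_to_first_answer + answer_tokens / tokens_per_second


if __name__ == "__main__":
    thinking = estimated_thinking_tokens(
        TOKENS_PER_SECOND, TIME_TO_FIRST_TOKEN_S, TIME_TO_FIRST_ANSWER_S)
    print(f"estimated thinking tokens: {thinking:.0f}")
    total = estimated_total_latency(
        500, TOKENS_PER_SECOND, TIME_TO_FIRST_ANSWER_S)
    print(f"estimated latency for a 500-token answer: {total:.1f}s")
```

Real latency varies with prompt length, provider load, and how long the model reasons on a given task, so these numbers are only an order-of-magnitude guide.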
Available providers
(LS internal pricing units)

| Provider | Input price | Output price |
|---|---|---|
| Fireworks | 300K | 3.0M |
| Novita | 300K | 3.0M |