DeepSeek R1 Zero
DeepSeek · Open Weight · MIT · Commercial OK
Description
DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning performance. Through RL, numerous powerful and interesting reasoning behaviors emerged naturally in DeepSeek-R1-Zero. However, it encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
Release date
2025-01-20
Parameter count
671.0B
Context length
—
Supported modalities
—
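Capability radar chart
general: 60
coding: 50
reasoning: 90
science (estimated): 60
agents: 0
multimodal: 0
When no dedicated science evaluation is available, the Science score is estimated using reasoning ability as a proxy.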
Leaderboard ranking
No ranking data available
Benchmark scores (LLM Stats)
Biology
GPQA: 73.3% (self-reported)
Code
LiveCodeBench: 50.0% (self-reported)
Math
MATH-500: 95.9% (self-reported)
AIME 2024: 86.7% (self-reported)
AA evaluation index
No AA evaluation data available
LLM Stats category scores
Math: 90
Reasoning: 80
Biology: 70
Chemistry: 70
Physics: 70
General: 60
Code: 50
Pricing
No pricing data available
Speed
No speed data available
Available providers
(LS internal pricing unit) No provider data available