DeepSeek R1 Zero

DeepSeek · Open weights · MIT · Commercial use

Description

DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. Through RL alone, numerous powerful and interesting reasoning behaviors emerged naturally in DeepSeek-R1-Zero. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks.

Release date

2025-01-20

Parameters

671.0B

Context length

—

Modalities

—

Capability radar

Radar axes: general, coding, reasoning, science (estimated), agents, multimodal

When no dedicated science benchmark is available, the Science score is estimated using reasoning as a proxy.

Rankings

No ranking data available

Benchmark scores (LLM Stats)

Biology

GPQA: 73.3% (self-reported)

Code

LiveCodeBench: 50.0% (self-reported)

Math

MATH-500: 95.9% (self-reported)

AIME 2024: 86.7% (self-reported)

AA evaluation index

No AA evaluation data available

LLM Stats category scores

Categories: Math, Reasoning, Physics, Biology, Chemistry, General, Code

Pricing

No pricing data available

Speed

No speed data available

Provider price ranking

No provider data available

External links

LLM Stats · Artificial Analysis