DeepSeek-V2.5 (Dec '24)
DeepSeek · Open Weight
Description
DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct, integrating general and coding abilities. It better aligns with human preferences and has been optimized in various aspects, including writing and instruction following.
Release Date
2024-12-10
Parameters
236.0B
Context Length
164K
Modalities
text
Capability Radar
| Domain | Score |
|---|---|
| General | 13 |
| Coding | 60 |
| Reasoning | 76 |
| Science (est.) | 68 |
| Agents | 0 |
| Multimodal | 0 |
Science uses a reasoning proxy when dedicated science benchmarks are unavailable.
Rankings
| Domain | #Rank | Score | Source |
|---|---|---|---|
| General Ranking | 471 | 14.0 | AA |
| Math Reasoning | 104 | 75.0 | AA |
| Reasoning | 49 | 69.0 | LS |
Benchmark Scores (LLM Stats)
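| Category | Benchmark | Score |
|---|---|---|
| Code | HumanEval | 89.0% SR |
| Code | Aider | 72.2% SR |
| Code | SWE-Bench Verified | 16.8% SR |
| Communication | MT-Bench | 0.90 / 100 SR |
| Creativity | AlignBench | 80.4% SR |
| Creativity | Arena Hard | 76.2% SR |
| Creativity | AlpacaEval 2.0 | 50.5% SR |
| Finance | MMLU | 80.4% SR |
| General | DS-FIM-Eval | 78.3% SR |
| General | LiveCodeBench(01-09) | 41.8% SR |
| Language | BBH | 84.3% SR |
| Math | GSM8k | 95.1% SR |
| Math | MATH | 74.7% SR |
| Reasoning | HumanEval-Mul | 73.8% SR |
| Reasoning | DS-Arena-Code | 63.1% SR |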
AA Evaluation Indices
Intelligence Index: 12.5
Math 500: 0.8
LLM Stats Category Scores
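| Category | Score |
|---|---|
| Communication | 90 |
| Roleplay | 90 |
| Finance | 80 |
| General | 80 |
| Healthcare | 80 |
| Language | 80 |
| Legal | 80 |
| Math | 80 |
| Writing | 70 |
| Creativity | 70 |
| Reasoning | 70 |
| Code | 60 |
| Frontend Development | 20 |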
Pricing
Input Price: Free
Output Price: Free
Blended Price (3:1): Free
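The page does not define how the "Blended Price (3:1)" figure is computed. A common reading is a weighted average that assumes three input tokens for every output token; the sketch below follows that assumption. The function name `blended_price` and its parameters are hypothetical and are not taken from this page or any provider API.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of input and output prices.

    Assumption: "3:1" means three parts input tokens to one part output
    tokens. Prices are in the same unit (for example USD per 1M tokens).
    """
    total_weight = input_weight + output_weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return (input_weight * input_price + output_weight * output_price) / total_weight


# A model listed as free on both sides blends to zero.
free_blend = blended_price(0.0, 0.0)

# Hypothetical prices, only to illustrate the arithmetic: (3 * 1.0 + 5.0) / 4
example_blend = blended_price(1.0, 5.0)
```

Under this reading, a model with free input and free output also has a blended price of zero, which is consistent with the "Free" entry above.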
Speed
Tokens/sec: 0.0 tokens/s
Time to First Token: 0.00s
Time to Answer: 0.00s
Available Providers
(LS internal units) No provider data available