DeepSeek V3.1 (Non-reasoning)
Description
DeepSeek-V3.1 is a hybrid model supporting both thinking and non-thinking modes through different chat templates. Built on DeepSeek-V3.1-Base with a two-phase long context extension (32K phase: 630B tokens, 128K phase: 209B tokens), it features 671B total parameters with 37B activated. Key improvements include smarter tool calling through post-training optimization, higher thinking efficiency achieving comparable quality to DeepSeek-R1-0528 while responding more quickly, and UE8M0 FP8 scale data format for model weights and activations. The model excels in both reasoning tasks (thinking mode) and practical applications (non-thinking mode), with particularly strong performance in code agent tasks, math competitions, and search-based problem solving.
Capability Radar
Science uses a reasoning proxy when dedicated science benchmarks are unavailable.
Rankings
| Domain | #Rank | Score | Source |
|---|---|---|---|
| Code Ranking | 164 | 54.0 | AA |
| General Ranking | 224 | 46.0 | AA |
| Math Reasoning | 183 | 50.0 | AA |
| Science | 197 | 49.0 | AA |
Benchmark Scores (LLM Stats)
Agents
Biology
Code
Factuality
Finance
General
Math
Reasoning
AA Evaluation Indices
LLM Stats Category Scores
Pricing
Speed
Provider Price Ranking
Provider Price Ranking
17 providers
Compare pricing across different API providers for this model.