DeepSeek V3.1 (Reasoning)
Description
DeepSeek-V3.1 is a hybrid model that supports both thinking and non-thinking modes, selected through different chat templates. It is built on DeepSeek-V3.1-Base with a two-phase long-context extension (32K phase: 630B tokens; 128K phase: 209B tokens) and has 671B total parameters, of which 37B are activated. Key improvements include smarter tool calling through post-training optimization, higher thinking efficiency (quality comparable to DeepSeek-R1-0528 with faster responses), and the UE8M0 FP8 scale data format for model weights and activations. The model excels in both reasoning tasks (thinking mode) and practical applications (non-thinking mode), with particularly strong performance in code-agent tasks, math competitions, and search-based problem solving.
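The parameter and token figures in the description can be put in proportion with a little arithmetic. The sketch below is only an illustration: the constant names and the `summarize` helper are hypothetical, and the only inputs are the four numbers quoted above (671B total parameters, 37B activated, 630B and 209B extension tokens).

```python
# Figures quoted in the description, all in billions.
# The names below are illustrative, not part of any DeepSeek API.
TOTAL_PARAMS_B = 671        # total parameters
ACTIVE_PARAMS_B = 37        # activated parameters
EXT_32K_TOKENS_B = 630      # tokens in the 32K extension phase
EXT_128K_TOKENS_B = 209     # tokens in the 128K extension phase


def summarize() -> dict:
    """Derive simple ratios from the quoted figures."""
    total_ext = EXT_32K_TOKENS_B + EXT_128K_TOKENS_B
    return {
        # Share of the parameters that are activated.
        "active_fraction": ACTIVE_PARAMS_B / TOTAL_PARAMS_B,
        # Combined size of both long-context extension phases.
        "extension_tokens_b": total_ext,
        # Share of extension tokens spent in the 32K phase.
        "share_32k": EXT_32K_TOKENS_B / total_ext,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.4g}")
```

Under these figures, about 5.5% of the parameters are activated, and the two extension phases together cover 839B tokens, roughly three quarters of them in the 32K phase.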
Capability radar
Science uses a reasoning proxy when dedicated science benchmarks are unavailable.
Rankings
| Domain | Rank | Score | Source |
|---|---|---|---|
| Agentic capability | 116 | 31.0 | LS |
| Coding ranking | 103 | 65.0 | AA |
| Overall ranking | 210 | 48.0 | AA |
| Mathematical reasoning | 35 | 91.0 | AA |
| Reasoning | 93 | 49.0 | LS |
| Science | 137 | 56.0 | AA |
Benchmark scores (LLM Stats)
Categories: Agents, Biology, Code, Factuality, Finance, General, Math, Reasoning
AA evaluation indices
LLM Stats category scores
Pricing
Speed
Provider pricing ranking
3 providers
Compare prices across different API providers for this model.