DeepSeek V3 0324

DeepSeek | Open Weight | MIT + Model License (Commercial use allowed)

Description

A Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated per token. It uses Multi-head Latent Attention (MLA), auxiliary-loss-free load balancing, and multi-token prediction training. It was pre-trained on 14.8T tokens and performs strongly on reasoning, math, and code tasks.
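As a rough illustration of what "37B activated per token" means relative to the 671B total, the short sketch below computes the share of parameters used per token. The two figures come from the description; the variable and function names are made up for this example.

```python
# Figures taken from the description above, in billions of parameters.
TOTAL_PARAMS_B = 671.0
ACTIVE_PARAMS_B = 37.0


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's parameters that are activated for a single token."""
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"{fraction:.1%} of parameters are active per token")
```

In other words, only about one parameter in eighteen takes part in processing any given token, which is the usual motivation for a sparse MoE design.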

Release date

2025-03-25

Parameters

671.0B

Context length

164K

Modalities

text

Capability radar

Axes: general, coding, reasoning, science (est.), agents, multimodal

Science uses a reasoning proxy when dedicated science benchmarks are not available.

Rankings

Domain	Rank	Score	Source
Code Ranking	217	39.0	AA
General Ranking	209	49.0	AA
Math Reasoning	164	54.0	AA
Science	232	45.0	AA

Benchmark scores (LLM Stats)

Category	Benchmark	Score
Biology	GPQA	68.4% Aut.
Code	LiveCodeBench	49.2% Aut.
Finance	MMLU-Pro	81.2% Aut.
Math	MATH-500	94.0% Aut.
Math	AIME 2024	59.4% Aut.

AA evaluation indices

Metric	Value
Math Index	41.0
Intelligence Index	22.3
Coding Index	22.0
MATH-500	0.9
MMLU-Pro	0.8
GPQA	0.7
AIME	0.5
Tau2	0.5
AIME 25	0.4
IFBench	0.4
LCR	0.4
LiveCodeBench	0.4
SciCode	0.4
Terminal-Bench Hard	0.2
HLE	0.1

LLM Stats category scores

Categories charted: Finance, Healthcare, Language, Legal, Math, Biology, Chemistry, General, Physics, Reasoning, Code

Pricing

Input price: $1.195 / 1M tokens

Output price: $1.25 / 1M tokens

Blended price (3:1): $1.209 / 1M tokens
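The blended figure appears consistent with a weighted average of the input and output prices at three input tokens per output token. The sketch below reproduces it under that assumption and also estimates the cost of one request. The function names and the 3,000/1,000-token example request are hypothetical, not part of the listing.

```python
# Prices in USD per 1M tokens, copied from the listing above.
INPUT_PRICE = 1.195
OUTPUT_PRICE = 1.25


def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted-average price per 1M tokens for an input:output token ratio.

    Assumption: "3:1" means three input tokens for every output token.
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a single request at the listed prices."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000


blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
example_cost = request_cost(3_000, 1_000)  # hypothetical request size

print(f"Blended price: ${blended:.5f} / 1M tokens")
print(f"Example request cost: ${example_cost:.6f}")
```

Under this reading, (3 × 1.195 + 1.25) / 4 = 1.20875, which matches the listed $1.209 after rounding to three decimals.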

Speed

Tokens/sec: 0.0 tokens/s

Time to first token: 0.00s

Time to response: 0.00s

Available providers

(LS internal units)

Provider	Input price	Output price
Novita	280K	1.1M

External sources

LLM Stats