DeepSeek V3 0324

DeepSeek | Open Weight | MIT + Model License (Commercial use allowed)

Description

A Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated per token. It features Multi-head Latent Attention (MLA), auxiliary-loss-free load balancing, and multi-token prediction training. It was pre-trained on 14.8T tokens and performs strongly on reasoning, math, and code tasks.
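To make the "671B total, 37B activated" figures concrete, the short sketch below computes the share of parameters used per token. The function name `active_fraction` and the constants are illustrative only; the two numbers are the ones quoted in the description.

```python
# Figures quoted in the description above, in billions of parameters.
TOTAL_PARAMS_B = 671.0
ACTIVE_PARAMS_B = 37.0


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters activated per token, as a value in 0..1."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    # 37 / 671 is roughly 0.055, so about 5.5% of the weights are used per token.
    print(f"Active per token: {share:.1%}")
```

This only describes parameter count per token; it says nothing about memory footprint, since all experts still need to be stored to serve the model.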

Release date

2025-03-25

Parameters

671.0B

Context length

164K

Modalities

text

Capability radar

Axes: general, coding, reasoning, science (est.), agents, multimodal

The science axis uses a reasoning proxy when dedicated science benchmarks are not available.

Rankings

Domain	Rank	Score	Source
Code Ranking	217	39.0	AA
General Ranking	209	49.0	AA
Math Reasoning	164	54.0	AA
Science	232	45.0	AA

Benchmark scores (LLM Stats)

Category	Benchmark	Score
Biology	GPQA	68.4% (Aut.)
Code	LiveCodeBench	49.2% (Aut.)
Finance	MMLU-Pro	81.2% (Aut.)
Math	MATH-500	94.0% (Aut.)
Math	AIME 2024	59.4% (Aut.)

AA evaluation indices

Metric	Score
Math Index	41.0
Intelligence Index	22.3
Coding Index	22.0
MATH-500	0.9
MMLU-Pro	0.8
GPQA	0.7
AIME	0.5
Tau2	0.5
AIME 25	0.4
IFBench	0.4
LCR	0.4
LiveCodeBench	0.4
SciCode	0.4
TerminalBench Hard	0.2
HLE	0.1

LLM Stats category scores

Categories: Finance, Healthcare, Language, Legal, Math, Biology, Chemistry, General, Physics, Reasoning, Code

Pricing

Input price: $1.195 / 1M tokens

Output price: $1.25 / 1M tokens

Blended price (3:1): $1.209 / 1M tokens
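The 3:1 blended figure above is consistent with a weighted average of three parts input price to one part output price. The sketch below reproduces it under that assumption and also estimates the cost of a single request. The function names `blended_price` and `request_cost` are hypothetical helpers, not part of any provider API.

```python
INPUT_PRICE = 1.195   # USD per 1M input tokens, as listed above
OUTPUT_PRICE = 1.25   # USD per 1M output tokens, as listed above


def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens (assumed 3:1 input:output mix)."""
    total_weight = input_weight + output_weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return (input_weight * input_price + output_weight * output_price) / total_weight


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in USD of one request, given per-1M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # (3 * 1.195 + 1.25) / 4 is about 1.20875, shown on the page as $1.209.
    print(f"Blended: ${blended_price(INPUT_PRICE, OUTPUT_PRICE):.5f} / 1M tokens")
    # Example request: 3,000 input tokens and 1,000 output tokens.
    print(f"Request: ${request_cost(3000, 1000, INPUT_PRICE, OUTPUT_PRICE):.6f}")
```

Workloads with a different input-to-output ratio can pass other weights; the 3:1 default is only the mix the page reports.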

Speed

Tokens/sec: 0.0 tokens/s

Time to first token: 0.00s

Response time: 0.00s

Available providers

(LS internal units)

Provider	Input price	Output price
Novita	280K	1.1M

External sources

LLM Stats