DeepSeek V3 0324

DeepSeek · Open Weight · MIT + Model License (Commercial use allowed)

Description

A powerful Mixture-of-Experts (MoE) language model with 671B total parameters (37B activated per token). Features Multi-head Latent Attention (MLA), auxiliary-loss-free load balancing, and multi-token prediction training. Pre-trained on 14.8T tokens with strong performance in reasoning, math, and code tasks.
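As a rough illustration of what "37B activated per token" means for a 671B-parameter MoE model, the share of parameters used per token can be computed directly from the two figures quoted above. This is only a back-of-the-envelope sketch; the constant and function names are hypothetical and not part of any DeepSeek tooling.

```python
# Figures taken from the description above (in billions of parameters).
TOTAL_PARAMS_B = 671.0
ACTIVE_PARAMS_B = 37.0


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the fraction of parameters activated per token."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Activated per token: {frac:.1%} of total parameters")
```

Under these figures, only a small share of the model's weights participates in each forward pass, which is the main reason an MoE model can have a per-token compute cost well below that of a dense model of the same total size.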

Release date

2025-03-25

Parameter count

671.0B

Context length

164K

Supported modalities

text

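Capability radar chart

[Radar chart; axes: general, coding, reasoning, science (estimated), agents, multimodal]

The Science score is estimated from reasoning ability as a proxy when no dedicated science evaluation is available.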

Leaderboard rankings

Domain	Rank	Score	Source
Coding	217	39.0	AA
General	209	49.0	AA
Math reasoning	164	54.0	AA
Science	232	45.0	AA

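Benchmark scores (LLM Stats)

Category	Benchmark	Score	Source
Biology	GPQA	68.4%	self-reported
Code	LiveCodeBench	49.2%	self-reported
Finance	MMLU-Pro	81.2%	self-reported
Math	MATH-500	94.0%	self-reported
Math	AIME 2024	59.4%	self-reported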

AA evaluation indices

Metric	Value
Math Index	41.0
Intelligence Index	22.3
Coding Index	22.0
MATH-500	0.9
MMLU-Pro	0.8
GPQA	0.7
AIME	0.5
Tau2	0.5
AIME 25	0.4
IFBench	0.4
LCR	0.4
LiveCodeBench	0.4
SciCode	0.4
TerminalBench Hard	0.2
HLE	0.1

LLM Stats category scores

Categories: Finance, Healthcare, Language, Legal, Math, Biology, Chemistry, General, Physics, Reasoning, Code

Pricing

Input price: $1.195 / 1M tokens

Output price: $1.25 / 1M tokens

Blended price (3:1): $1.209 / 1M tokens
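The blended figure appears to be a weighted average of the input and output prices. The sketch below assumes that "3:1" means three parts input tokens to one part output tokens; the page does not state this explicitly, but the assumption reproduces the listed value. The function name and its parameters are hypothetical.

```python
def blended_price(
    input_price: float,
    output_price: float,
    input_weight: float = 3.0,   # assumed: 3 parts input tokens
    output_weight: float = 1.0,  # assumed: 1 part output tokens
) -> float:
    """Weighted average price per 1M tokens for a given input:output mix."""
    total_weight = input_weight + output_weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Prices per 1M tokens, as listed above.
price = blended_price(1.195, 1.25)
print(f"${price:.3f} / 1M tokens")
```

With the listed prices this gives (3 × 1.195 + 1 × 1.25) / 4 = 1.20875, which rounds to the $1.209 shown. A workload with a different input-to-output ratio can be estimated by changing the two weights.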

Speed

Tokens per second: 0.0 tokens/s

Time to first token: 0.00s

Time to first answer: 0.00s

Available providers

(LS internal pricing units)

Provider	Input price	Output price
Novita	280K	1.1M

External links

LLM Stats