DeepSeek V3 0324

DeepSeek | Open Weight | MIT + Model License (Commercial use allowed)

Description

A powerful Mixture-of-Experts (MoE) language model with 671B total parameters (37B activated per token). Features Multi-head Latent Attention (MLA), auxiliary-loss-free load balancing, and multi-token prediction training. Pre-trained on 14.8T tokens with strong performance in reasoning, math, and code tasks.
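As a quick illustration of what "37B activated per token" means relative to the 671B total, the following minimal sketch computes the share of parameters used for a single token. It uses only the two figures quoted in the description; the function and constant names are hypothetical and chosen for this example.

```python
# Figures taken from the description above; names are illustrative only.
TOTAL_PARAMS_B = 671.0   # total parameters, in billions
ACTIVE_PARAMS_B = 37.0   # parameters activated per token, in billions


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters that take part in processing one token."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Activated per token: {fraction:.1%} of all parameters")  # 5.5%
```

Roughly one parameter in eighteen is active for any given token, which is why per-token compute for an MoE model is much lower than its total size suggests.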

Release date

2025-03-25

Parameters

671.0B

Context length

164K

Modality

text

Capability radar

Axes: general, coding, reasoning, science (estimated), agents, multimodal

When no dedicated science benchmark is available, the Science axis is estimated from a reasoning proxy.

Rankings

Domain	Rank	Score	Source
Code Ranking	217	39.0	AA
General Ranking	209	49.0	AA
Math Reasoning	164	54.0	AA
Science	232	45.0	AA

Benchmark scores (LLM Stats)

Category	Benchmark	Score	Source
Biology	GPQA	68.4%	Self-reported
Code	LiveCodeBench	49.2%	Self-reported
Finance	MMLU-Pro	81.2%	Self-reported
Math	MATH-500	94.0%	Self-reported
Math	AIME 2024	59.4%	Self-reported

AA evaluation indices

Metric	Value
Math Index	41.0
Intelligence Index	22.3
Coding Index	22.0
MATH-500	0.9
MMLU-Pro	0.8
GPQA	0.7
AIME	0.5
Tau2	0.5
AIME 25	0.4
IFBench	0.4
LCR	0.4
LiveCodeBench	0.4
SciCode	0.4
Terminal-Bench Hard	0.2
HLE	0.1

LLM Stats category scores

Chart categories: Finance, Healthcare, Language, Legal, Math, Biology, Chemistry, General, Physics, Reasoning, Code

Pricing

Input price: $1.195 / 1M tokens

Output price: $1.25 / 1M tokens

Blended price (3:1): $1.209 / 1M tokens
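The blended figure can be reproduced from the two listed prices. The sketch below assumes that "3:1" means three input tokens for every output token and that the blended price is the token-weighted average; the page does not state this explicitly, so treat it as an interpretation. The function names `blended_price` and `request_cost` are hypothetical helpers written for this example.

```python
INPUT_PRICE = 1.195   # USD per 1M input tokens (from the listing)
OUTPUT_PRICE = 1.25   # USD per 1M output tokens (from the listing)


def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Token-weighted average price per 1M tokens.

    Assumption: the 3:1 ratio is input tokens to output tokens.
    """
    total_weight = input_weight + output_weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return (input_weight * input_price + output_weight * output_price) / total_weight


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float = INPUT_PRICE,
                 output_price: float = OUTPUT_PRICE) -> float:
    """USD cost of a single request at per-1M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
print(f"Blended (3:1): ${blended:.5f} / 1M tokens")
print(f"3,000 input + 1,000 output tokens: ${request_cost(3_000, 1_000):.6f}")
```

Under this assumption the result is (3 × 1.195 + 1.25) / 4 = 1.20875, which agrees with the listed $1.209 after rounding to three decimals.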

Speed

Tokens per second: 0.0 tokens/s

First-token latency: 0.00s

First-response latency: 0.00s

Available providers

(LS internal units)

Provider	Input price	Output price
Novita	280K	1.1M

External links

LLM Stats