Phi-3.5-MoE-instruct

Microsoft · Phi · Open Weight · MIT · Commercial OK

Description

Phi-3.5-MoE-instruct is a mixture-of-experts model with ~42B total parameters (6.6B active) and a 128K-token context window. It excels at reasoning, math, coding, and multilingual tasks, outperforming larger dense models on many benchmarks. It underwent safety post-training with supervised fine-tuning (SFT) and direct preference optimization (DPO), and is licensed under MIT. It suits scenarios that need both efficiency and high performance, particularly multilingual or reasoning-intensive tasks.
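To illustrate why a mixture-of-experts model activates only part of its parameters, here is a minimal sketch of generic top-k expert routing. It is not the model's actual router: the function names `top_k_route` and `active_fraction`, the four example logits, and the choice of `k=2` are all hypothetical, picked only to show the idea. The 42B and 6.6B figures come from the description above.

```python
import math

def top_k_route(router_logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalize their weights.

    Generic top-k gating for illustration only (an assumption, not the
    routing scheme this model actually uses).
    """
    # Indices of the k largest logits, highest first.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only; subtract the max for stability.
    m = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def active_fraction(total_params_b, active_params_b):
    """Share of parameters in use, given sizes in billions."""
    return active_params_b / total_params_b

# Hypothetical router scores for four experts.
routes = top_k_route([0.1, 2.0, -1.0, 1.5], k=2)
# Figures taken from the description: ~42B total, 6.6B active.
frac = active_fraction(42.0, 6.6)
print(routes)
print(f"active share: {frac:.1%}")
```

Only the selected experts run for a given input, which is how the active parameter count stays well below the total.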

Release date

2024-08-23

Parameters

60.0B

Context length

128K tokens

Modality

—

Capability radar

[Radar chart; axes: general, coding, reasoning, science (estimated), agents, multimodal]

When no dedicated science benchmark is available, Science is estimated using a reasoning proxy.

Rankings

Domain	Rank	Score	Source
Reasoning	21	84.0	LS

Benchmark scores (LLM Stats)

Biology

GPQA: 36.8% (self-reported)

Code

RepoQA: 85.0% (self-reported)
HumanEval: 70.7% (self-reported)

Creativity

Social IQa: 78.0% (self-reported)
Arena Hard: 37.9% (self-reported)

Finance

MMLU: 78.9% (self-reported)
TruthfulQA: 77.5% (self-reported)
MMLU-Pro: 45.3% (self-reported)

General

ARC-C: 91.0% (self-reported)
OpenBookQA: 89.6% (self-reported)
PIQA: 88.6% (self-reported)
MBPP: 0.81 / 100 (self-reported)
MMMLU: 69.9% (self-reported)

Language

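BoolQ: 84.6% (self-reported)
MEGA XStoryCloze: 82.8% (self-reported)
Winogrande: 81.3% (self-reported)
BIG-Bench Hard: 79.1% (self-reported)
MEGA XCOPA: 76.6% (self-reported)
MEGA TyDi QA: 67.1% (self-reported)
MEGA MLQA: 65.3% (self-reported)
MEGA UDPOS: 60.4% (self-reported)
SQuALITY: 24.1% (self-reported)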

Long Context

RULER: 87.1% (self-reported)
Qasper: 40.0% (self-reported)
GovReport: 26.4% (self-reported)
QMSum: 19.9% (self-reported)
SummScreenFD: 16.9% (self-reported)

Math

GSM8k: 88.7% (self-reported)
MATH: 59.5% (self-reported)
MGSM: 58.7% (self-reported)

Reasoning

HellaSwag: 83.8% (self-reported)

AA evaluation index

No AA evaluation data available.

LLM Stats category scores

[Chart; categories: Psychology, Code, Finance, General, Healthcare, Language, Legal, Math, Reasoning, Creativity, Long Context, Physics, Writing, Biology, Chemistry, Summarization]

Pricing

No pricing data available.

Speed

No speed data available.

Available providers

(LS internal units)

No provider data available.

External links

LLM Stats