Phi-3.5-MoE-instruct

Microsoft · Phi · Open Weight · MIT · Commercial OK

Description

Phi-3.5-MoE-instruct is a mixture-of-experts model with ~42B total parameters (6.6B active) and a 128K context window. It excels at reasoning, math, coding, and multilingual tasks, outperforming larger dense models on many benchmarks. It underwent a thorough safety post-training process (SFT + DPO) and is licensed under MIT. The model suits scenarios that require both efficiency and high performance, particularly multilingual or reasoning-intensive tasks.
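
To make the total-versus-active distinction concrete, the short sketch below computes the share of parameters used per token from the two figures quoted in the description. The constant and function names are illustrative only, and the inputs are the approximate values given above, not exact counts.

```python
# Figures quoted in the description, in billions of parameters (approximate).
TOTAL_PARAMS_B = 42.0   # total across all experts
ACTIVE_PARAMS_B = 6.6   # active per token


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters that are active per token."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active per token: {fraction:.1%} of total parameters")  # 15.7%
```

Roughly speaking, per-token compute tracks the active count, while the memory needed to hold the weights still tracks the total, since every expert must be loaded.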

Release Date

2024-08-23

Parameters

~42B total (6.6B active)

Context Length

128K tokens

Modalities

—

Capability Radar

Axes: general, coding, reasoning, science (est.), agents, multimodal

Science uses a reasoning proxy when dedicated science benchmarks are unavailable.
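
The fallback rule in the note above can be sketched as follows. This is only an illustration of the stated behaviour, not the site's actual code: the function name is hypothetical, and using the Reasoning score of 84.0 as the input is an assumption made for the example.

```python
from typing import Optional, Tuple


def science_axis(
    science: Optional[float], reasoning: Optional[float]
) -> Tuple[Optional[float], bool]:
    """Return (value, is_estimated) for the science axis."""
    if science is not None:
        # A dedicated science score exists, so use it directly.
        return science, False
    # No dedicated science score: fall back to reasoning and flag it.
    return reasoning, reasoning is not None


# Hypothetical input: no science score, reasoning score assumed to be 84.0.
value, estimated = science_axis(science=None, reasoning=84.0)
label = "science (est.)" if estimated else "science"
print(label, value)
```

Returning the flag alongside the value lets the chart mark the axis as an estimate instead of silently substituting one score for another.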

Rankings

Domain	#Rank	Score	Source
Reasoning	21	84.0	LS

Benchmark Scores (LLM Stats)

Category	Benchmark	Score
Biology	GPQA	36.8% SR
Code	RepoQA	85.0% SR
Code	HumanEval	70.7% SR
Creativity	Social IQa	78.0% SR
Creativity	Arena Hard	37.9% SR
Finance	MMLU	78.9% SR
Finance	TruthfulQA	77.5% SR
Finance	MMLU-Pro	45.3% SR
General	ARC-C	91.0% SR
General	OpenBookQA	89.6% SR
General	PIQA	88.6% SR
General	MBPP	0.81 / 100 SR
General	MMMLU	69.9% SR
Language	BoolQ	84.6% SR
Language	MEGA XStoryCloze	82.8% SR
Language	Winogrande	81.3% SR
Language	BIG-Bench Hard	79.1% SR
Language	MEGA XCOPA	76.6% SR
Language	MEGA TyDi QA	67.1% SR
Language	MEGA MLQA	65.3% SR
Language	MEGA UDPOS	60.4% SR
Language	SQuALITY	24.1% SR
Long Context	RULER	87.1% SR
Long Context	Qasper	40.0% SR
Long Context	GovReport	26.4% SR
Long Context	QMSum	19.9% SR
Long Context	SummScreenFD	16.9% SR
Math	GSM8k	88.7% SR
Math	MATH	59.5% SR
Math	MGSM	58.7% SR
Reasoning	HellaSwag	83.8% SR

AA Evaluation Indices

No AA evaluation data available

LLM Stats Category Scores

Categories: Psychology, Code, Finance, General, Healthcare, Language, Legal, Math, Reasoning, Creativity, Long Context, Physics, Writing, Biology, Chemistry, Summarization

Pricing

No pricing data available

Speed

No speed data available

Available Providers

(LS internal units)

No provider data available

External Sources

LLM Stats