Phi-3.5-MoE-instruct

Microsoft · Phi · Open Weight · MIT · Commercial OK

Description

Phi-3.5-MoE-instruct is a mixture-of-experts model with ~42B total parameters (6.6B active) and a 128K context window. It excels at reasoning, math, coding, and multilingual tasks, outperforming larger dense models on many benchmarks. It underwent safety post-training (SFT + DPO) and is licensed under MIT. The model suits scenarios that require both efficiency and high performance, particularly multilingual or reasoning-intensive tasks.
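As a rough illustration of what the "~42B total, 6.6B active" figures imply, the sketch below computes the share of weights used per token and an approximate weight footprint. The parameter counts are taken from the description above. The 16-bit weight precision and the helper names `active_fraction` and `weight_memory_gb` are assumptions made for this example, not something the page states.

```python
# Figures quoted in the description above (approximate).
TOTAL_PARAMS = 42e9    # ~42B total parameters
ACTIVE_PARAMS = 6.6e9  # 6.6B parameters active per token

# Assumption for illustration only: weights stored in 16-bit precision.
BYTES_PER_PARAM = 2


def active_fraction(total: float, active: float) -> float:
    """Share of the weights that take part in a single forward pass."""
    return active / total


def weight_memory_gb(params: float, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Approximate weight storage in GB (10**9 bytes), ignoring activations and KV cache."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    print(f"active fraction: {active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS):.1%}")
    print(f"weights held in memory: ~{weight_memory_gb(TOTAL_PARAMS):.0f} GB")
```

Under these assumptions, per-token compute scales with the active parameters, while memory still has to hold all experts, so the footprint follows the total count.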

Release date

2024-08-23

Parameters

60.0B

Context length

—

Modalities

—

Capability radar

Axes: general, coding, reasoning, science (est.), agents, multimodal

Science uses a reasoning-based proxy when dedicated science benchmarks are unavailable.

Rankings

Domain	Rank	Score	Source
Reasoning	21	84.0	LS

Benchmark scores (LLM Stats)

Biology

Benchmark	Score	Source
GPQA	36.8%	Self-reported

Code

Benchmark	Score	Source
RepoQA	85.0%	Self-reported
HumanEval	70.7%	Self-reported

Creativity

Benchmark	Score	Source
Social IQa	78.0%	Self-reported
Arena Hard	37.9%	Self-reported

Finance

Benchmark	Score	Source
MMLU	78.9%	Self-reported
TruthfulQA	77.5%	Self-reported
MMLU-Pro	45.3%	Self-reported

General

Benchmark	Score	Source
ARC-C	91.0%	Self-reported
OpenBookQA	89.6%	Self-reported
PIQA	88.6%	Self-reported
MBPP	0.81 / 100	Self-reported
MMMLU	69.9%	Self-reported

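Language

Benchmark	Score	Source
BoolQ	84.6%	Self-reported
MEGA XStoryCloze	82.8%	Self-reported
Winogrande	81.3%	Self-reported
BIG-Bench Hard	79.1%	Self-reported
MEGA XCOPA	76.6%	Self-reported
MEGA TyDi QA	67.1%	Self-reported
MEGA MLQA	65.3%	Self-reported
MEGA UDPOS	60.4%	Self-reported
SQuALITY	24.1%	Self-reported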

Long Context

Benchmark	Score	Source
RULER	87.1%	Self-reported
Qasper	40.0%	Self-reported
GovReport	26.4%	Self-reported
QMSum	19.9%	Self-reported
SummScreenFD	16.9%	Self-reported

Math

Benchmark	Score	Source
GSM8k	88.7%	Self-reported
MATH	59.5%	Self-reported
MGSM	58.7%	Self-reported

Reasoning

Benchmark	Score	Source
HellaSwag	83.8%	Self-reported

AA evaluation indices

No AA evaluation data

LLM Stats category scores

Categories: Psychology, Code, Finance, General, Healthcare, Language, Legal, Math, Reasoning, Creativity, Long Context, Physics, Writing, Biology, Chemistry, Summarization

Pricing

No pricing data

Speed

No speed data

Available providers

(LS internal units)

No provider data

External links

LLM Stats