Phi-3.5-MoE-instruct
Microsoft · Phi · Open weights · MIT · Commercial use permitted
Description
Phi-3.5-MoE-instruct is a mixture-of-experts model with ~42B total parameters (6.6B active per token) and a 128K context window. It excels at reasoning, math, coding, and multilingual tasks, outperforming larger dense models on many benchmarks. It underwent a thorough safety post-training process (SFT + DPO) and is licensed under MIT. The model suits scenarios that require both efficiency and high performance, particularly multilingual or reasoning-intensive tasks.
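To make the efficiency claim concrete, the following is a back-of-the-envelope sketch of the share of parameters active per token. It uses only the approximate figures quoted in the description (~42B total, 6.6B active); the function name `active_fraction` is illustrative and not part of any library.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Return the share of parameters used per token (both inputs in billions)."""
    if total_params_b <= 0 or not 0 <= active_params_b <= total_params_b:
        raise ValueError("expected total > 0 and 0 <= active <= total")
    return active_params_b / total_params_b


# Approximate figures taken from the description: ~42B total, 6.6B active.
fraction = active_fraction(42.0, 6.6)
print(f"{fraction:.1%} of parameters are active per token")
```

Under these figures, roughly 16% of the parameters participate in each forward pass, which is the basis for comparing the model with larger dense models.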
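Release date: 2024-08-23
Parameters: 60.0B as listed (the description gives ~42B total, 6.6B active)
Context length: 128K (per the description)
Supported modalities: not listed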
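Capability radar chart
| Dimension | Score |
|---|---|
| General | 70 |
| Coding | 70 |
| Reasoning | 70 |
| Science (estimated) | 34 |
| Agents | 70 |
| Multimodal | 0 |
When no dedicated science evaluation is available, the Science score is estimated using reasoning ability as a proxy.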
Leaderboard rankings
| Domain | Rank | Score | Source |
|---|---|---|---|
| Reasoning | 22 | 84.0 | LS |
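Benchmark scores (LLM Stats)
All scores are self-reported.
| Category | Benchmark | Score |
|---|---|---|
| Biology | GPQA | 36.8% |
| Code | RepoQA | 85.0% |
| Code | HumanEval | 70.7% |
| Creativity | Social IQa | 78.0% |
| Creativity | Arena Hard | 37.9% |
| Finance | MMLU | 78.9% |
| Finance | TruthfulQA | 77.5% |
| Finance | MMLU-Pro | 45.3% |
| General | ARC-C | 91.0% |
| General | OpenBookQA | 89.6% |
| General | PIQA | 88.6% |
| General | MBPP | 0.81 / 100 |
| General | MMMLU | 69.9% |
| Language | BoolQ | 84.6% |
| Language | MEGA XStoryCloze | 82.8% |
| Language | Winogrande | 81.3% |
| Language | BIG-Bench Hard | 79.1% |
| Language | MEGA XCOPA | 76.6% |
| Language | MEGA TyDi QA | 67.1% |
| Language | MEGA MLQA | 65.3% |
| Language | MEGA UDPOS | 60.4% |
| Language | SQuALITY | 24.1% |
| Long Context | RULER | 87.1% |
| Long Context | Qasper | 40.0% |
| Long Context | GovReport | 26.4% |
| Long Context | QMSum | 19.9% |
| Long Context | SummScreenFD | 16.9% |
| Math | GSM8k | 88.7% |
| Math | MATH | 59.5% |
| Math | MGSM | 58.7% |
| Reasoning | HellaSwag | 83.8% |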
AA evaluation index
No AA evaluation data available
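LLM Stats category scores
| Category | Score |
|---|---|
| Psychology | 80 |
| Language | 70 |
| Legal | 70 |
| Math | 70 |
| Reasoning | 70 |
| Finance | 70 |
| General | 70 |
| Healthcare | 70 |
| Code | 70 |
| Long Context | 60 |
| Physics | 60 |
| Creativity | 60 |
| Biology | 40 |
| Chemistry | 40 |
| Writing | 40 |
| Summarization | 20 |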
Pricing
No pricing data available
Speed
No speed data available
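Provider price ranking
2 providers. Cheapest: Azure Cognitive Services. Most expensive: Azure.
| Rank | Provider | Input | Output |
|---|---|---|---|
| 1 | Azure Cognitive Services (cheapest) | $0.16 | $0.64 |
| 2 | Azure | $0.16 | $0.64 |
Compare this model's pricing across API providers.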
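As a rough illustration of how the listed rates translate into a bill, here is a minimal sketch of a cost estimate. The listing does not state the pricing unit, so the code assumes USD per 1,000,000 tokens; that unit, the `PROVIDERS` table layout, and the `estimate_cost` helper are all assumptions made for this example.

```python
# ASSUMPTION: listed prices are USD per 1,000,000 tokens.
# The listing itself does not state the unit.
TOKENS_PER_PRICE_UNIT = 1_000_000

# Input/output rates copied from the provider ranking.
PROVIDERS = {
    "Azure Cognitive Services": {"input": 0.16, "output": 0.64},
    "Azure": {"input": 0.16, "output": 0.64},
}


def estimate_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a workload for one provider (hypothetical helper)."""
    prices = PROVIDERS[provider]
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / TOKENS_PER_PRICE_UNIT


# Example workload: 2M input tokens and 500K output tokens.
cost = estimate_cost("Azure", input_tokens=2_000_000, output_tokens=500_000)
print(f"Estimated cost: ${cost:.2f}")
```

Both listed providers show the same rates, so the estimate is identical for either one.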