
gpt-oss-20B (high)

OpenAI · Open Weight · Apache 2.0 · Commercial OK

Description

The gpt-oss-120b model achieves near-parity with OpenAI o4-mini on core reasoning benchmarks while running efficiently on a single 80 GB GPU. The gpt-oss-20b model delivers similar results to OpenAI o3‑mini on common benchmarks and can run on edge devices with just 16 GB of memory, making it well suited to on-device use cases, local inference, or rapid iteration without costly infrastructure. Both models also perform strongly on tool use, few-shot function calling, CoT reasoning (as seen in results on the Tau-Bench agentic evaluation suite) and HealthBench (even outperforming proprietary models like OpenAI o1 and GPT‑4o). Note: although it is referred to as "20b" for simplicity, the model has 20.9B parameters.

Release date: 2025-08-05
Parameters: 20.9B
Context length: 131K
Supported modalities: text

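Capability radar chart

General: 37
Coding: 41
Reasoning: 86
Science (estimated): 45
Agents: 50
Multimodal: 0

The Science score is estimated using reasoning ability as a proxy when no dedicated science evaluation is available.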

Leaderboard rankings

Domain            Rank    Score   Source
Coding            #196    41.0    AA
General           #147    58.0    AA
Math reasoning    #39     90.0    AA
Science           #183    49.0    AA

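Benchmark scores (LLM Stats)

Biology

GPQA: 71.5% (self-reported)

Communication

TAU-bench Retail: 54.8% (self-reported)

Finance

MMLU: 85.3% (self-reported)

Healthcare

HealthBench: 42.5% (self-reported)
HealthBench Hard: 10.8% (self-reported)

Math

CodeForces: 0.74 / 3000 (self-reported)
Humanity's Last Exam: 10.9% (self-reported)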

AA evaluation indices

Math Index: 89.3
Intelligence Index: 24.5
Coding Index: 18.5
AIME 25: 0.9
LiveCodeBench: 0.8
MMLU-Pro: 0.7
GPQA: 0.7
IFBench: 0.7
Tau2: 0.6
SciCode: 0.3
LCR: 0.3
Terminal-Bench Hard: 0.1
HLE: 0.1

LLM Stats category scores

Finance: 90
Language: 90
Legal: 90
General: 80
Biology: 70
Chemistry: 70
Physics: 70
Math: 60
Reasoning: 60
Tool Calling: 50
Communication: 50
Healthcare: 50
Vision: 10

Pricing

Input price: $0.05 / 1M tokens
Output price: $0.20 / 1M tokens
Blended price (3:1): $0.088 / 1M tokens
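
The blended figure follows from weighting the input and output prices 3:1. The sketch below reproduces that arithmetic, assuming the blend is a plain weighted average (the page does not state its formula). The helper names `blended_price` and `request_cost` are hypothetical and chosen only for illustration.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens (assumed formula, 3:1 by default)."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float = 0.05, output_price: float = 0.20) -> float:
    """Dollar cost of one request, given prices quoted per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Prices taken from the listing above (USD per 1M tokens).
blended = blended_price(0.05, 0.20)
print(f"blended price: ${blended:.4f} per 1M tokens")

# Illustrative request sizes (hypothetical values, not from the page).
cost = request_cost(input_tokens=3_000, output_tokens=1_000)
print(f"cost of 3,000 input + 1,000 output tokens: ${cost:.6f}")
```

With the listed prices the weighted average comes to 0.0875, which agrees with the listed $0.088 after rounding to three decimals.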

Speed

Throughput: 282.4 tokens/s
Time to first token: 0.36 s
Time to first answer: 7.44 s
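
The three speed figures can be combined into a rough latency estimate. The sketch below rests on a simple model that the page does not state: generation runs at the listed steady throughput once the first token arrives, and the gap between first token and first answer is spent generating reasoning tokens. The function names are hypothetical.

```python
TOKENS_PER_SECOND = 282.4      # listed throughput
FIRST_TOKEN_LATENCY = 0.36     # seconds, listed time to first token
FIRST_ANSWER_LATENCY = 7.44    # seconds, listed time to first answer


def estimated_total_seconds(output_tokens: int,
                            tokens_per_second: float = TOKENS_PER_SECOND,
                            first_token_latency: float = FIRST_TOKEN_LATENCY) -> float:
    """Approximate wall-clock time to stream `output_tokens` tokens."""
    return first_token_latency + output_tokens / tokens_per_second


def implied_pre_answer_tokens(first_answer_latency: float = FIRST_ANSWER_LATENCY,
                              first_token_latency: float = FIRST_TOKEN_LATENCY,
                              tokens_per_second: float = TOKENS_PER_SECOND) -> float:
    """Tokens generated before the first answer token, under the assumed model."""
    return (first_answer_latency - first_token_latency) * tokens_per_second


print(f"1,000 output tokens: about {estimated_total_seconds(1000):.2f} s")
print(f"implied tokens before the answer: about {implied_pre_answer_tokens():.0f}")
```

Under these assumptions the 7.08 s gap corresponds to roughly two thousand tokens generated before the answer begins, which is consistent with a high reasoning-effort setting but is an inference, not a measured value.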

Available providers

(LS internal pricing units)
Provider     Input price   Output price
OpenAI       100K          500K
Fireworks    100K          500K
Groq         100K          500K
