o3-mini

OpenAI · OpenAI o-series · Proprietary

Description

A smaller variant of o3, expected to offer enhanced multimodal capabilities, improved reasoning, and more efficient resource use than previous models while maintaining strong performance on core tasks.

Release date

2025-01-31

Parameter count

—

Context length

200K tokens

Supported modalities

text

Capability radar chart (axes: general, coding, reasoning, science (estimated), agents, multimodal)

When no dedicated science evaluation is available, the Science score is estimated using reasoning ability as a proxy.

Leaderboard rankings

Domain	Rank	Score	Source
Coding leaderboard	217	45.0	AA
General capability leaderboard	234	45.0	AA
Math reasoning	50	89.0	AA
Reasoning	83	54.0	LS
Science	168	52.0	AA

Benchmark scores (LLM Stats)

Category	Benchmark	Score (self-reported)
Biology	GPQA	77.2%
Code	Aider-Polyglot	66.7%
Code	Aider-Polyglot Edit	60.4%
Code	SWE-Bench Verified	49.3%
Code	SWE-Lancer	18.0%
Code	SWE-Lancer (IC-Diamond subset)	7.4%
Communication	Multi-IF	79.5%
Communication	TAU-bench Retail	57.6%
Communication	Multi-Challenge	39.9%
Communication	TAU-bench Airline	32.4%
Factuality	SimpleQA	15.0%
Finance	MMLU	86.9%
General	IFEval	93.9%
General	LiveBench	84.6%
General	Multilingual MMLU	80.7%
General	Internal API instruction following (hard)	50.0%
Language	COLLIE	98.7%
Long Context	OpenAI-MRCR: 2 needle 128k	18.7%
Long Context	ComplexFuncBench	17.6%
Math	MATH	97.9%
Math	MGSM	92.0%
Math	AIME 2024	87.3%
Math	FrontierMath	9.2%
Reasoning	Graphwalks parents <128k	58.3%
Reasoning	Graphwalks BFS <128k	51.0%

AA evaluation indices

Metric	Value
Intelligence Index	19.0
MATH-500	1.0
MMLU-Pro	0.8
AIME	0.8
GPQA	0.7
LiveCodeBench	0.7
SciCode	0.4
Tau2	0.3
HLE	0.1
Terminal-Bench Hard	0.1

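LLM Stats category scores

Chart with one bar per category; the only numeric value that survives is 100, shown next to Writing. Categories, in the order listed: Writing, Instruction Following, Language, Legal, Finance, Healthcare, Math, Physics, Biology, Chemistry, General, Reasoning, Structured Output, Spatial Reasoning, Frontend Development, Communication, Code, Tool Calling, Long Context, Factuality.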

Pricing

Input price: $1.10 / 1M tokens

Output price: $4.40 / 1M tokens

Blended price (3:1 input to output): $1.925 / 1M tokens

Cached read price: $0.55 / 1M tokens
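The blended figure is consistent with a weighted average of the input and output prices at a 3:1 input-to-output token ratio. The sketch below reproduces that arithmetic; the function name `blended_price` and its parameters are illustrative assumptions, not part of any provider's API, and the weighting scheme is inferred from the "3:1" label rather than documented on the page.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of input and output prices, in USD per 1M tokens.

    Assumes the "3:1" label means three input tokens for every output token.
    """
    total_weight = input_weight + output_weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Listed prices: $1.10 input and $4.40 output per 1M tokens.
# (3 * 1.10 + 1 * 4.40) / 4 is about 1.925, the listed blended price.
price = blended_price(1.10, 4.40)
print(round(price, 3))
```

A different traffic mix can be modeled by changing the weights, for example `blended_price(1.10, 4.40, input_weight=1, output_weight=1)` for an even split.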

Speed

Tokens per second: 229.8

Time to first token: 5.43 s

Time to first answer: 5.43 s
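These two figures can be combined into a rough estimate of end-to-end response time. The sketch below assumes output streams at a constant rate once the first token arrives, which is a simplification; real latency varies with load, prompt length, and reasoning effort. The function name `estimated_response_seconds` is hypothetical.

```python
def estimated_response_seconds(output_tokens: int,
                               first_token_latency_s: float = 5.43,
                               tokens_per_second: float = 229.8) -> float:
    """Rough wall-clock estimate: time to first token plus streaming time.

    Defaults are the figures listed on this page; constant throughput
    after the first token is an assumption.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return first_token_latency_s + output_tokens / tokens_per_second


# Estimate for a 1,000-token answer at the listed speed.
print(f"{estimated_response_seconds(1000):.2f} s")
```

For short answers the fixed first-token latency dominates this estimate, so throughput matters mostly for long outputs.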

Provider price ranking

9 providers

Cheapest: NanoGPT · Most expensive: Azure

Rank	Provider	Input	Output
1	NanoGPT (cheapest)	$1.088	$4.3996
2	OpenAI (primary)	$1.1	$4.4
3	Abacus	$1.1	$4.4
4	Jiekou.AI	$1.1	$4.4
5	Helicone	$1.1	$4.4
6	Azure Cognitive Services	$1.1	$4.4
7	DigitalOcean	$1.1	$4.4
8	LLM Gateway	$1.1	$4.4
9	Azure	$1.1	$4.4

Compare this model's pricing across API providers.

External links

LLM Stats · Artificial Analysis