o3-mini

OpenAI · OpenAI o-series · Proprietary

Description

A smaller variant of o3, expected to offer enhanced multimodal capabilities, improved reasoning, and more efficient resource utilization than previous models while maintaining strong performance on core tasks.

Release date

2025-01-31

Parameter count

Not disclosed

Context length

200K tokens

Supported modalities

text

Capability radar chart

[Radar chart. Axes: general, coding, reasoning, science (estimated), agents, multimodal]

When no dedicated science evaluation is available, the Science score is estimated using reasoning ability as a proxy.

Leaderboard rankings

Domain	Rank	Score	Source
Coding	217	45.0	AA
General	234	45.0	AA
Math reasoning	50	89.0	AA
Reasoning	83	54.0	LS
Science	168	52.0	AA

Sources: AA = Artificial Analysis, LS = LLM Stats.

Benchmark scores (LLM Stats)

All scores below are self-reported.

Category	Benchmark	Score
Biology	GPQA	77.2%
Code	Aider-Polyglot	66.7%
Code	Aider-Polyglot Edit	60.4%
Code	SWE-Bench Verified	49.3%
Code	SWE-Lancer	18.0%
Code	SWE-Lancer (IC-Diamond subset)	7.4%
Communication	Multi-IF	79.5%
Communication	TAU-bench Retail	57.6%
Communication	Multi-Challenge	39.9%
Communication	TAU-bench Airline	32.4%
Factuality	SimpleQA	15.0%
Finance	MMLU	86.9%
General	IFEval	93.9%
General	LiveBench	84.6%
General	Multilingual MMLU	80.7%
General	Internal API instruction following (hard)	50.0%
Language	COLLIE	98.7%
Long Context	OpenAI-MRCR: 2 needle 128k	18.7%
Long Context	ComplexFuncBench	17.6%
Math	MATH	97.9%
Math	MGSM	92.0%
Math	AIME 2024	87.3%
Math	FrontierMath	9.2%
Reasoning	Graphwalks parents <128k	58.3%
Reasoning	Graphwalks BFS <128k	51.0%

AA evaluation indices

Metric	Value
Intelligence Index	19.0
MATH-500	1.0
MMLU-Pro	0.8
AIME	0.8
GPQA	0.7
LiveCodeBench	0.7
SciCode	0.4
Tau2	0.3
HLE	0.1
Terminal-Bench Hard	0.1

LLM Stats category scores

Writing: 100

Other categories listed (no score shown): Instruction Following, Language, Legal, Finance, Healthcare, Math, Physics, Biology, Chemistry, General, Reasoning, Structured Output, Spatial Reasoning, Frontend Development, Communication, Code, Tool Calling, Long Context, Factuality

Pricing

Input price: $1.10 / 1M tokens

Output price: $4.40 / 1M tokens

Blended price (3:1 input:output): $1.925 / 1M tokens

Cache read price: $0.55 / 1M tokens
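The blended figure is consistent with a weighted average of the input and output prices at three input tokens per output token. A minimal Python sketch of that arithmetic follows; the function name `blended_price` and its parameters are illustrative and not part of any provider API.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens for an input:output token ratio.

    The 3:1 default mirrors the ratio quoted on this page; the helper itself
    is a hypothetical illustration.
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


# Listed prices in USD per 1M tokens.
blended = blended_price(1.1, 4.4)
print(f"${blended:.3f} / 1M tokens")
```

Changing `input_parts` and `output_parts` gives the blended price for workloads with a different mix, for example output-heavy generation.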

Speed

Tokens/second: 229.8

Time to first token: 5.43 s

Time to first answer: 5.43 s
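As a rough illustration of what these figures imply, the sketch below estimates end-to-end response time under a simplifying assumption: the first-token latency is paid once, and the remaining tokens stream at the listed throughput. The page does not state how the numbers were measured, so this model and the function name `estimated_response_seconds` are assumptions.

```python
def estimated_response_seconds(output_tokens: int,
                               tokens_per_second: float = 229.8,
                               first_token_latency_s: float = 5.43) -> float:
    """Estimate total response time as latency plus streaming time.

    Assumes constant throughput after the first token, which real
    deployments will only approximate.
    """
    return first_token_latency_s + output_tokens / tokens_per_second


estimate = estimated_response_seconds(1000)
print(f"{estimate:.2f} s for a 1000-token response")
```

Under these assumptions, a 1000-token response takes roughly 9.8 seconds, with more than half of that spent waiting for the first token.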

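Provider price ranking

9 providers

Cheapest: NanoGPT · Most expensive: Azure

Rank	Provider	Input ($ / 1M tokens)	Output ($ / 1M tokens)
1	NanoGPT (cheapest)	$1.088	$4.3996
2	OpenAI (primary)	$1.10	$4.40
3	Abacus	$1.10	$4.40
4	Jiekou.AI	$1.10	$4.40
5	Helicone	$1.10	$4.40
6	Azure Cognitive Services	$1.10	$4.40
7	DigitalOcean	$1.10	$4.40
8	LLM Gateway	$1.10	$4.40
9	Azure	$1.10	$4.40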

Compares this model's pricing across API providers.
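To see how small the price differences are in practice, the following sketch computes the cost of a hypothetical workload (3M input tokens and 1M output tokens, chosen only for illustration) at each provider's listed prices. The `workload_cost` helper is an assumption for this example; cached-token discounts and any provider-specific fees are ignored.

```python
# Listed (input, output) prices in USD per 1M tokens.
providers = {
    "NanoGPT": (1.088, 4.3996),
    "OpenAI": (1.1, 4.4),
    "Abacus": (1.1, 4.4),
    "Jiekou.AI": (1.1, 4.4),
    "Helicone": (1.1, 4.4),
    "Azure Cognitive Services": (1.1, 4.4),
    "DigitalOcean": (1.1, 4.4),
    "LLM Gateway": (1.1, 4.4),
    "Azure": (1.1, 4.4),
}


def workload_cost(input_tokens: int, output_tokens: int,
                  input_price: float, output_price: float) -> float:
    """Cost in USD for a workload, given per-1M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 3M input tokens, 1M output tokens.
costs = {
    name: workload_cost(3_000_000, 1_000_000, inp, out)
    for name, (inp, out) in providers.items()
}
cheapest = min(costs, key=costs.get)

for name, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(f"{name}: ${cost:.4f}")
```

For this workload the gap between the cheapest listing and the others is under four cents, so factors other than list price are likely to matter more when choosing a provider.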

External links

LLM Stats · Artificial Analysis