o3-mini

OpenAI, OpenAI o-series, Proprietary

Description

A smaller variant of o3, expected to offer enhanced multimodal capabilities, improved reasoning, and more efficient resource utilization compared to previous models while maintaining strong performance on core tasks.

Release date

2025-01-31

Parameters

—

Context length

200K

Modalities

text

Capability radar

Axes: general, coding, reasoning, science (est.), agents, multimodal

Science uses a reasoning proxy when dedicated science benchmarks are unavailable.

Rankings

Domain	Rank	Score	Source
Coding	217	45.0	AA
General	234	45.0	AA
Mathematical reasoning	50	89.0	AA
Reasoning	83	54.0	LS
Science	168	52.0	AA

Benchmark scores (LLM Stats)

Biology
GPQA	77.2%	Aut.

Code
Aider-Polyglot	66.7%	Aut.
Aider-Polyglot Edit	60.4%	Aut.
SWE-Bench Verified	49.3%	Aut.
SWE-Lancer	18.0%	Aut.
SWE-Lancer (IC-Diamond subset)	7.4%	Aut.

Communication
Multi-IF	79.5%	Aut.
TAU-bench Retail	57.6%	Aut.
Multi-Challenge	39.9%	Aut.
TAU-bench Airline	32.4%	Aut.

Factuality
SimpleQA	15.0%	Aut.

Finance
MMLU	86.9%	Aut.

General
IFEval	93.9%	Aut.
LiveBench	84.6%	Aut.
Multilingual MMLU	80.7%	Aut.
Internal API instruction following (hard)	50.0%	Aut.

Language
COLLIE	98.7%	Aut.

Long Context
OpenAI-MRCR: 2 needle 128k	18.7%	Aut.
ComplexFuncBench	17.6%	Aut.

Math
MATH	97.9%	Aut.
MGSM	92.0%	Aut.
AIME 2024	87.3%	Aut.
FrontierMath	9.2%	Aut.

Reasoning
Graphwalks parents <128k	58.3%	Aut.
Graphwalks BFS <128k	51.0%	Aut.

AA evaluation indices

Index	Value
Intelligence Index	19.0
MATH-500	1.0
MMLU-Pro	0.8
AIME	0.8
GPQA	0.7
LiveCodeBench	0.7
SciCode	0.4
Tau2	0.3
HLE	0.1
Terminal-Bench Hard	0.1

LLM Stats category scores

Writing: 100

Other categories charted (no values given): Instruction Following, Language, Legal, Finance, Healthcare, Math, Physics, Biology, Chemistry, General, Reasoning, Structured Output, Spatial Reasoning, Frontend Development, Communication, Code, Tool Calling, Long Context, Factuality

Pricing

Input price: $1.1 / 1M tokens

Output price: $4.4 / 1M tokens

Blended price (3:1): $1.925 / 1M tokens

Cached read price: $0.55 / 1M tokens
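The blended figure can be reproduced from the input and output prices. The sketch below assumes that "3:1" means three input tokens for every output token and that the blend is a plain weighted average; the page does not spell this out, but the assumption yields the listed $1.925. The function name `blended_price` is illustrative only.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted-average price per 1M tokens for an input:output token ratio.

    Assumption: the blend is a simple weighted mean of the two prices.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Prices listed on this page, in USD per 1M tokens.
INPUT_PRICE = 1.1
OUTPUT_PRICE = 4.4

blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
print(f"Blended (3:1): ${blended:.3f} / 1M tokens")
```

A different traffic mix can be priced by changing the weights, for example `blended_price(1.1, 4.4, input_weight=1, output_weight=1)` for an even split.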

Speed

Tokens/sec: 229.8

Time to first token: 5.43 s

Time to response: 5.43 s
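These two figures can be combined into a rough estimate of how long a complete response takes. The sketch below assumes a simple linear model (first-token latency plus output tokens divided by throughput) and treats the measured values as constants; real latency varies with load, prompt length, and reasoning effort. The function name is illustrative.

```python
def estimated_response_seconds(output_tokens: int,
                               first_token_latency_s: float = 5.43,
                               tokens_per_second: float = 229.8) -> float:
    """Rough end-to-end time: wait for the first token, then stream the rest.

    Assumption: throughput is constant once generation starts.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return first_token_latency_s + output_tokens / tokens_per_second


# Hypothetical 500-token answer.
print(f"{estimated_response_seconds(500):.2f} s")
```

Under these assumptions a 500-token answer takes roughly 7.6 seconds, most of it spent waiting for the first token.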

Provider price ranking

9 providers

Cheapest: NanoGPT. Most expensive: Azure.

Rank	Provider	Input	Output
1	NanoGPT (cheapest)	$1.088	$4.3996
2	OpenAI (primary)	$1.1	$4.4
3	Abacus	$1.1	$4.4
4	Jiekou.AI	$1.1	$4.4
5	Helicone	$1.1	$4.4
6	Azure Cognitive Services	$1.1	$4.4
7	DigitalOcean	$1.1	$4.4
8	LLM Gateway	$1.1	$4.4
9	Azure	$1.1	$4.4

Compare prices across API providers for this model.
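To see what the price gap between providers means in practice, the listed prices can be applied to a concrete workload. The sketch below assumes the provider prices are in USD per 1M tokens, like the headline prices, and uses a hypothetical workload of 3M input and 1M output tokens; the names `ProviderPrice` and `workload_cost` are illustrative.

```python
from typing import NamedTuple


class ProviderPrice(NamedTuple):
    name: str
    input_usd: float   # assumed USD per 1M input tokens
    output_usd: float  # assumed USD per 1M output tokens


# Prices as listed in the provider ranking.
PROVIDERS = [
    ProviderPrice("NanoGPT", 1.088, 4.3996),
    ProviderPrice("OpenAI", 1.1, 4.4),
    ProviderPrice("Abacus", 1.1, 4.4),
    ProviderPrice("Jiekou.AI", 1.1, 4.4),
    ProviderPrice("Helicone", 1.1, 4.4),
    ProviderPrice("Azure Cognitive Services", 1.1, 4.4),
    ProviderPrice("DigitalOcean", 1.1, 4.4),
    ProviderPrice("LLM Gateway", 1.1, 4.4),
    ProviderPrice("Azure", 1.1, 4.4),
]


def workload_cost(p: ProviderPrice, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a workload, given per-1M-token prices."""
    return (input_tokens * p.input_usd + output_tokens * p.output_usd) / 1_000_000


# Hypothetical workload: 3M input tokens, 1M output tokens.
INPUT_TOKENS = 3_000_000
OUTPUT_TOKENS = 1_000_000

# sorted() is stable, so providers with equal cost keep their listed order.
ranked = sorted(PROVIDERS, key=lambda p: workload_cost(p, INPUT_TOKENS, OUTPUT_TOKENS))
for p in ranked:
    print(f"{p.name:26s} ${workload_cost(p, INPUT_TOKENS, OUTPUT_TOKENS):.4f}")
```

For this workload the spread between the cheapest entry and the rest is a few cents, so factors other than list price are likely to matter more when choosing a provider.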

External sources

LLM Stats, Artificial Analysis