o1

OpenAI | OpenAI o-series | Proprietary

Description

A research preview model focused on mathematical and logical reasoning capabilities, demonstrating improved performance on tasks requiring step-by-step reasoning, mathematical problem-solving, and code generation. The model shows enhanced capabilities in formal reasoning while maintaining strong general capabilities.

Release date

2024-12-05

Parameter count

—

Context length

200K

Supported modalities

image, pdf, text

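Capability radar chart

Axes: general, coding, reasoning, science (estimated), agents, multimodal

The science axis is estimated from reasoning ability as a proxy when no dedicated science evaluation is available.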

Leaderboard rankings

Domain	Rank	Score	Source
Coding	151	55.0	AA
General	105	63.0	AA
Math reasoning	55	87.0	AA
Science	195	49.0	AA

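Benchmark scores (LLM Stats)

Category	Benchmark	Score	Source
Biology	GPQA	78.0%	Self-reported
Biology	GPQA Biology	69.2%	Self-reported
Chemistry	GPQA Chemistry	64.7%	Self-reported
Code	HumanEval	88.1%	Self-reported
Code	SWE-Bench Verified	41.0%	Self-reported
Communication	TAU-bench Retail	70.8%	Self-reported
Communication	TAU-bench Airline	50.0%	Self-reported
Factuality	SimpleQA	47.0%	Self-reported
Finance	MMLU	91.8%	Self-reported
General	MMMLU	87.7%	Self-reported
General	MMMU	77.6%	Self-reported
General	LiveBench	67.0%	Self-reported
Math	GSM8k	97.1%	Self-reported
Math	MATH	96.4%	Self-reported
Math	MGSM	89.3%	Self-reported
Math	AIME 2024	74.3%	Self-reported
Math	MathVista	71.8%	Self-reported
Math	FrontierMath	5.5%	Self-reported
Physics	GPQA Physics	92.8%	Self-reported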

AA evaluation indices

Metric	Value
Coding Index	39.7
Intelligence Index	23.4
MATH-500	1.0
MMLU-Pro	0.8
GPQA	0.7
AIME	0.7
IFBench	0.7
LiveCodeBench	0.7
Tau2	0.6
LCR	0.6
SciCode	0.4
Terminal-Bench Hard	0.1
HLE	0.1

LLM Stats category scores

Chart categories: Language, Legal, Finance, Math, Physics, Healthcare, Biology, Chemistry, Multimodal, Reasoning, General, Vision, Code, Communication, Tool Calling, Factuality, Frontend Development

Pricing

Input price: $15 / 1M tokens

Output price: $60 / 1M tokens

Blended price (3:1 input to output): $26.25 / 1M tokens

Cache read price: $7.5 / 1M tokens
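The blended figure is consistent with a weighted average of three input tokens to one output token. A minimal Python sketch of that arithmetic, assuming this simple weighting and ignoring cached-input discounts; the function names `blended_price` and `request_cost` are illustrative and do not come from any provider SDK:

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted-average price per 1M tokens for a given input:output ratio."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float = 15.0, output_price: float = 60.0) -> float:
    """Cost in USD of one request; prices are quoted per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Listed prices: $15 input, $60 output per 1M tokens.
print(blended_price(15.0, 60.0))    # (3 * 15 + 60) / 4 = 26.25
print(request_cost(10_000, 2_000))  # 0.15 + 0.12 = 0.27
```

A different traffic mix changes the blend: a workload with more output per request moves the effective price toward the $60 output rate.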

Speed

Tokens/second: 147.9

Time to first token: 13.04s

Time to first answer: 13.04s
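The throughput and latency figures can be combined into a rough end-to-end estimate. The sketch below assumes a simple model in which total time equals the time to first token plus output length divided by throughput; the function name `estimated_response_seconds` is hypothetical, and real latency will vary with load and with how long the model reasons before answering:

```python
def estimated_response_seconds(output_tokens: int,
                               ttft_seconds: float = 13.04,
                               tokens_per_second: float = 147.9) -> float:
    """Rough wall-clock time: wait for the first token, then stream the rest."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_seconds + output_tokens / tokens_per_second


# A 1,479-token answer streams for about 10 s after the first token arrives.
print(round(estimated_response_seconds(1_479), 2))  # 23.04
```

Under these assumptions the first-token wait dominates short answers, so output length only starts to matter for responses of a few thousand tokens.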

Provider price ranking

13 providers

Cheapest: Poe; most expensive: Merge Gateway

Rank	Provider	Input ($ / 1M tokens)	Output ($ / 1M tokens)
1	Poe (cheapest)	$14	$54
2	NanoGPT	$14.994	$59.993
3	OpenAI (primary)	$15	$60
4	OpenRouter	$15	$60
5	Kilo Gateway	$15	$60
6	Cloudflare AI Gateway	$15	$60
7	Helicone	$15	$60
8	Azure Cognitive Services	$15	$60
9	DigitalOcean	$15	$60
10	Vercel AI Gateway	$15	$60
11	LLM Gateway	$15	$60
12	Azure	$15	$60
13	Merge Gateway	$15	$60

Compare this model's pricing across API providers.

External links

LLM Stats, Artificial Analysis