GPT-5 (high)
OpenAI, GPT, Proprietary
Description
GPT-5 is OpenAI's flagship model, designed for coding, reasoning, and agentic tasks across domains. It offers higher reasoning capability at medium speed.
Release date
2025-08-07
Parameters
—
Context length
400K
Modalities
file, image, text
Capability radar
| Capability | Score |
|---|---|
| general | 55 |
| coding | 54 |
| reasoning | 95 |
| science (est.) | 59 |
| agents | 80 |
| multimodal | 90 |
The science score uses a reasoning-based proxy when dedicated science benchmarks are unavailable.
Rankings
| Domain | Rank | Score | Source |
|---|---|---|---|
| Agents & Tools | 57 | 55.0 | LS |
| Code Ranking | 47 | 74.0 | AA |
| General Ranking | 34 | 83.0 | AA |
| Math Reasoning | 6 | 97.0 | AA |
| Multimodal Ranking | 21 | 84.0 | LS |
| Reasoning | 41 | 72.0 | LS |
| Science | 41 | 74.0 | AA |
Benchmark scores (LLM Stats)
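| Category | Benchmark | Score | Source |
|---|---|---|---|
| Agents | BrowseComp | 54.9% | Self-reported |
| Biology | GPQA | 85.7% | Self-reported |
| Code | SWE-Lancer (IC-Diamond subset) | 100.0% | Self-reported |
| Code | HumanEval | 93.4% | Self-reported |
| Code | Aider-Polyglot | 88.0% | Self-reported |
| Code | SWE-Bench Verified | 74.9% | Self-reported |
| Communication | Tau2 Telecom | 96.7% | Self-reported |
| Communication | Tau2 Retail | 81.1% | Self-reported |
| Communication | Multi-Challenge | 69.6% | Self-reported |
| Communication | Tau2 Airline | 62.6% | Self-reported |
| Finance | MMLU | 92.5% | Self-reported |
| General | MMMU | 84.2% | Self-reported |
| General | MMMU-Pro | 78.4% | Self-reported |
| General | Internal API instruction following (hard) | 64.0% | Self-reported |
| General | LongFact Objects | 0.8% | Self-reported |
| General | LongFact Concepts | 0.7% | Self-reported |
| Healthcare | VideoMMMU | 84.6% | Self-reported |
| Healthcare | HealthBench Hard | 1.6% | Self-reported |
| Language | COLLIE | 99.0% | Self-reported |
| Long Context | OpenAI-MRCR: 2 needle 128k | 95.2% | Self-reported |
| Long Context | OpenAI-MRCR: 2 needle 256k | 86.8% | Self-reported |
| Math | AIME 2025 | 94.6% | Self-reported |
| Math | HMMT 2025 | 93.3% | Self-reported |
| Math | MATH | 84.7% | Self-reported |
| Math | FrontierMath | 26.3% | Self-reported |
| Math | Humanity's Last Exam | 24.8% | Self-reported |
| Multimodal | VideoMME w sub. | 86.7% | Self-reported |
| Multimodal | CharXiv-R | 81.1% | Self-reported |
| Reasoning | BrowseComp Long Context 128k | 90.0% | Self-reported |
| Reasoning | BrowseComp Long Context 256k | 88.8% | Self-reported |
| Reasoning | Graphwalks BFS <128k | 78.3% | Self-reported |
| Reasoning | Graphwalks parents <128k | 73.3% | Self-reported |
| Reasoning | ERQA | 65.7% | Self-reported |
| Reasoning | FActScore | 1.0% | Self-reported |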
AA evaluation indices
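| Metric | Value |
|---|---|
| Math Index | 94.3 |
| Intelligence Index | 44.6 |
| Coding Index | 36.0 |
| MATH 500 | 1.0 |
| AIME | 1.0 |
| AIME 25 | 0.9 |
| MMLU Pro | 0.9 |
| GPQA | 0.9 |
| Tau2 | 0.8 |
| LiveCodeBench | 0.8 |
| LCR | 0.8 |
| IFBench | 0.7 |
| SciCode | 0.4 |
| TerminalBench Hard | 0.3 |
| HLE | 0.3 |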
LLM Stats category scores
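| Category | Score |
|---|---|
| Robotics | 20 |
| Spatial Reasoning | 6 |
| Multimodal | 4 |
| Vision | 3 |
| Reasoning | 2 |
| Writing | 100 |
| Language | 100 |
| Long Context | 100 |
| Video | 90 |
| Biology | 90 |
| Chemistry | 90 |
| Code | 90 |
| Finance | 90 |
| Legal | 90 |
| Physics | 90 |
| Tool Calling | 80 |
| Communication | 80 |
| General | 80 |
| Frontend Development | 70 |
| Healthcare | 70 |
| Math | 70 |
| Search | 70 |
| Structured Output | 60 |
| Agents | 50 |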
Pricing
Input price: $1.25 / 1M tokens
Output price: $10 / 1M tokens
Blended price (3:1): $3.438 / 1M tokens
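The page does not say how the blended figure is derived. A minimal sketch, assuming the 3:1 ratio means three input tokens for every output token and that the blended price is the weighted average of the two per-token prices, reproduces the listed value. The function name `blended_price` and its parameters are hypothetical, introduced here only for illustration.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens.

    Assumption: the ratio is input:output, so the default 3:1 weights
    the input price three times as heavily as the output price.
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


# Prices from the listing above, in USD per 1M tokens.
price = blended_price(1.25, 10.0)
print(f"${price:.4f} / 1M tokens")  # (3 * 1.25 + 1 * 10) / 4 = 3.4375
```

Under this assumption the result is $3.4375, which matches the listed $3.438 after rounding to three decimals. A workload with a different input-to-output mix can be priced by changing `input_parts` and `output_parts`.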
Speed
Tokens per second: 95.3 tokens/s
First-token latency: 98.86 s
Time to first answer: 98.86 s
Available providers
(LS internal units) No provider data