GPT-4o (Aug '24)
OpenAI · GPT · Proprietary
Description
GPT-4o ('o' for 'omni') is a multimodal AI model that accepts text, audio, image, and video inputs, and generates text, audio, and image outputs. It matches GPT-4 Turbo performance on text and code, with improvements in non-English languages, vision, and audio understanding.
Release date
2024-08-06
Parameter count
Not listed
Context length
128K
Supported modalities
file, image, text
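Capability radar chart
general: 15
coding: 24
reasoning: 40
science (estimated): 36
agents: 50
multimodal: 90
The science score is estimated from reasoning ability as a proxy when no dedicated science evaluation is available.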
Leaderboard ranking
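Benchmark scores (LLM Stats)
Biology
GPQA: 70.1% (self-reported)
Code
SWE-Bench Verified: 33.2% (self-reported)
SWE-Lancer: 32.6% (self-reported)
Aider-Polyglot: 30.7% (self-reported)
Aider-Polyglot Edit: 18.2% (self-reported)
SWE-Lancer (IC-Diamond subset): 12.4% (self-reported)
Communication
Tau2 Retail: 63.4% (self-reported)
Multi-IF: 60.9% (self-reported)
TAU-bench Retail: 60.3% (self-reported)
Tau2 Airline: 45.5% (self-reported)
TAU-bench Airline: 42.8% (self-reported)
Multi-Challenge: 40.3% (self-reported)
Tau2 Telecom: 23.5% (self-reported)
Factuality
SimpleQA: 38.2% (self-reported)
Finance
MMLU: 85.7% (self-reported)
MMLU-Pro: 74.7% (self-reported)
General
MMMLU: 81.4% (self-reported)
IFEval: 81.0% (self-reported)
MMMU: 72.2% (self-reported)
MMMU-Pro: 59.9% (self-reported)
Internal API instruction following (hard): 29.2% (self-reported)
Healthcare
VideoMMMU: 61.2% (self-reported)
Image To Text
DocVQA: 92.8% (self-reported)
Language
COLLIE: 61.0% (self-reported)
Long Context
EgoSchema: 72.2% (self-reported)
ComplexFuncBench: 66.5% (self-reported)
OpenAI-MRCR: 2 needle 128k: 31.9% (self-reported)
Math
MathVista: 61.4% (self-reported)
AIME 2024: 13.1% (self-reported)
Humanity's Last Exam: 5.3% (self-reported)
Multimodal
AI2D: 94.2% (self-reported)
ChartQA: 85.7% (self-reported)
CharXiv-D: 85.3% (self-reported)
CharXiv-R: 58.8% (self-reported)
Reasoning
Graphwalks BFS <128k: 41.7% (self-reported)
Graphwalks parents <128k: 35.4% (self-reported)
ERQA: 35.2% (self-reported)
Video
ActivityNet: 61.9% (self-reported)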
AA evaluation indices
Intelligence Index: 18.6
Coding Index: 16.6
Math 500: 0.8
GPQA: 0.5
IFBench: 0.4
LCR: 0.3
SciCode: 0.3
LiveCodeBench: 0.3
Tau2: 0.3
AIME: 0.1
Terminal-Bench Hard: 0.1
HLE: 0.0
LLM Stats category scores
Image To Text: 90
Finance: 80
Legal: 80
Vision: 70
Biology: 70
Chemistry: 70
Healthcare: 70
Instruction Following: 70
Language: 70
Multimodal: 70
Physics: 70
Structured Output: 60
Writing: 60
General: 60
Long Context: 60
Tool Calling: 50
Communication: 50
Math: 50
Reasoning: 50
Spatial Reasoning: 40
Factuality: 40
Code: 30
Frontend Development: 30
Pricing
Input price: $2.5 / 1M tokens
Output price: $10 / 1M tokens
Blended price (3:1 input:output): $4.375 / 1M tokens
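The blended figure is consistent with a weighted average of the input and output prices at three input tokens per output token. A minimal sketch of that arithmetic follows; the function names `blended_price` and `request_cost` are illustrative and not part of any provider's API, and the 3:1 weighting is assumed to mean three parts input to one part output.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted-average price per 1M tokens for an input:output token ratio.

    Assumption: the ratio means `input_parts` input tokens for every
    `output_parts` output tokens.
    """
    total_parts = input_parts + output_parts
    return (input_price * input_parts + output_price * output_parts) / total_parts


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float = 2.5, output_price: float = 10.0) -> float:
    """Cost in USD of one request, with prices quoted per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Listed prices: $2.5 input and $10 output per 1M tokens.
print(blended_price(2.5, 10.0))  # 4.375
```

For a single request, `request_cost(3000, 1000)` applies the same two prices to actual token counts instead of a fixed ratio.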
Speed
Tokens per second: 102.1 tokens/s
Time to first token: 0.65 s
Time to first answer: 0.65 s
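These two figures can be combined into a rough estimate of how long a response takes to stream. The sketch below assumes a simple model that the page does not state: a fixed delay before the first token, then generation at a steady rate. The function name is hypothetical, and real latency varies with load, prompt length, and provider.

```python
def estimated_response_seconds(output_tokens: int,
                               tokens_per_second: float = 102.1,
                               first_token_latency: float = 0.65) -> float:
    """Rough end-to-end time: first-token latency plus steady-rate generation.

    Assumption: throughput is constant at the listed rate once streaming starts.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return first_token_latency + output_tokens / tokens_per_second


# Example: a 500-token answer at the listed speed, roughly 5.5 seconds.
print(f"{estimated_response_seconds(500):.2f} s")
```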
Available providers
(Prices in LS internal pricing units)

| Provider | Input price | Output price |
|---|---|---|
| OpenAI | 2.5M | 10.0M |
| Azure | 2.5M | 10.0M |