GPT-4.1

OpenAI · GPT · Proprietary

Description

GPT-4.1 is OpenAI's flagship model as of its release, improving on GPT-4 Turbo in benchmark performance, speed, and cost-effectiveness.

Release Date

2025-04-14

Parameters

—

Context Length

1.0M tokens

Modalities

image, pdf, text

Capability Radar

Axes: general, coding, reasoning, science (est.), agents, multimodal

Science uses a reasoning proxy when dedicated science benchmarks are unavailable.

Rankings

Domain	#Rank	Score	Source
Code Ranking	177	51.0	AA
General Ranking	206	48.0	AA
Math Reasoning	188	48.0	AA
Multimodal Ranking	58	74.0	LS
Reasoning	67	60.0	LS
Science	227	46.0	AA

Benchmark Scores (LLM Stats)

Category	Benchmark	Score	Tag
Biology	GPQA	66.3%	SR
Code	SWE-Bench Verified	54.6%	SR
Code	Aider-Polyglot Edit	52.9%	SR
Code	Aider-Polyglot	51.6%	SR
Communication	Multi-IF	70.8%	SR
Communication	TAU-bench Retail	68.0%	SR
Communication	TAU-bench Airline	49.4%	SR
Communication	Multi-Challenge	38.3%	SR
Finance	MMLU	90.2%	SR
General	IFEval	87.4%	SR
General	MMMLU	87.3%	SR
General	MMMU	74.8%	SR
General	Internal API instruction following (hard)	49.1%	SR
Language	COLLIE	65.8%	SR
Long Context	ComplexFuncBench	65.5%	SR
Long Context	OpenAI-MRCR: 2 needle 128k	57.2%	SR
Long Context	OpenAI-MRCR: 2 needle 1M	46.3%	SR
Long Context	Graphwalks parents >128k	25.0%	SR
Long Context	Graphwalks BFS >128k	19.0%	SR
Math	MathVista	72.2%	SR
Math	AIME 2024	48.1%	SR
Math	AIME 2025	46.4%	SR
Math	HMMT 2025	28.9%	SR
Math	Humanity's Last Exam	5.4%	SR
Multimodal	CharXiv-D	87.9%	SR
Multimodal	Video-MME (long, no subtitles)	72.0%	SR
Multimodal	CharXiv-R	56.7%	SR
Reasoning	Graphwalks BFS <128k	61.7%	SR
Reasoning	Graphwalks parents <128k	58.0%	SR

AA Evaluation Indices

Metric	Value
Math Index	34.7
Intelligence Index	19.4
MATH-500	0.9
MMLU-Pro	0.8
GPQA	0.7
LCR	0.6
Tau2	0.5
LiveCodeBench	0.5
AIME	0.4
IFBench	0.4
SciCode	0.4
AIME 25	0.3
TerminalBench Hard	0.1
HLE	0.0

LLM Stats Category Scores

Categories: Legal, Finance, Instruction Following, Language, Healthcare, Multimodal, Physics, Structured Output, General, Biology, Chemistry, Writing, Reasoning, Communication, Tool Calling, Vision, Math, Frontend Development, Code, Long Context, Spatial Reasoning

Pricing

Input Price: $2 / 1M tokens

Output Price: $8 / 1M tokens

Blended Price (3:1): $3.5 / 1M tokens

Cache Read Price: $0.5 / 1M tokens
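
The blended figure is consistent with a weighted average of three input tokens per output token. The following is a minimal sketch of that arithmetic, assuming the 3:1 ratio means input:output by token count and that cached tokens are a subset of input tokens billed at the cache read price. The function names are illustrative and do not come from either source.

```python
# Prices taken from the Pricing section, in USD per 1M tokens.
INPUT_PRICE = 2.0
OUTPUT_PRICE = 8.0
CACHE_READ_PRICE = 0.5


def blended_price(input_price: float, output_price: float, ratio: float = 3.0) -> float:
    """Weighted average price per 1M tokens, assuming `ratio` input tokens per output token."""
    return (ratio * input_price + output_price) / (ratio + 1.0)


def request_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimated USD cost of one request.

    Assumption: cached_tokens is the portion of input_tokens billed at the cache read price.
    """
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PRICE
        + cached_tokens * CACHE_READ_PRICE
        + output_tokens * OUTPUT_PRICE
    )
    return total / 1_000_000


print(blended_price(INPUT_PRICE, OUTPUT_PRICE))  # 3.5
print(request_cost(1_000_000, 0))  # 2.0
```

Under these assumptions, a prompt that fills the whole 1.0M-token context costs about $2 before any output is generated.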

Speed

Tokens/sec: 146.3

Time to First Token: 0.59s

Time to Answer: 0.59s
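
These figures allow a rough estimate of end-to-end response time. The sketch below assumes a simple model in which total time is the time to first token plus the output length divided by a constant throughput. Real latency varies with load, prompt length, and provider, and the function name is hypothetical.

```python
# Figures taken from the Speed section.
TOKENS_PER_SEC = 146.3
TIME_TO_FIRST_TOKEN_S = 0.59


def estimated_response_time(output_tokens: int) -> float:
    """Seconds to stream `output_tokens` tokens, assuming constant throughput."""
    if output_tokens <= 0:
        return 0.0
    return TIME_TO_FIRST_TOKEN_S + output_tokens / TOKENS_PER_SEC


print(round(estimated_response_time(500), 2))  # 4.01
```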

Provider Price Ranking

20 providers

Cheapest: OpenAI. Most Expensive: Cortecs.

Rank	Provider	Input / Output
1	OpenAI (Cheapest)	$0.00001
2	Poe	$1.8 / $7.2
3	302.AI	n/a
4	NanoGPT	n/a
5	Abacus	n/a
6	OpenRouter	n/a
7	Kilo Gateway	n/a
8	SAP AI Core	n/a
9	GitHub Copilot	n/a
10	Helicone	n/a
11	Azure Cognitive Services	n/a
12	Requesty	n/a
13	Vercel AI Gateway	n/a
14	LLM Gateway	n/a
15	Azure	n/a
16	FastRouter	n/a
17	NEAR AI Cloud	n/a
18	OrcaRouter	n/a
19	Merge Gateway	n/a
20	Cortecs	$2.354 / $9.417

Compare pricing across different API providers for this model.
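
One way to read the provider figures is as a percentage difference from the $2 input and $8 output prices listed under Pricing. The sketch below covers only the two providers whose input and output prices both appear on this page, and it assumes those prices use the same unit (USD per 1M tokens). The helper name is illustrative.

```python
# Reference prices from the Pricing section, in USD per 1M tokens.
LIST_INPUT, LIST_OUTPUT = 2.0, 8.0

# (input, output) prices for providers with both values shown on this page.
providers = {
    "Poe": (1.8, 7.2),
    "Cortecs": (2.354, 9.417),
}


def percent_diff(price: float, reference: float) -> float:
    """Signed percentage difference of `price` relative to `reference`."""
    return (price - reference) / reference * 100.0


for name, (inp, out) in providers.items():
    print(
        f"{name}: input {percent_diff(inp, LIST_INPUT):+.1f}%, "
        f"output {percent_diff(out, LIST_OUTPUT):+.1f}%"
    )
# Poe: input -10.0%, output -10.0%
# Cortecs: input +17.7%, output +17.7%
```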

External Sources

LLM Stats, Artificial Analysis