Grok 4

xAI · Grok · Proprietary

Description

Grok 4, announced by xAI in summer 2025, is described by the company as "the smartest AI in the world." According to xAI, it is built on version 6 of the company's foundation model and uses 100x more training compute than Grok 2 and 10x more reinforcement learning compute than Grok 3. xAI claims PhD-level performance across all academic disciplines simultaneously, with perfect scores on standardized tests such as the SAT and near-perfect scores on graduate exams such as the GRE. Unlike Grok 3, which relied on generalization for tool use, Grok 4 has tool use built into its training process. Trained on 200,000 GPUs, the model is positioned as strongest at complex reasoning, mathematical problem-solving, and coding. xAI has acknowledged weaknesses in multimodal capabilities and says they are being addressed in the next version.

Release date

2025-07-10

Parameters

—

Context length

—

Modalities

image, text

Capability radar

Axes: general, coding, reasoning, science (estimated), agents, multimodal.

When no dedicated science benchmark is available, the Science score is estimated using a reasoning proxy.

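Benchmark scores (LLM Stats)

Category	Benchmark	Score	Source
Biology	GPQA	87.5%	Self-reported
Code	LiveCodeBench	79.0%	Self-reported
Math	AIME 2025	91.7%	Self-reported
Math	HMMT25	90.0%	Self-reported
Math	Humanity's Last Exam	40.0%	Self-reported
Math	USAMO25	37.5%	Self-reported
Reasoning	ARC-AGI v2	15.9%	Self-reported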

Artificial Analysis (AA) evaluation indices

Metric	Score
Math Index	92.7
Intelligence Index	33.3
Math 500	1.0
AIME	0.9
AIME 25	0.9
GPQA	0.9
MMLU Pro	0.9
LiveCodeBench	0.8
Tau2	0.7
LCR	0.7
IFBench	0.5
SciCode	0.5
TerminalBench Hard	0.4
HLE	0.2

LLM Stats category scores

Categories: Physics, Biology, Chemistry, General, Code, Math, Reasoning, Vision, Spatial Reasoning.

Pricing

Input price: $5.5 / 1M tokens

Output price: $27.5 / 1M tokens

Blended price (3:1): $11 / 1M tokens
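
The blended figure can be reproduced from the input and output prices. The sketch below assumes that "3:1" means three input tokens for every output token, which is consistent with the listed numbers but is not stated on the page. The function names `blended_price` and `request_cost` are illustrative, not part of any provider API.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: int = 3, output_weight: int = 1) -> float:
    """Weighted average price per 1M tokens for an input:output token ratio.

    Assumption: the ratio weights input tokens first (3 input : 1 output).
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in USD of one request, given prices quoted per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


INPUT_PRICE = 5.5    # USD per 1M input tokens, from the listing
OUTPUT_PRICE = 27.5  # USD per 1M output tokens, from the listing

blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
print(f"Blended (3:1): ${blended:g} / 1M tokens")  # prints "Blended (3:1): $11 / 1M tokens"
```

For a single request, `request_cost(3000, 1000, INPUT_PRICE, OUTPUT_PRICE)` gives the cost of 3,000 input tokens and 1,000 output tokens at the listed rates.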

Speed

Tokens per second: 0.0

Time to first token: 0.00s

Time to first answer: 0.00s

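Provider price ranking

6 providers

Cheapest: ZenMux. Most expensive: xAI.

Rank	Provider	Listed price (input, output)
1	ZenMux (cheapest)	$15
2	Poe	$15
3	Helicone	$15
4	Requesty	$15
5	FastRouter	$15
6	xAI (primary)	$5.5, $27.5

Compare prices for this model across different API providers.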

External links

LLM Stats, Artificial Analysis

Domain	Rank	Score	Source
Coding ranking	31	80.0	AA
Overall ranking	88	68.0	AA
Mathematical reasoning	11	96.0	AA
Reasoning	108	16.0	LS
Science	51	71.0	AA