Claude Sonnet 5 (Adaptive Reasoning, Xhigh Effort)

Anthropic / Claude

Description

Claude Sonnet 5 is Anthropic's most agentic Sonnet-class model, an upgrade to Sonnet 4.6 that narrows the gap to Opus 4.8 on reasoning, tool use, coding, computer use, and knowledge work while staying lower priced. It plans, uses tools such as browsers and terminals, and runs autonomously on long-horizon tasks. Reported benchmark results include SWE-Bench Verified (85.2%), SWE-Bench Pro (63.2%), SWE-Bench Multilingual (78.3%), Terminal-Bench 2.1 (80.4%), OSWorld-Verified (81.2%), BrowseComp (84.7% single-agent, 86.6% multi-agent), Humanity's Last Exam with tools (57.4%), USAMO 2026 (79.5%), GDPval-AA v2 (1618 Elo), HealthBench Professional (57.8%), and FrontierCode v1 (38.8%). It supports adaptive thinking with selectable effort levels up to 'extra high' (xhigh) and a 1M-token context window with context compaction. The safety assessment found lower rates of misaligned behavior, hallucination, and sycophancy than Sonnet 4.6, along with improved prompt-injection robustness. The model ships with cyber safeguards enabled by default and uses an updated tokenizer, under which the same input maps to roughly 1.0-1.35x as many tokens as with Sonnet 4.6. It is the default model on the Free and Pro plans and is available to Max, Team, and Enterprise users, in Claude Code, and on the Claude Platform. It launches with introductory pricing of $2/$10 per million input/output tokens through August 31, 2026, after which pricing is $3/$15. It is available via the Claude API as `claude-sonnet-5`.
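The two pricing tiers quoted in the description can be compared with a small calculation. The following is a rough sketch, not an official billing formula: the function name `request_cost_usd` is hypothetical, "through August 31, 2026" is assumed to include that day, and cache pricing and the tokenizer change are ignored.

```python
from datetime import date

# Prices in USD per million tokens, as quoted in the description.
INTRO_PRICES = (2.00, 10.00)     # (input, output), introductory tier
STANDARD_PRICES = (3.00, 15.00)  # (input, output), standard tier
INTRO_END = date(2026, 8, 31)    # assumed to be the last introductory day


def request_cost_usd(input_tokens: int, output_tokens: int, on: date) -> float:
    """Estimate the cost of one request made on the given date."""
    in_price, out_price = INTRO_PRICES if on <= INTRO_END else STANDARD_PRICES
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Same request (200k input tokens, 8k output tokens) under each tier.
print(request_cost_usd(200_000, 8_000, date(2026, 8, 1)))
print(request_cost_usd(200_000, 8_000, date(2026, 9, 1)))
```

Under these assumptions the example request costs about $0.48 at introductory pricing and about $0.72 at standard pricing.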

Release date: 2026-06-30

Parameters: not listed

Context length: 1.0M tokens

Modalities: image, PDF, text

[Capability radar chart, scale up to 100. Axes: general, coding, reasoning, science (estimated), agents, multimodal.]

When no dedicated science benchmark is available, the Science score is estimated using reasoning as a proxy.

Benchmark scores (LLM Stats)

Agents

GDPval-AA: 1618.00 / 3000 (self-reported)
BrowseComp: 84.7% (self-reported)
OSWorld-Verified: 81.2% (self-reported)
Terminal-Bench 2.0: 80.4% (self-reported)
SWE-Bench Pro: 63.2% (self-reported)
OfficeQA Pro: 59.4% (self-reported)
Toolathlon: 54.3% (self-reported)
FrontierCode: 38.8% (self-reported)
SWE-Bench Multimodal: 28.1% (self-reported)
AutomationBench: 13.5% (self-reported)
Legal Agent Benchmark: 5.8% (self-reported)

Code

SWE-Bench Verified: 85.2% (self-reported)
SWE-bench Multilingual: 78.3% (self-reported)
BenchCAD: 37.3% (self-reported)

General

GDP.pdf: 81.6% (self-reported)

Healthcare

HealthBench Professional: 57.8% (self-reported)

Math

USAMO 2026: 33.39 / 42 (self-reported)
ArXivMath: 72.2% (self-reported)
Humanity's Last Exam: 57.4% (self-reported)
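The USAMO 2026 entry is given as points out of 42, while the description quotes it as a percentage. A minimal sketch of the conversion, using a hypothetical helper named `fraction_to_percent`, shows that the two figures agree:

```python
def fraction_to_percent(score: float, maximum: float) -> float:
    """Convert a 'score / maximum' entry to a percentage."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return 100.0 * score / maximum


# USAMO 2026: 33.39 points out of a possible 42.
print(round(fraction_to_percent(33.39, 42), 1))  # 79.5
```

The same conversion is less meaningful for Elo-style entries such as GDPval-AA, where the listed upper bound is a display scale and not a maximum attainable score.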

Multimodal

CharXiv-R: 88.3% (self-reported)
ChartMuseum: 86.7% (self-reported)

AA evaluation index

No AA evaluation data available.

LLM Stats category scores

Finance: 100
Legal: 100
General: 100
Agents: 100
Reasoning: 100

Categories listed without a score: Frontend Development, Multimodal, Code, Tool Calling, Math, Healthcare, Vision

Pricing

Input price: $3 / 1M tokens
Output price: $15 / 1M tokens
Blended price (3:1): $6 / 1M tokens
Cache read price: $0.2 / 1M tokens
Cache write price: $2.5 / 1M tokens
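The blended figure is consistent with a weighted average of the input and output prices at three input tokens per output token. The sketch below assumes that reading of "3:1", which the page does not spell out; `blended_price` is a hypothetical helper name.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens for a given input:output mix."""
    total_parts = input_parts + output_parts
    weighted = input_parts * input_price + output_parts * output_price
    return weighted / total_parts


# Listed prices: $3 input, $15 output, mixed 3:1.
print(blended_price(3.0, 15.0))  # 6.0
```

With the listed $3 and $15 prices this gives (3 x 3 + 1 x 15) / 4 = $6 per 1M tokens, matching the value shown.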

Speed

Tokens per second: 76.3
Time to first token: 9.68 s
Time to first answer: 9.68 s
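These two figures can be combined into a rough end-to-end estimate for a response of a given length. This is a back-of-the-envelope sketch that assumes output streams at a constant rate once the first token arrives and that the listed measurements are representative; `estimated_response_seconds` is a hypothetical name.

```python
TOKENS_PER_SECOND = 76.3      # listed output throughput
FIRST_TOKEN_LATENCY_S = 9.68  # listed time to first token, in seconds


def estimated_response_seconds(output_tokens: int) -> float:
    """Approximate wall-clock time to receive a full response."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return FIRST_TOKEN_LATENCY_S + output_tokens / TOKENS_PER_SECOND


# A 1,000-token answer: latency to first token plus streaming time.
print(f"{estimated_response_seconds(1_000):.1f} s")
```

For a 1,000-token answer the estimate is a little under 23 seconds, most of it split between the initial wait and roughly 13 seconds of streaming. Real latency varies with load, prompt size, and the amount of reasoning performed before the answer.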

Provider price ranking

1 provider

Provider, Input, Output

1. Anthropic (primary)

$15

Compare prices for this model across different API providers.

External links

Artificial Analysis