Claude Sonnet 5 (Adaptive Reasoning, Max Effort)

Anthropic · Claude

Description

Claude Sonnet 5 is Anthropic's most agentic Sonnet-class model. It upgrades Sonnet 4.6 and narrows the gap to Opus 4.8 on reasoning, tool use, coding, computer use, and knowledge work while remaining lower priced. It plans, uses tools such as browsers and terminals, and runs autonomously on long-horizon tasks. Reported benchmark results include SWE-Bench Verified (85.2%), SWE-Bench Pro (63.2%), SWE-Bench Multilingual (78.3%), Terminal-Bench 2.1 (80.4%), OSWorld-Verified (81.2%), BrowseComp (84.7% single-agent, 86.6% multi-agent), Humanity's Last Exam with tools (57.4%), USAMO 2026 (79.5%), GDPval-AA v2 (1618 Elo), HealthBench Professional (57.8%), and FrontierCode v1 (38.8%). It supports adaptive thinking with selectable effort levels up to 'extra high' (xhigh) and a 1M-token context window with context compaction. The safety assessment found lower rates of misaligned behavior, hallucination, and sycophancy than Sonnet 4.6, along with improved prompt-injection robustness. The model ships with cyber safeguards enabled by default and uses an updated tokenizer, under which the same input maps to roughly 1.0-1.35x as many tokens as with Sonnet 4.6. It is the default model on Free and Pro plans and is available to Max, Team, and Enterprise users, in Claude Code, and on the Claude Platform. It launches with introductory pricing of $2/$10 per million input/output tokens through August 31, 2026, after which pricing is $3/$15. It is available via the Claude API as `claude-sonnet-5`.
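The pricing and tokenizer figures in the description can be combined into a rough cost estimate. The following is a minimal sketch, not an official calculator: the function names `estimate_cost` and `input_token_range` and the example workload are hypothetical, and the 1.0-1.35x multiplier is applied only to input tokens because that is all the description states.

```python
# Prices in USD per million tokens, as quoted in the description.
INTRO_PRICES = (2.00, 10.00)     # (input, output), through August 31, 2026
STANDARD_PRICES = (3.00, 15.00)  # (input, output), after the introductory period


def estimate_cost(input_tokens, output_tokens, prices):
    """Return the USD cost of one workload at the given (input, output) prices."""
    input_price, output_price = prices
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def input_token_range(sonnet_46_tokens, low=1.0, high=1.35):
    """Return the (min, max) input tokens expected under the updated tokenizer.

    Assumption: the same input maps to between `low` and `high` times the
    token count measured with the Sonnet 4.6 tokenizer.
    """
    return round(sonnet_46_tokens * low), round(sonnet_46_tokens * high)


# Hypothetical workload: 200,000 input tokens (counted with the Sonnet 4.6
# tokenizer) and 20,000 output tokens.
fewest, most = input_token_range(200_000)
worst_case_intro = estimate_cost(most, 20_000, INTRO_PRICES)
worst_case_standard = estimate_cost(most, 20_000, STANDARD_PRICES)
print(f"input tokens: {fewest} to {most}")
print(f"worst case: ${worst_case_intro:.2f} intro, ${worst_case_standard:.2f} standard")
```

Estimating against the upper end of the token range gives a conservative budget; actual token counts depend on the content being tokenized.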

Release date

2026-06-30

Parameters

—

Context length

1.0M

Modalities

image, pdf, text

Capability radar

Axes: general, coding, reasoning, science (est.), agents, multimodal

Science uses a reasoning proxy when dedicated science benchmarks are not available.

Rankings

Domain	Rank	Score	Source
Coding ranking	7	93.0	AA
Overall ranking	5	89.0	AA
Science	13	86.0	AA

Benchmark scores (LLM Stats)

Agents

GDPval-AA	1618.00 / 3000	Aut.
BrowseComp	84.7%	Aut.
OSWorld-Verified	81.2%	Aut.
Terminal-Bench 2.0	80.4%	Aut.
SWE-Bench Pro	63.2%	Aut.
OfficeQA Pro	59.4%	Aut.
Toolathlon	54.3%	Aut.
FrontierCode	38.8%	Aut.
SWE-Bench Multimodal	28.1%	Aut.
AutomationBench	13.5%	Aut.
Legal Agent Benchmark	5.8%	Aut.

Code

SWE-Bench Verified	85.2%	Aut.
SWE-bench Multilingual	78.3%	Aut.
BenchCAD	37.3%	Aut.

General

GDP.pdf	81.6%	Aut.

Healthcare

HealthBench Professional	57.8%	Aut.

Math

USAMO 2026	33.39 / 42	Aut.
ArXivMath	72.2%	Aut.
Humanity's Last Exam	57.4%	Aut.

Multimodal

CharXiv-R	88.3%	Aut.
ChartMuseum	86.7%	Aut.

AA evaluation indices

Coding Index	71.5
Intelligence Index	53.4
GPQA	0.9
Terminal-Bench 2.1	0.8
LCR	0.7
SciCode	0.5
HLE	0.4
Tau Banking	0.3

LLM Stats category scores

Finance	100
Legal	100
General	100
Agents	100
Reasoning	100

Categories listed without a score: Frontend Development, Multimodal, Code, Tool Calling, Math, Healthcare, Vision

Pricing

Input price	Free
Output price	Free
Blended price (3:1)	Free
Cache read price	$0.2 / 1M tokens
Cache write price	$2.5 / 1M tokens
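A blended (3:1) price is commonly computed as a weighted average of input and output prices, with three parts input to one part output. The sketch below assumes that definition; the helper name `blended_price` is hypothetical, and the inputs are the $2/$10 and $3/$15 per-million-token prices quoted in the description rather than the values shown in this pricing block.

```python
def blended_price(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average price per million tokens.

    Assumption: "3:1" means three input tokens for every output token.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Prices per million tokens quoted in the description.
print(blended_price(2.00, 10.00))  # introductory pricing
print(blended_price(3.00, 15.00))  # pricing after the introductory period
```

Under this assumption the blended figure works out to (3 × 2 + 10) / 4 = $4.00 at introductory pricing and (3 × 3 + 15) / 4 = $6.00 afterwards.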

Speed

Tokens/sec	0.0
First-token latency	0.00s
Time to response	0.00s

Provider price ranking

9 providers

Cheapest: Amazon Bedrock. Most expensive: Cortecs.

#	Provider	Input	Output
1	Amazon Bedrock (cheapest)		$10
2	Vertex (Anthropic)		$10
3	Vertex		$10
4	Poe	$2.6	$13
5	NanoGPT	$2.992	$14.994
6	OpenRouter		$15
7	Kilo Gateway		$15
8	DigitalOcean		$15
9	Cortecs	$3.59	$17.92

Compare prices across different API providers for this model.
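The listed order appears to follow output price. The sketch below checks that reading under one loudly stated assumption: where a provider shows a single price, it is treated as the output price and the input price is recorded as unknown (`None`). The `Offer` type and `rank_by_output` helper are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Optional


class Offer(NamedTuple):
    provider: str
    input_price: Optional[float]  # USD per 1M tokens; None if not listed
    output_price: float           # USD per 1M tokens


# Prices copied from the provider listing. Single listed prices are ASSUMED
# to be output prices.
OFFERS = [
    Offer("Amazon Bedrock", None, 10.0),
    Offer("Vertex (Anthropic)", None, 10.0),
    Offer("Vertex", None, 10.0),
    Offer("Poe", 2.6, 13.0),
    Offer("NanoGPT", 2.992, 14.994),
    Offer("OpenRouter", None, 15.0),
    Offer("Kilo Gateway", None, 15.0),
    Offer("DigitalOcean", None, 15.0),
    Offer("Cortecs", 3.59, 17.92),
]


def rank_by_output(offers):
    """Sort offers by output price; Python's stable sort keeps tied offers in listed order."""
    return sorted(offers, key=lambda offer: offer.output_price)


ranked = rank_by_output(OFFERS)
for position, offer in enumerate(ranked, start=1):
    print(position, offer.provider, offer.output_price)
```

Sorting by output price reproduces the listed ranking, with Amazon Bedrock first and Cortecs last, which is consistent with the assumption but does not prove it.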

External sources

Artificial Analysis