Claude Sonnet 5 (Adaptive Reasoning, Max Effort)

Anthropic / Claude

Description

Claude Sonnet 5 is Anthropic's most agentic Sonnet-class model, an upgrade to Sonnet 4.6 that narrows the gap to Opus 4.8 on reasoning, tool use, coding, computer use, and knowledge work while staying lower priced. It plans, uses tools such as browsers and terminals, and runs autonomously on long-horizon tasks.

Reported benchmark results include SWE-Bench Verified (85.2%), SWE-Bench Pro (63.2%), SWE-Bench Multilingual (78.3%), Terminal-Bench 2.1 (80.4%), OSWorld-Verified (81.2%), BrowseComp (84.7% single-agent, 86.6% multi-agent), Humanity's Last Exam with tools (57.4%), USAMO 2026 (79.5%), GDPval-AA v2 (1618 Elo), HealthBench Professional (57.8%), and FrontierCode v1 (38.8%).

The model supports adaptive thinking with selectable effort levels up to 'extra high' (xhigh) and a 1M-token context window with context compaction. The safety assessment found lower rates of misaligned behavior, hallucination, and sycophancy than Sonnet 4.6, along with improved prompt-injection robustness. It ships with cyber safeguards enabled by default and uses an updated tokenizer, under which the same input maps to roughly 1.0-1.35x as many tokens as with Sonnet 4.6.

Claude Sonnet 5 is the default model on the Free and Pro plans and is available to Max, Team, and Enterprise users, in Claude Code, and on the Claude Platform. It launches with introductory pricing of $2/$10 per million input/output tokens through August 31, 2026, after which pricing is $3/$15. It is available via the Claude API as `claude-sonnet-5`.
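The pricing and tokenizer figures in the description can be turned into a rough per-request cost estimate. The sketch below is illustrative only: the function and constant names are hypothetical, it assumes the introductory price applies through the end of August 31, 2026 inclusive, and it treats the 1.0-1.35x tokenizer range as a simple multiplier on a Sonnet 4.6 input token count.

```python
from datetime import date

# USD per million tokens, copied from the description above.
INTRO_PRICE = {"input": 2.00, "output": 10.00}     # introductory pricing
STANDARD_PRICE = {"input": 3.00, "output": 15.00}  # pricing after the intro period
INTRO_END = date(2026, 8, 31)                      # assumed inclusive

# Tokenizer range from the description: 1.0x to 1.35x the Sonnet 4.6 count.
TOKENIZER_FACTOR_RANGE = (1.0, 1.35)


def price_on(day: date) -> dict:
    """Return the per-million-token price table assumed to apply on `day`."""
    return INTRO_PRICE if day <= INTRO_END else STANDARD_PRICE


def request_cost(input_tokens: int, output_tokens: int, day: date) -> float:
    """Cost in USD of one request, with token counts from the new tokenizer."""
    price = price_on(day)
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def input_token_range(sonnet_46_input_tokens: int) -> tuple:
    """Estimate (low, high) input token counts under the updated tokenizer."""
    low, high = TOKENIZER_FACTOR_RANGE
    return (round(sonnet_46_input_tokens * low),
            round(sonnet_46_input_tokens * high))


if __name__ == "__main__":
    low, high = input_token_range(100_000)
    for day in (date(2026, 7, 15), date(2026, 9, 15)):
        print(day, request_cost(low, 20_000, day), request_cost(high, 20_000, day))
```

Because output is priced at five times input, workloads that generate long responses are affected less by the tokenizer change on the input side than prompt-heavy workloads are.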

Release date: 2026-06-30
Parameters: (not listed)
Context length: 1.0M
Modalities: image, pdf, text

Capability radar

General: 50
Coding: 69
Reasoning: 91
Science (est.): 68
Agents: 70
Multimodal: 70

Science uses a reasoning proxy when dedicated science benchmarks are not available.

Rankings

Domain            Rank   Score   Source
Coding ranking    7      93.0    AA
Overall ranking   5      89.0    AA
Science           13     86.0    AA

Benchmark scores (LLM Stats)

Agents

GDPval-AA: 1618.00 / 3000 (Aut.)
BrowseComp: 84.7% (Aut.)
OSWorld-Verified: 81.2% (Aut.)
Terminal-Bench 2.0: 80.4% (Aut.)
SWE-Bench Pro: 63.2% (Aut.)
OfficeQA Pro: 59.4% (Aut.)
Toolathlon: 54.3% (Aut.)
FrontierCode: 38.8% (Aut.)
SWE-Bench Multimodal: 28.1% (Aut.)
AutomationBench: 13.5% (Aut.)
Legal Agent Benchmark: 5.8% (Aut.)

Code

SWE-Bench Verified: 85.2% (Aut.)
SWE-bench Multilingual: 78.3% (Aut.)
BenchCAD: 37.3% (Aut.)

General

GDP.pdf: 81.6% (Aut.)

Healthcare

HealthBench Professional: 57.8% (Aut.)

Math

USAMO 2026: 33.39 / 42 (Aut.)
ArXivMath: 72.2% (Aut.)
Humanity's Last Exam: 57.4% (Aut.)
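The USAMO 2026 entry is given in points (33.39 out of 42, the maximum for six problems worth seven points each), while the description quotes 79.5%. A minimal sketch of the conversion, using a hypothetical helper name, shows that the two figures agree:

```python
def points_to_percent(points: float, max_points: float) -> float:
    """Convert a raw point score to a percentage of the maximum score."""
    return 100.0 * points / max_points


if __name__ == "__main__":
    # 33.39 of 42 points, as listed for USAMO 2026 above.
    print(round(points_to_percent(33.39, 42), 1))
```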

Multimodal

CharXiv-R: 88.3% (Aut.)
ChartMuseum: 86.7% (Aut.)

AA evaluation indices

Coding Index: 71.5
Intelligence Index: 53.4
GPQA: 0.9
Terminal-Bench 2.1: 0.8
LCR: 0.7
SciCode: 0.5
HLE: 0.4
Tau Banking: 0.3

LLM Stats category scores

Finance: 100
Legal: 100
General: 100
Agents: 100
Reasoning: 100
Frontend Development: 90
Search: 80
Multimodal: 70
Code: 70
Tool Calling: 70
Math: 60
Healthcare: 60
Vision: 60

Pricing

Input price: Free
Output price: Free
Blended price (3:1): Free
Cache read price: $0.2 / 1M tokens
Cache write price: $2.5 / 1M tokens
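A "3:1" blended price is commonly computed as a weighted average of three parts input price to one part output price. The page does not state its exact formula, so the sketch below is an assumption, and the function name is hypothetical. It is applied to the list prices quoted in the description ($2/$10 introductory, $3/$15 standard) rather than to the "Free" values shown in this panel.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens for a given input:output token mix."""
    total_parts = input_parts + output_parts
    weighted = input_parts * input_price + output_parts * output_price
    return weighted / total_parts


if __name__ == "__main__":
    print("introductory:", blended_price(2.0, 10.0))
    print("standard:", blended_price(3.0, 15.0))
```

A different traffic mix can be modeled by changing `input_parts` and `output_parts`; output-heavy workloads pull the blended figure toward the output price.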

Speed

Tokens/sec: 0.0
Time to first token: 0.00s
Time to response: 0.00s

Provider price ranking

9 providers

Cheapest: Amazon Bedrock. Most expensive: Cortecs.

Rank   Provider             Input ($/1M tokens)   Output ($/1M tokens)
1      Amazon Bedrock       $2                    $10                    (cheapest)
2      Vertex (Anthropic)   $2                    $10
3      Vertex               $2                    $10
4      Poe                  $2.6                  $13
5      NanoGPT              $2.992                $14.994
6      OpenRouter           $3                    $15
7      Kilo Gateway         $3                    $15
8      DigitalOcean         $3                    $15
9      Cortecs              $3.59                 $17.92

Compare prices across API providers for this model.
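To make the spread between providers easier to read, the listed prices can be expressed as a percentage premium over the cheapest offer. This is a minimal sketch: the prices are copied from the provider table above, while the data structure and function name are hypothetical.

```python
# (provider, input price, output price) in USD, copied from the provider table.
PROVIDER_PRICES = [
    ("Amazon Bedrock", 2.0, 10.0),
    ("Vertex (Anthropic)", 2.0, 10.0),
    ("Vertex", 2.0, 10.0),
    ("Poe", 2.6, 13.0),
    ("NanoGPT", 2.992, 14.994),
    ("OpenRouter", 3.0, 15.0),
    ("Kilo Gateway", 3.0, 15.0),
    ("DigitalOcean", 3.0, 15.0),
    ("Cortecs", 3.59, 17.92),
]


def premium_over_cheapest(prices):
    """Map provider -> (input premium %, output premium %) versus the cheapest."""
    min_input = min(row[1] for row in prices)
    min_output = min(row[2] for row in prices)
    return {
        name: (round(100 * (inp / min_input - 1), 1),
               round(100 * (out / min_output - 1), 1))
        for name, inp, out in prices
    }


if __name__ == "__main__":
    for name, (p_in, p_out) in premium_over_cheapest(PROVIDER_PRICES).items():
        print(f"{name:20s} input +{p_in:.1f}%  output +{p_out:.1f}%")
```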

External sources