Claude Mythos Preview
Description
Claude Mythos Preview is an unreleased general-purpose frontier model from Anthropic, a new tier above Opus (internal codename 'Capybara'). It identified thousands of zero-day vulnerabilities across every major operating system and web browser as part of Project Glasswing, a cross-industry cybersecurity initiative with 12 partners including AWS, Apple, Microsoft, and Google. State-of-the-art on SWE-bench Verified (93.9%), GPQA Diamond (94.6%), USAMO (97.6%), Terminal-Bench 2.0 (82.0%), CyberGym (83.1%), and Cybench (100% pass@1, saturated). Represents a 4.3x increase over the previous trendline for model performance. Deployed under ASL-3 Standard. Best-aligned Claude model to date per Anthropic's risk report, with the first-ever 24-hour internal alignment review before deployment. Not planned for general availability. Pricing for participants: $25/$125 per million tokens (input/output). 244-page system card.
Radar de capacités
Science utilise un proxy de raisonnement lorsque les benchmarks scientifiques dédiés ne sont pas disponibles.
Classements
| Domaine | #Rang | Score | Source |
|---|---|---|---|
| Agents & Tools | 3 | 79.0 | LS |
| Multimodal Ranking | 3 | 93.0 | LS |
Scores de benchmarks (LLM Stats)
Agents
Biology
Code
General
Healthcare
Long Context
Math
Multimodal
Indices d'évaluation AA
Aucune donnée d'évaluation AA disponible
Scores par catégorie LLM Stats
Tarification
Aucune donnée de prix disponible
Vitesse
Aucune donnée de vitesse disponible
Fournisseurs disponibles
(Unités internes LS)Aucune donnée de fournisseur disponible