Claude Mythos Preview
Описание
Claude Mythos Preview is an unreleased general-purpose frontier model from Anthropic, a new tier above Opus (internal codename 'Capybara'). It identified thousands of zero-day vulnerabilities across every major operating system and web browser as part of Project Glasswing, a cross-industry cybersecurity initiative with 12 partners including AWS, Apple, Microsoft, and Google. State-of-the-art on SWE-bench Verified (93.9%), GPQA Diamond (94.6%), USAMO (97.6%), Terminal-Bench 2.0 (82.0%), CyberGym (83.1%), and Cybench (100% pass@1, saturated). Represents a 4.3x increase over the previous trendline for model performance. Deployed under ASL-3 Standard. Best-aligned Claude model to date per Anthropic's risk report, with the first-ever 24-hour internal alignment review before deployment. Not planned for general availability. Pricing for participants: $25/$125 per million tokens (input/output). 244-page system card.
Радар способностей
Science использует прокси на основе рассуждений, когда специализированные научные бенчмарки недоступны.
Рейтинги
| Домен | #Место | Оценка | Источник |
|---|---|---|---|
| Agents & Tools | 3 | 79.0 | LS |
| Multimodal Ranking | 3 | 93.0 | LS |
Оценки бенчмарков (LLM Stats)
Agents
Biology
Code
General
Healthcare
Long Context
Math
Multimodal
Индексы оценки AA
Нет данных AA оценки
Оценки категорий LLM Stats
Цены
Нет данных о ценах
Скорость
Нет данных о скорости
Доступные провайдеры
(Внутренние единицы LS)Нет данных провайдеров