Kimi K2 Thinking
Description
Kimi K2 Thinking is the latest, most capable version of the open-source thinking model. Building on Kimi K2, it is designed as a thinking agent that reasons step by step while dynamically invoking tools. It sets a new state of the art on Humanity's Last Exam (HLE), BrowseComp, and other benchmarks by substantially scaling multi-step reasoning depth and keeping tool use stable across 200–300 sequential calls. K2 Thinking is also a natively INT4-quantized model with a 256k context window, achieving lossless reductions in inference latency and GPU memory usage. Key features: deep thinking and tool orchestration, trained end to end to interleave chain-of-thought reasoning with function calls; native INT4 quantization via Quantization-Aware Training (QAT), giving a lossless 2x speed-up; and stable long-horizon agency, with coherent goal-directed behavior maintained across up to 200–300 consecutive tool invocations.
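To make the idea of interleaving reasoning with function calls concrete, here is a minimal sketch of an agent loop with a tool-call budget. It does not use the Kimi API: `stub_model`, the `add_one` tool, and `run_agent` are hypothetical stand-ins, and the default budget of 300 simply mirrors the upper end of the range quoted in the description.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical tool registry; the tool name and behavior are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add_one": lambda arg: str(int(arg) + 1),
}


def stub_model(observation: str) -> dict:
    """Stand-in for a thinking model: emit a thought, then a tool call or an answer."""
    value = int(observation)
    if value < 5:
        return {
            "thought": f"{value} is below 5, so increment it",
            "tool": "add_one",
            "arg": observation,
        }
    return {"thought": "target reached, stop", "answer": observation}


def run_agent(start: str, max_tool_calls: int = 300) -> dict:
    """Alternate reasoning steps and tool calls until an answer or the budget is hit."""
    trace: List[str] = []
    observation = start
    calls = 0
    answer: Optional[str] = None
    while calls < max_tool_calls:
        step = stub_model(observation)
        trace.append(step["thought"])  # reasoning is recorded between tool calls
        if "answer" in step:
            answer = step["answer"]
            break
        # Execute the requested tool and feed its result back as the next observation.
        observation = TOOLS[step["tool"]](step["arg"])
        calls += 1
    return {"answer": answer, "tool_calls": calls, "trace": trace}


result = run_agent("0")
print(result["answer"], result["tool_calls"])
```

The budget check is what keeps a long-horizon run bounded: if the model never produces an answer, the loop stops after `max_tool_calls` tool invocations and returns `None` as the answer.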
Capability radar
Science uses reasoning-based proxies when specialized science benchmarks are unavailable.
Rankings
| Domain | Rank | Score | Source |
|---|---|---|---|
| Agents & Tools | 63 | 54.0 | LS |
| Code Ranking | 60 | 70.0 | AA |
| General Ranking | 47 | 79.0 | AA |
| Math Reasoning | 12 | 96.0 | AA |
| Reasoning | 56 | 66.0 | LS |
| Science | 56 | 70.0 | AA |
Benchmark scores (LLM Stats)
Categories: Agents, Biology, Code, Communication, Economics, Finance, General, Healthcare, Math, Reasoning
AA evaluation indices
LLM Stats category scores
Pricing
Speed
Available providers
(Internal LS units) No provider data