Kimi K2 Thinking
Description
Kimi K2 Thinking is the latest and most capable open-source thinking model in the Kimi series. Building on Kimi K2, it is designed as a thinking agent that reasons step by step while dynamically invoking tools. It sets a new state of the art on Humanity's Last Exam (HLE), BrowseComp, and other benchmarks by substantially scaling multi-step reasoning depth and keeping tool use stable across 200–300 sequential calls. K2 Thinking is also a native INT4-quantized model with a 256K-token context window, achieving lossless reductions in inference latency and GPU memory usage. Key features: deep thinking and tool orchestration, trained end to end to interleave chain-of-thought reasoning with function calls; native INT4 quantization via Quantization-Aware Training (QAT), giving a lossless 2x speed-up; and stable long-horizon agency, with coherent goal-directed behavior maintained across up to 200–300 consecutive tool invocations.
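To give a rough sense of what storing weights in INT4 means, here is a minimal sketch of symmetric round-to-nearest quantization to signed 4-bit integers, with two values packed per byte. This is an illustration under stated assumptions, not the model's actual method: it is plain post-hoc rounding rather than Quantization-Aware Training, it uses a single per-tensor scale, and the function names (`quantize_int4`, `dequantize_int4`, `pack_int4`) are hypothetical.

```python
from typing import List, Tuple

# Signed 4-bit integers cover the range [-8, 7].
INT4_MIN, INT4_MAX = -8, 7


def quantize_int4(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor round-to-nearest quantization (illustrative, not QAT)."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Map the largest magnitude onto INT4_MAX; fall back to 1.0 for all-zero input.
    scale = max_abs / INT4_MAX if max_abs > 0 else 1.0
    q = [min(INT4_MAX, max(INT4_MIN, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int4(q: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from 4-bit codes and the shared scale."""
    return [v * scale for v in q]


def pack_int4(q: List[int]) -> bytes:
    """Pack two signed 4-bit values per byte, low nibble first."""
    nibbles = [v & 0xF for v in q]  # two's-complement nibble
    if len(nibbles) % 2:
        nibbles.append(0)  # pad odd-length input with a zero nibble
    return bytes(
        nibbles[i] | (nibbles[i + 1] << 4) for i in range(0, len(nibbles), 2)
    )


if __name__ == "__main__":
    weights = [7.0, -3.0, 1.0, 0.0]
    q, scale = quantize_int4(weights)
    packed = pack_int4(q)
    print(q, scale, len(packed))
```

Each weight occupies 4 bits after packing, compared with 16 bits for a BF16 weight, which is where the memory saving in weight storage comes from. The rounding error per weight in this sketch is at most half the scale; training with quantization in the loop, as the description says K2 Thinking does, is what lets a model tolerate that error.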
Capability Radar
Science uses a reasoning proxy when dedicated science benchmarks are not available.
Rankings
| Domain | Rank | Score | Source |
|---|---|---|---|
| Agents & Tools | 63 | 54.0 | LS |
| Code Ranking | 60 | 70.0 | AA |
| General Ranking | 47 | 79.0 | AA |
| Math Reasoning | 12 | 96.0 | AA |
| Reasoning | 56 | 66.0 | LS |
| Science | 56 | 70.0 | AA |
Benchmark Scores (LLM Stats)
Categories: Agents, Biology, Code, Communication, Economics, Finance, General, Healthcare, Math, Reasoning
AA Evaluation Indices
LLM Stats Category Scores
Pricing
Speed
Available Providers
(LS internal units) No provider data available