Devstral 2
Mistral
| Field | Value |
|---|---|
| Release Date | 2025-12-09 |
| Parameters | not listed |
| Context Length | 262K |
| Modalities | text |
Capability Radar
| Capability | Score |
|---|---|
| General | 32 |
| Coding | 42 |
| Reasoning | 40 |
| Science (est.) | 39 |
| Agents | 41 |
| Multimodal | 0 |
Science uses a reasoning proxy when dedicated science benchmarks are unavailable.
Rankings
| Domain | Rank | Score | Source |
|---|---|---|---|
| Code Ranking | 242 | 40.0 | AA |
| General Ranking | 268 | 39.0 | AA |
| Math Reasoning | 233 | 37.0 | AA |
| Science | 282 | 40.0 | AA |
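Benchmark Scores (LLM Stats)
| Category | Benchmark | Score |
|---|---|---|
| Biology | GPQA | 71.2% (SR) |
| Code | LiveCodeBench | 63.6% (SR) |
| Creativity | Arena Hard | 58.3% (SR) |
| Finance | MMLU-Pro | 78.0% (SR) |
| General | MMMU-Pro | 60.0% (SR) |
| General | IFBench | 48.0% (SR) |
| Language | COLLIE | 62.9% (SR) |
| Long Context | AA-LCR | 71.2% (SR) |
| Math | AIME 2025 | 83.8% (SR) |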
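AA Evaluation Indices
| Metric | Value |
|---|---|
| Math Index | 36.7 |
| Intelligence Index | 15.5 |
| MMLU-Pro | 0.8 |
| GPQA | 0.6 |
| LiveCodeBench | 0.4 |
| IFBench | 0.4 |
| AIME 25 | 0.4 |
| SciCode | 0.3 |
| LCR | 0.3 |
| Tau2 | 0.2 |
| TerminalBench Hard | 0.2 |
| HLE | 0.0 |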
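LLM Stats Category Scores
| Category | Score |
|---|---|
| Legal | 80 |
| Math | 80 |
| Finance | 80 |
| Healthcare | 80 |
| Language | 70 |
| Long Context | 70 |
| Physics | 70 |
| Reasoning | 70 |
| Biology | 70 |
| Chemistry | 70 |
| Multimodal | 60 |
| General | 60 |
| Code | 60 |
| Creativity | 60 |
| Vision | 60 |
| Writing | 60 |
| Instruction Following | 50 |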
Pricing
| Item | Price |
|---|---|
| Input Price | Free |
| Output Price | Free |
| Blended Price (3:1) | Free |
Speed
| Metric | Value |
|---|---|
| Tokens/sec | 70.3 |
| Time to First Token | 0.71s |
| Time to Answer | 0.71s |
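As a rough illustration of how the speed figures combine, the sketch below estimates end-to-end response time as time to first token plus output length divided by throughput. The function name `estimate_response_time` and the constant-throughput model are assumptions made for illustration; the source does not define how these metrics relate.

```python
# Speed figures as listed on this page.
TOKENS_PER_SEC = 70.3        # Tokens/sec
TIME_TO_FIRST_TOKEN = 0.71   # seconds


def estimate_response_time(output_tokens: int,
                           ttft: float = TIME_TO_FIRST_TOKEN,
                           tokens_per_sec: float = TOKENS_PER_SEC) -> float:
    """Estimate seconds to receive `output_tokens` tokens.

    Assumes (hypothetically) that generation proceeds at a constant
    rate once the first token has arrived.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft + output_tokens / tokens_per_sec


if __name__ == "__main__":
    for n in (100, 500, 2000):
        print(f"{n} tokens: about {estimate_response_time(n):.1f} s")
```

Real latency varies with provider, load, and prompt length, so this is only a back-of-the-envelope estimate.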
Provider Price Ranking
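9 providers
Cheapest: Scaleway. Most Expensive: Merge Gateway.
| Rank | Provider | Input | Output |
|---|---|---|---|
| 1 | Scaleway (Cheapest) | $0.4 | $2 |
| 2 | NanoGPT | $0.4 | $1.4 |
| 3 | OpenRouter | $0.4 | $2 |
| 4 | Kilo Gateway | $0.4 | $2 |
| 5 | Amazon Bedrock | $0.4 | $2 |
| 6 | Mistral | $0.4 | $2 |
| 7 | Vercel AI Gateway | $0.4 | $2 |
| 8 | LLM Gateway | $0.4 | $2 |
| 9 | Merge Gateway | $0.4 | $2 |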
Compare pricing across different API providers for this model.
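The "Blended Price (3:1)" entry under Pricing presumably weights input and output prices three parts to one. The page does not give the formula, so the weighted average below is an assumption, and `blended_price` is a hypothetical helper name. The sketch applies it to two of the provider prices listed above.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0,
                  output_weight: float = 1.0) -> float:
    """Weighted average of input and output prices.

    Assumption: "3:1" means three parts input to one part output.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price
            + output_weight * output_price) / total_weight


# (input, output) prices copied from the provider ranking above.
providers = {
    "Scaleway": (0.4, 2.0),
    "NanoGPT": (0.4, 1.4),
}

if __name__ == "__main__":
    for name, (inp, out) in providers.items():
        print(f"{name}: blended ${blended_price(inp, out):.2f}")
```

Under this assumed weighting, a lower output price reduces the blended figure even when the input price is unchanged, which is worth keeping in mind when comparing providers whose input prices are identical.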