Mercury 2
Inception · Proprietary
Description
Mercury 2 is billed as the fastest reasoning LLM and is built on a diffusion-based language model (dLLM) architecture. Instead of generating text token by token, it refines multiple text blocks in parallel, reaching over 1,000 tokens per second on Nvidia Blackwell GPUs, a claimed 5x speedup over leading speed-optimized LLMs. It supports tool use and JSON output and has a 128K context window.
Release Date
2026-02-20
Parameters
—
Context Length
128K
Modalities
text
Capability Radar
| Capability | Score |
|---|---|
| General | 23 |
| Coding | 39 |
| Reasoning | 77 |
| Science (est.) | 51 |
| Agents | 50 |
| Multimodal | 0 |
Science uses a reasoning proxy when dedicated science benchmarks are unavailable.
Rankings
| Domain | Rank | Score | Source |
|---|---|---|---|
| Code Ranking | 220 | 45.0 | AA |
| General Ranking | 132 | 59.0 | AA |
| Science | 124 | 57.0 | AA |
Benchmark Scores (LLM Stats)
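| Category | Benchmark | Score |
|---|---|---|
| Biology | GPQA | 74.0% SR |
| Biology | SciCode | 38.0% SR |
| Code | LiveCodeBench | 67.0% SR |
| Communication | Tau2 Airline | 53.0% SR |
| General | IFBench | 71.0% SR |
| Math | AIME 2025 | 91.1% SR |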
AA Evaluation Indices
| Index | Score |
|---|---|
| Intelligence Index | 25.3 |
| GPQA | 0.8 |
| Tau2 | 0.7 |
| IFBench | 0.7 |
| SciCode | 0.4 |
| LCR | 0.4 |
| TerminalBench Hard | 0.3 |
| HLE | 0.2 |
LLM Stats Category Scores
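| Category | Score |
|---|---|
| Instruction Following | 70 |
| General | 70 |
| Math | 60 |
| Physics | 60 |
| Reasoning | 60 |
| Biology | 60 |
| Chemistry | 60 |
| Code | 50 |
| Communication | 50 |
| Tool Calling | 50 |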
Pricing
| Item | Price |
|---|---|
| Input Price | $0.25 / 1M tokens |
| Output Price | $0.75 / 1M tokens |
| Blended Price (3:1) | $0.375 / 1M tokens |
| Cache Read Price | $0.025 / 1M tokens |
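
The blended figure is consistent with a weighted average of the input and output prices. The sketch below reproduces it, assuming "3:1" means three input tokens for every output token; the function name `blended_price` is hypothetical and used only for illustration.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted-average price per 1M tokens.

    Assumption: the "3:1" ratio weights input tokens 3x relative to output tokens.
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


# Listed prices in USD per 1M tokens.
price = blended_price(0.25, 0.75)
print(f"${price:.3f} / 1M tokens")  # prints "$0.375 / 1M tokens"
```

Under that assumption, (3 × $0.25 + 1 × $0.75) / 4 matches the listed $0.375.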
Speed
| Metric | Value |
|---|---|
| Tokens/sec | 1239.8 |
| Time to First Token | 3.43s |
| Time to Answer | 3.43s |
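
As a rough way to read these numbers together, the sketch below estimates end-to-end response time as time to first token plus output length divided by throughput. This is a simplifying assumption (constant throughput, no network variance), not a formula stated by the listing, and `estimate_latency` is a hypothetical helper.

```python
TOKENS_PER_SEC = 1239.8      # listed throughput
TIME_TO_FIRST_TOKEN = 3.43   # listed, in seconds


def estimate_latency(output_tokens: int,
                     ttft: float = TIME_TO_FIRST_TOKEN,
                     tps: float = TOKENS_PER_SEC) -> float:
    """Estimated seconds to receive `output_tokens`, assuming constant throughput."""
    return ttft + output_tokens / tps


for n in (100, 1_000, 10_000):
    print(f"{n:>6} tokens: ~{estimate_latency(n):.2f}s")
```

With these figures the time to first token dominates short responses: 1,000 output tokens add under a second to the initial 3.43s wait.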
Provider Price Ranking
6 providers
Cheapest: Inception. Most expensive: Venice AI.

| Rank | Provider | Input | Output |
|---|---|---|---|
| 1 | Inception (cheapest) | $0 | $0 |
| 2 | NanoGPT | $0.25 | $0.75 |
| 3 | OpenRouter | $0.25 | $0.75 |
| 4 | Kilo Gateway | $0.25 | $0.75 |
| 5 | Vercel AI Gateway | $0.25 | $0.75 |
| 6 | Venice AI | $0.3125 | $0.9375 |
Compare pricing across different API providers for this model.
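
To make the comparison concrete, the sketch below ranks the listed providers by the cost of a sample workload. It assumes the provider prices are in USD per 1M tokens, like the Pricing section; the workload size and the `workload_cost` helper are hypothetical.

```python
# (input price, output price) as listed; assumed to be USD per 1M tokens.
PROVIDERS = {
    "Inception": (0.0, 0.0),
    "NanoGPT": (0.25, 0.75),
    "OpenRouter": (0.25, 0.75),
    "Kilo Gateway": (0.25, 0.75),
    "Vercel AI Gateway": (0.25, 0.75),
    "Venice AI": (0.3125, 0.9375),
}


def workload_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a workload at the provider's listed prices."""
    input_price, output_price = PROVIDERS[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 3M input tokens and 1M output tokens.
ranked = sorted(PROVIDERS, key=lambda p: workload_cost(p, 3_000_000, 1_000_000))
for name in ranked:
    print(f"{name:<18} ${workload_cost(name, 3_000_000, 1_000_000):.4f}")
```

On the listed figures, Venice AI's input and output prices are both 1.25x the $0.25 / $0.75 rate shared by the four mid-table providers.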