Mercury 2
Inception · Proprietary
Description
Mercury 2 is a fast reasoning LLM built on a diffusion-based language model (dLLM) architecture. Instead of generating text token by token, it refines multiple blocks of text simultaneously, reaching over 1,000 tokens per second on NVIDIA Blackwell GPUs, roughly 5x faster than leading speed-optimized LLMs. It supports tool use and JSON output with a 128K context window.
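To make the speed claim concrete, here is a toy illustration (not Inception's actual algorithm, and the block size and pass count are made-up parameters) of why block-parallel diffusion decoding needs far fewer sequential steps than token-by-token generation:

```python
# Toy model of decoding cost. Autoregressive decoding takes one sequential
# step per token; a diffusion-style decoder refines a whole block of tokens
# per step, so sequential steps scale with the number of blocks.

def autoregressive_steps(num_tokens: int) -> int:
    """One sequential step per generated token."""
    return num_tokens

def block_diffusion_steps(num_tokens: int, block_size: int, refinement_passes: int) -> int:
    """Each block of `block_size` tokens is produced in `refinement_passes`
    parallel refinement steps (both values are illustrative assumptions)."""
    num_blocks = -(-num_tokens // block_size)  # ceiling division
    return num_blocks * refinement_passes

print(autoregressive_steps(1024))                                          # 1024 steps
print(block_diffusion_steps(1024, block_size=64, refinement_passes=8))     # 128 steps
```

Under these assumed parameters the diffusion decoder issues 8x fewer sequential steps for the same output length, which is the mechanism behind the throughput figures quoted above.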
Release Date
2026-02-20
Parameters
—
Context Length
128K
Modalities
text
Capability Radar
| Axis | Score |
|---|---|
| General | 29 |
| Coding | 32 |
| Reasoning | 77 |
| Science (est.) | 51 |
| Agents | 50 |
| Multimodal | 0 |
Science uses a reasoning proxy when dedicated science benchmarks are unavailable.
Rankings
| Domain | Rank | Score | Source |
|---|---|---|---|
| Code | 155 | 49.0 | AA |
| General | 115 | 64.0 | AA |
| Science | 106 | 60.0 | AA |
Benchmark Scores (LLM Stats)
| Domain | Benchmark | Score | Source |
|---|---|---|---|
| Biology | GPQA | 74.0% | SR |
| Biology | SciCode | 38.0% | SR |
| Code | LiveCodeBench | 67.0% | SR |
| Communication | Tau2 Airline | 53.0% | SR |
| General | IFBench | 71.0% | SR |
| Math | AIME 2025 | 91.1% | SR |
AA Evaluation Indices
| Index | Value |
|---|---|
| Intelligence Index | 32.8 |
| Coding Index | 30.6 |
| GPQA | 0.8 |
| Tau2 | 0.7 |
| IFBench | 0.7 |
| SciCode | 0.4 |
| LCR | 0.4 |
| Terminal-Bench Hard | 0.3 |
| HLE | 0.2 |
LLM Stats Category Scores
| Category | Score |
|---|---|
| General | 70 |
| Instruction Following | 70 |
| Biology | 60 |
| Chemistry | 60 |
| Math | 60 |
| Physics | 60 |
| Reasoning | 60 |
| Tool Calling | 50 |
| Code | 50 |
| Communication | 50 |
Pricing
| Metric | Price |
|---|---|
| Input | $0.25 / 1M tokens |
| Output | $0.75 / 1M tokens |
| Blended (3:1 input:output) | $0.375 / 1M tokens |
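The blended price is a weighted average of the input and output prices, assuming three input tokens for every output token:

```python
# Deriving the 3:1 blended price from the per-token prices above.
input_price = 0.25   # USD per 1M input tokens
output_price = 0.75  # USD per 1M output tokens

blended = (3 * input_price + 1 * output_price) / 4
print(blended)  # 0.375, matching the blended figure above
```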
Speed
| Metric | Value |
|---|---|
| Throughput | 881.5 tokens/s |
| Time to First Token | 3.71 s |
| Time to Answer | 3.71 s |
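The speed figures combine into a back-of-envelope latency estimate for a response of a given length (a rough model that ignores batching and streaming effects):

```python
# Estimated end-to-end latency: time-to-first-token plus generation time
# at the measured throughput.
ttft = 3.71          # seconds, from the table above
throughput = 881.5   # tokens per second, from the table above

def estimated_latency(output_tokens: int) -> float:
    """Rough total seconds to produce `output_tokens` tokens."""
    return ttft + output_tokens / throughput

print(round(estimated_latency(1000), 2))  # ~4.84 s for a 1,000-token answer
```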
Available Providers
| Provider | Input Price | Output Price |
|---|---|---|
| Inception | $0.25 / 1M tokens | $0.75 / 1M tokens |