Mercury 2

Inception | Proprietary

Description

Mercury 2 is the fastest reasoning LLM, built on a diffusion-based language model (dLLM) architecture. Instead of generating text token by token, it refines multiple text blocks simultaneously, achieving over 1,000 tokens per second on Nvidia Blackwell GPUs, 5x faster than leading speed-optimized LLMs. It supports tool use and JSON output, with a 128K context window.

Release Date

2026-02-20

Parameters

—

Context Length

128K

Modalities

text

Capability Radar

Axes: general, coding, reasoning, science (est.), agents, multimodal

Science uses a reasoning proxy when dedicated science benchmarks are unavailable.

Rankings

Domain	Rank	Score	Source
Code Ranking	155	49.0	AA
General Ranking	115	64.0	AA
Science	106	60.0	AA

Benchmark Scores (LLM Stats)

Category	Benchmark	Score
Biology	GPQA	74.0% (SR)
Biology	SciCode	38.0% (SR)
Code	LiveCodeBench	67.0% (SR)
Communication	Tau2 Airline	53.0% (SR)
General	IFBench	71.0% (SR)
Math	AIME 2025	91.1% (SR)

AA Evaluation Indices

Index	Value
Intelligence Index	32.8
Coding Index	30.6
GPQA	0.8
Tau2	0.7
IFBench	0.7
SciCode	0.4
LCR	0.4
TerminalBench Hard	0.3
HLE	0.2

LLM Stats Category Scores

Categories: General, Instruction Following, Biology, Chemistry, Math, Physics, Reasoning, Tool Calling, Code, Communication

Pricing

Input Price: $0.25 / 1M tokens

Output Price: $0.75 / 1M tokens

Blended Price (3:1): $0.375 / 1M tokens
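
The blended figure can be reproduced as a weighted average of the input and output prices. The sketch below assumes that "3:1" means three input tokens for every output token, which is an interpretation and not something the listing states; the function names `blended_price` and `request_cost` are hypothetical helpers for illustration.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    Assumption: the 3:1 ratio is input tokens to output tokens.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float = 0.25, output_price: float = 0.75) -> float:
    """Cost in USD of one request at the listed per-1M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # (3 * 0.25 + 1 * 0.75) / 4 matches the listed blended price.
    print(blended_price(0.25, 0.75))
    # Example request: 100K input tokens and 2K output tokens.
    print(request_cost(100_000, 2_000))
```

Under this reading, a request with 100,000 input tokens and 2,000 output tokens would cost roughly $0.0265.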

Speed

Tokens/sec: 881.5 tokens/s

Time to First Token: 3.71s

Time to Answer: 3.71s
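
As a rough way to read these numbers together, the sketch below estimates end-to-end response time as time to first token plus output length divided by throughput. This simple additive model is an assumption, not the methodology behind the listed figures, and `estimated_latency` is a hypothetical helper.

```python
def estimated_latency(output_tokens: float,
                      ttft_s: float = 3.71,
                      tokens_per_s: float = 881.5) -> float:
    """Estimate seconds until a response of `output_tokens` tokens is complete.

    Assumption: total time = time to first token + tokens / throughput.
    Defaults are the figures listed in the Speed section.
    """
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return ttft_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(n, round(estimated_latency(n), 2))
```

With the listed figures, the fixed 3.71s start-up delay dominates short responses, while throughput only becomes the larger term for outputs of several thousand tokens.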

Available Providers

(LS internal units)

Provider	Input Price	Output Price
Inception	250K	750K
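
The internal units are not defined on the page. One reading that agrees with the Pricing section is that they are millionths of a dollar per 1M tokens, so that 250K corresponds to $0.25 and 750K to $0.75. The sketch below converts under that assumption only; `internal_units_to_usd` is a hypothetical helper.

```python
def internal_units_to_usd(value: str) -> float:
    """Convert a figure such as '250K' to USD per 1M tokens.

    Assumption: one internal unit is one millionth of a dollar per 1M tokens.
    """
    suffixes = {"K": 1_000, "M": 1_000_000}
    text = value.strip().upper()
    if text and text[-1] in suffixes:
        units = float(text[:-1]) * suffixes[text[-1]]
    else:
        units = float(text)
    return units / 1_000_000


if __name__ == "__main__":
    print(internal_units_to_usd("250K"))  # input price
    print(internal_units_to_usd("750K"))  # output price
```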

External Sources

LLM Stats