LongCat Flash Lite

LongCat · Open Weight · MIT · Commercial OK

Description

LongCat-Flash-Lite is a lightweight Mixture-of-Experts (MoE) model from Meituan with 68.5B total parameters, of which only 2.9B to 4.5B are activated per token. It explores N-gram embedding expansion as a new scaling direction and supports a 256K context length via YaRN. The model is optimized for agent tooling and programming tasks, reaching an inference speed of 500 to 700 tokens per second while maintaining strong performance on coding, math, and agentic benchmarks.
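As a rough illustration of how sparse the activation is, the parameter counts quoted above can be turned into an active fraction per token. This is only arithmetic on the figures in the description; the function and variable names are made up for this sketch and are not part of any LongCat tooling.

```python
# Figures taken from the description above (in billions of parameters).
TOTAL_PARAMS_B = 68.5
ACTIVE_MIN_B = 2.9
ACTIVE_MAX_B = 4.5


def active_fraction(active_b: float, total_b: float = TOTAL_PARAMS_B) -> float:
    """Return the share of total parameters used for a single token."""
    return active_b / total_b


low = active_fraction(ACTIVE_MIN_B)
high = active_fraction(ACTIVE_MAX_B)

# Roughly 4% to 7% of the weights participate in any one token.
print(f"Active per token: {low:.1%} to {high:.1%} of total parameters")
```

Under these numbers, each token touches only a small slice of the full model, which is consistent with the throughput claim in the description.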

Release Date

2026-01-28

Parameters

68.5B

Context Length

256K (via YaRN)

Modalities

text

Capability Radar

Axes: general, coding, reasoning, science (est.), agents, multimodal

Science uses a reasoning proxy when dedicated science benchmarks are unavailable.

Rankings

Domain	#Rank	Score	Source
Agentic Capability	114	34.0	LS
Code Ranking	321	26.0	AA
General Ranking	231	45.0	AA
Science	283	40.0	AA

Benchmark Scores (LLM Stats)

Category	Benchmark	Score
Agents	Terminal-Bench	33.8% (SR)
Biology	GPQA	66.8% (SR)
Code	SWE-Bench Verified	54.4% (SR)
Code	SWE-bench Multilingual	38.1% (SR)
Communication	Tau2 Retail	73.1% (SR)
Communication	Tau2 Telecom	72.8% (SR)
Communication	Tau2 Airline	58.0% (SR)
Finance	MMLU	85.5% (SR)
Finance	MMLU-Pro	78.3% (SR)
General	CMMLU	82.5% (SR)
Math	MATH-500	96.8% (SR)
Math	AIME 2024	72.2% (SR)
Math	AIME 2025	63.2% (SR)

AA Evaluation Indices

Index	Score
Intelligence Index	17.2
Tau2	0.8
GPQA	0.6
IFBench	0.4
SciCode	0.3
LCR	0.3
Terminal-Bench Hard	0.1
HLE	0.1

LLM Stats Category Scores

Categories: Language, Legal, Math, Finance, General, Healthcare, Physics, Reasoning, Biology, Chemistry, Communication, Tool Calling, Frontend Development, Code, Agents

Pricing

Input Price: Free

Output Price: Free

Blended Price (3:1): Free
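A blended price at a 3:1 ratio is commonly read as a weighted average of three parts input price to one part output price. The sketch below assumes that reading; the function name and the non-zero example prices are hypothetical and are not taken from this page, where both prices are listed as free.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: int = 3, output_weight: int = 1) -> float:
    """Weighted average price per unit of tokens (assumed 3:1 input:output)."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Both prices are free for this model, so the blend is also zero.
free_blend = blended_price(0.0, 0.0)

# Hypothetical prices, for illustration only: (3 * 1.00 + 1 * 3.00) / 4
example_blend = blended_price(1.00, 3.00)

print(free_blend, example_blend)
```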

Speed

Tokens/sec: 0.0

Time to First Token: 0.00s

Time to Answer: 0.00s

Provider Price Ranking

1 provider

#	Provider	Input	Output
1	Meituan


External Sources

LLM Stats, Artificial Analysis