LongCat-Flash-Chat

Meituan · Open Weight · MIT · Commercial Use

Description

LongCat-Flash-Chat is Meituan's first open-source foundation model: a 560B-parameter Mixture-of-Experts (MoE) model that dynamically activates between 18.6B and 31.3B parameters (about 27B on average) depending on contextual demands. It uses Zero-Computation Experts for efficient routing and supports a 128K context. Optimized for conversational and agentic tasks, it performs competitively on reasoning, coding, instruction-following, and domain benchmarks, with particular strengths in tool use and complex multi-step interactions. It achieves over 100 tokens per second on H800 GPUs.
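The variable activation count described above can be illustrated with a toy router. This is a minimal sketch, not Meituan's implementation: it assumes a zero-computation expert simply returns its input unchanged, and that routing is plain top-k over positive scores. The function name `moe_layer`, the scalar "experts", and all score values are hypothetical and chosen only for illustration.

```python
def moe_layer(x, scores, experts, top_k):
    """Toy MoE layer with zero-computation experts (illustrative assumption).

    scores  : one positive router score per expert slot; slots with index
              >= len(experts) stand for zero-computation experts.
    experts : list of callables standing in for real feed-forward experts.
    Returns the combined output and how many real experts actually ran.
    """
    num_real = len(experts)
    # Pick the top_k slots by router score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = ranked[:top_k]
    total = sum(scores[i] for i in chosen)

    out = 0.0
    real_used = 0
    for i in chosen:
        weight = scores[i] / total  # renormalise over the chosen slots
        if i < num_real:
            out += weight * experts[i](x)  # real expert: costs compute
            real_used += 1
        else:
            out += weight * x  # zero-computation slot: pass input through
    return out, real_used


# Three hypothetical real experts plus two zero-computation slots (indices 3, 4).
toy_experts = [lambda v: 2 * v, lambda v: v + 1, lambda v: -v]
toy_tokens = {
    "hard":   [0.50, 0.30, 0.10, 0.05, 0.05],
    "easy":   [0.10, 0.05, 0.05, 0.50, 0.30],
    "medium": [0.40, 0.10, 0.05, 0.35, 0.10],
}
for name, token_scores in toy_tokens.items():
    _, used = moe_layer(1.0, token_scores, toy_experts, top_k=2)
    print(f"{name}: real experts run = {used}")
```

Because every token selects the same number of slots but some slots cost nothing, the amount of compute (and so the count of activated parameters) differs from token to token, which is the behaviour the description attributes to the model.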

Release date

2025-08-29

Parameters

560.0B

Context length

—

Modalities

text

Capability radar

[Radar chart. Axes: general, coding, reasoning, science (est.), agents, multimodal]

Science uses a reasoning proxy when dedicated science benchmarks are not available.

Rankings

Domain	Rank	Score	Source
Agentic capability	104	40.0	LS
Reasoning	11	89.0	LS

Benchmark scores (LLM Stats)

Category	Benchmark	Score	Note
Agents	Terminal-Bench	39.5%	Aut.
Biology	GPQA	73.2%	Aut.
Code	HumanEval	88.4%	Aut.
Code	SWE-Bench Verified	60.4%	Aut.
Code	LiveCodeBench	48.0%	Aut.
Communication	Tau2 Telecom	73.7%	Aut.
Communication	Tau2 Retail	71.3%	Aut.
Communication	Tau2 Airline	58.0%	Aut.
Finance	MMLU	89.7%	Aut.
Finance	MMLU-Pro	82.7%	Aut.
General	IFEval	89.6%	Aut.
General	CMMLU	84.3%	Aut.
Math	MATH-500	96.4%	Aut.
Math	DROP	79.1%	Aut.
Math	AIME 2025	61.3%	Aut.
Reasoning	ZebraLogic	89.3%	Aut.

AA evaluation indices

No AA evaluation data available

LLM Stats category scores

[Chart: per-category scores. Categories: Instruction Following, Language, Legal, Structured Output, Finance, Healthcare, Math, General, Physics, Reasoning, Biology, Chemistry, Communication, Tool Calling, Frontend Development, Code, Agents]

Pricing

No pricing data available

Speed

No speed data available

Price ranking by provider

No provider data available

External sources

LLM Stats · Artificial Analysis