GLM-4.7-Flash (Non-reasoning)

Z AI · GLM · Open Weight · MIT · Commercial OK

Description

GLM-4.7-Flash is a high-speed, cost-efficient variant of GLM-4.7 optimized for fast inference and lower latency. It retains the coding-centric capabilities of GLM-4.7, including thinking before acting, preserved reasoning across turns, and per-request thinking control to trade speed against accuracy. It is suited to applications that need quick responses while maintaining strong performance on coding, agentic workflows, and general reasoning tasks.

Release Date: 2026-01-19
Parameters: 30.0B
Context Length: 203K
Modalities: text

Capability Radar

general: 18
coding: 13
reasoning: 45
science (est.): 30
agents: 80
multimodal: 0

Science uses a reasoning proxy when dedicated science benchmarks are unavailable.

Rankings

Domain            Rank   Score   Source
Agents & Tools    #30    64.0    LS
Code Ranking      #375   16.0    AA
General Ranking   #195   51.0    AA
Science           #354   31.0    AA

Benchmark Scores (LLM Stats)

Agents

Tau-bench: 79.5% (SR)
BrowseComp: 42.8% (SR)

Biology

GPQA: 75.2% (SR)

Code

SWE-Bench Verified: 59.2% (SR)

Math

AIME 2025: 91.6% (SR)
Humanity's Last Exam: 14.4% (SR)

AA Evaluation Indices

Intelligence Index: 22.1
Coding Index: 11.0
Tau2: 0.9
IFBench: 0.5
GPQA: 0.5
SciCode: 0.3
LCR: 0.1
HLE: 0.0
TerminalBench Hard: 0.0

LLM Stats Category Scores

Tool Calling: 80
Biology: 80
Chemistry: 80
General: 80
Physics: 80
Agents: 60
Code: 60
Frontend Development: 60
Reasoning: 60
Math: 50
Search: 40
Vision: 10

Pricing

Input Price: $0.07 / 1M tokens
Output Price: $0.40 / 1M tokens
Blended Price (3:1): $0.153 / 1M tokens
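
The blended figure can be checked with a short calculation. The sketch below assumes that "3:1" means three input tokens are weighted for every one output token; the page does not state this explicitly. The function names `blended_price` and `request_cost` are illustrative and not part of any provider API.

```python
def blended_price(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average price per 1M tokens (assumed input:output weighting)."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


def request_cost(input_tokens, output_tokens, input_price=0.07, output_price=0.40):
    """Cost in USD of one request, given prices quoted per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # (3 * 0.07 + 1 * 0.40) / 4 = 0.1525, which the page lists rounded as $0.153
    print(f"Blended price: ${blended_price(0.07, 0.40):.4f} / 1M tokens")
    # Example workload: 12,000 prompt tokens and 1,500 completion tokens
    print(f"Example request cost: ${request_cost(12_000, 1_500):.6f}")
```

Under that assumption the result is about $0.1525 per 1M tokens, consistent with the listed value after rounding.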

Speed

Tokens/sec: 94.6 tokens/s
Time to First Token: 0.89s
Time to Answer: 0.89s
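
These two figures allow a rough estimate of end-to-end response time. The sketch below uses a simple model that is an assumption, not something the page states: total time is the time to first token plus the output length divided by a constant throughput. Real latency varies with load, prompt length, and provider. The function name `estimate_response_seconds` is hypothetical.

```python
def estimate_response_seconds(output_tokens, tokens_per_second=94.6, time_to_first_token=0.89):
    """Rough end-to-end latency: TTFT plus streaming time at constant throughput."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return time_to_first_token + output_tokens / tokens_per_second


if __name__ == "__main__":
    for n in (100, 500, 2000):
        print(f"{n} output tokens: about {estimate_response_seconds(n):.1f} s")
```

Because this is the non-reasoning configuration, the listed Time to Answer equals the Time to First Token; the estimate above covers only the time to stream the full response.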

Available Providers

Prices are in LS internal units.

Provider   Input Price   Output Price
ZAI        70K           400K

External Sources