
GLM-5.1 (Reasoning)

Z AI · GLM · Open Weight · MIT · Commercial OK

Description

GLM-5.1 is Z.AI's next-generation flagship foundation model designed for long-horizon agentic engineering tasks. Built on a 754B-parameter MoE architecture (40B active parameters), it can work continuously and autonomously on a single task for up to 8 hours, completing the full loop from planning and execution through iterative optimization and delivery. GLM-5.1 achieves state-of-the-art results on SWE-Bench Pro (58.4) and performs strongly across coding, reasoning, and agentic benchmarks. It supports a 200K context length, 128K max output tokens, thinking mode, function calling, structured output, context caching, and MCP integration. Overall performance is comparable to Claude Opus 4.6, with particular strengths in sustained execution and complex engineering optimization.

Release Date
2026-04-07
Parameters
754.0B
Context Length
203K
Modality
text

Capability Radar

- General: 46
- Coding: 43
- Reasoning: 87
- Science: 60 (estimated)
- Agents: 60
- Multimodal: 0

When no dedicated science benchmark is available, the Science score is estimated using a reasoning proxy.

Rankings

| Domain | Rank | Score | Source |
|---|---|---|---|
| Agents & Tools | 21 | 67.0 | LS |
| Code Ranking | 40 | 75.0 | AA |
| General Ranking | 9 | 90.0 | AA |
| Science | 33 | 76.0 | AA |

Benchmark Scores (LLM Stats)

Agents

| Benchmark | Score | Source |
|---|---|---|
| Vending-Bench 2 | 5634 · 41.0% | Self-reported |
| BrowseComp | 79.3% | Self-reported |
| MCP Atlas | 71.8% | Self-reported |
| TAU3-Bench | 70.6% | Self-reported |
| Terminal-Bench 2.0 | 69.0% | Self-reported |
| CyberGym | 68.7% | Self-reported |
| SWE-Bench Pro | 58.4% | Self-reported |
| NL2Repo | 42.7% | Self-reported |
| Toolathlon | 40.7% | Self-reported |

Biology

| Benchmark | Score | Source |
|---|---|---|
| GPQA | 86.2% | Self-reported |

Math

| Benchmark | Score | Source |
|---|---|---|
| AIME 2026 | 95.3% | Self-reported |
| HMMT 2025 | 94.0% | Self-reported |
| IMO-AnswerBench | 83.8% | Self-reported |
| HMMT Feb 26 | 82.6% | Self-reported |
| Humanity's Last Exam | 52.3% | Self-reported |

AA Evaluation Indices

| Index | Score |
|---|---|
| Intelligence Index | 51.4 |
| Coding Index | 43.4 |
| Tau2 | 1.0 |
| GPQA | 0.9 |
| IFBench | 0.8 |
| LCR | 0.6 |
| SciCode | 0.4 |
| Terminal-Bench Hard | 0.4 |
| HLE | 0.3 |

LLM Stats Category Scores

| Category | Score |
|---|---|
| Agents | 100 |
| Reasoning | 100 |
| Biology | 90 |
| Chemistry | 90 |
| General | 90 |
| Physics | 90 |
| Math | 80 |
| Search | 80 |
| Code | 70 |
| Safety | 70 |
| Tool Calling | 60 |
| Vision | 50 |
| Coding | 40 |

Pricing

| Type | Price |
|---|---|
| Input | $1.40 / 1M tokens |
| Output | $4.40 / 1M tokens |
| Blended (3:1) | $2.15 / 1M tokens |
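The 3:1 blended price follows directly from the per-token rates: weight the input price three times as heavily as the output price. A minimal sketch (the helper name and the 3:1 weighting convention are assumptions, not an official formula):

```python
def blended_price(input_price, output_price, ratio=3):
    """Blend per-1M-token prices, weighting input tokens `ratio`:1 over output."""
    return (ratio * input_price + output_price) / (ratio + 1)

# $1.40 input and $4.40 output at a 3:1 mix -> $2.15 per 1M tokens
print(round(blended_price(1.4, 4.4), 2))  # 2.15
```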

Speed

| Metric | Value |
|---|---|
| Throughput | 53.8 tokens/s |
| First-token latency | 1.04 s |
| First-response latency | 71.55 s |
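These figures give a back-of-the-envelope estimate of wall-clock time for a response: roughly the first-token latency plus output length divided by throughput. A rough sketch (assuming a constant decode rate and ignoring network overhead; the function is illustrative, not a vendor API):

```python
def estimated_response_time(n_output_tokens, tokens_per_s=53.8, ttft_s=1.04):
    """Estimate seconds from request to last token at a constant decode rate."""
    return ttft_s + n_output_tokens / tokens_per_s

# A 1,000-token response: ~1.04 s to first token + ~18.6 s of decoding
print(round(estimated_response_time(1000), 1))  # 19.6
```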

Available Providers

(LS internal units)

| Provider | Input Price | Output Price |
|---|---|---|
| ZAI | 1.4 / 1M | 4.4 / 1M |

External Links