
GLM-5.1 (Reasoning)

Z AI · GLM · Open Weight · MIT · Commercial OK

Description

GLM-5.1 is Z.AI's next-generation flagship foundation model designed for long-horizon agentic engineering tasks. Built on a 754B-parameter MoE architecture (40B active parameters), it can work continuously and autonomously on a single task for up to 8 hours, completing the full loop from planning and execution through iterative optimization and delivery. GLM-5.1 achieves state-of-the-art results on SWE-Bench Pro (58.4) and performs strongly across coding, reasoning, and agentic benchmarks. It supports a 200K context length, 128K max output tokens, thinking mode, function calling, structured output, context caching, and MCP integration. Overall performance is comparable to Claude Opus 4.6, with particular strengths in sustained execution and complex engineering optimization.

Release Date
2026-04-07
Parameters
754.0B
Context Length
203K
Modality
text

Capability Radar

- General: 46
- Coding: 43
- Reasoning: 87
- Science: 60 (estimated)
- Agents: 60
- Multimodal: 0

When no dedicated science benchmark is available, the Science score is estimated using a reasoning proxy.

Rankings

| Domain | Rank | Score | Source |
|---|---|---|---|
| Agents & Tools | 21 | 67.0 | LS |
| Code Ranking | 40 | 75.0 | AA |
| General Ranking | 9 | 90.0 | AA |
| Science | 33 | 76.0 | AA |

Benchmark Scores (LLM Stats)

Agents

| Benchmark | Score | Source |
|---|---|---|
| Vending-Bench 2 | 5634 · 41.0% | Self-reported |
| BrowseComp | 79.3% | Self-reported |
| MCP Atlas | 71.8% | Self-reported |
| TAU3-Bench | 70.6% | Self-reported |
| Terminal-Bench 2.0 | 69.0% | Self-reported |
| CyberGym | 68.7% | Self-reported |
| SWE-Bench Pro | 58.4% | Self-reported |
| NL2Repo | 42.7% | Self-reported |
| Toolathlon | 40.7% | Self-reported |

Biology

| Benchmark | Score | Source |
|---|---|---|
| GPQA | 86.2% | Self-reported |

Math

| Benchmark | Score | Source |
|---|---|---|
| AIME 2026 | 95.3% | Self-reported |
| HMMT 2025 | 94.0% | Self-reported |
| IMO-AnswerBench | 83.8% | Self-reported |
| HMMT Feb 26 | 82.6% | Self-reported |
| Humanity's Last Exam | 52.3% | Self-reported |

AA Evaluation Indices

| Index | Score |
|---|---|
| Intelligence Index | 51.4 |
| Coding Index | 43.4 |
| Tau2 | 1.0 |
| GPQA | 0.9 |
| IFBench | 0.8 |
| LCR | 0.6 |
| SciCode | 0.4 |
| Terminal-Bench Hard | 0.4 |
| HLE | 0.3 |

LLM Stats Category Scores

| Category | Score |
|---|---|
| Agents | 100 |
| Reasoning | 100 |
| Biology | 90 |
| Chemistry | 90 |
| General | 90 |
| Physics | 90 |
| Math | 80 |
| Search | 80 |
| Code | 70 |
| Safety | 70 |
| Tool Calling | 60 |
| Vision | 50 |
| Coding | 40 |

Pricing

| Type | Price |
|---|---|
| Input | $1.40 / 1M tokens |
| Output | $4.40 / 1M tokens |
| Blended (3:1) | $2.15 / 1M tokens |
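The 3:1 blended price follows directly from the per-token rates: weight the input price three times as heavily as the output price. A minimal sketch (the helper name and the 3:1 weighting convention are assumptions, not an official formula):

```python
def blended_price(input_price, output_price, ratio=3):
    """Blend per-1M-token prices, weighting input tokens `ratio`:1 over output."""
    return (ratio * input_price + output_price) / (ratio + 1)

# $1.40 input and $4.40 output at a 3:1 mix -> $2.15 per 1M tokens
print(round(blended_price(1.4, 4.4), 2))  # 2.15
```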

Speed

| Metric | Value |
|---|---|
| Throughput | 53.8 tokens/s |
| First-token latency | 1.04 s |
| First-response latency | 71.55 s |
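These figures give a back-of-the-envelope estimate of wall-clock time for a response: roughly the first-token latency plus output length divided by throughput. A rough sketch (assuming a constant decode rate and ignoring network overhead; the function is illustrative, not a vendor API):

```python
def estimated_response_time(n_output_tokens, tokens_per_s=53.8, ttft_s=1.04):
    """Estimate seconds from request to last token at a constant decode rate."""
    return ttft_s + n_output_tokens / tokens_per_s

# A 1,000-token response: ~1.04 s to first token + ~18.6 s of decoding
print(round(estimated_response_time(1000), 1))  # 19.6
```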

Available Providers

(LS internal units)

| Provider | Input Price | Output Price |
|---|---|---|
| ZAI | 1.4 / 1M | 4.4 / 1M |

External Links