Seed 2.1 Pro

ByteDance · Proprietary

Description

ByteDance's flagship next-generation agent model, built for real-world productivity. It is a deep-thinking model with strong understanding of user requirements, long-horizon planning, and continuous self-repair, and it is described as delivering reliable end-to-end results across complex coding, long-chain agent tasks, and multi-step engineering workflows. Seed 2.1 Pro also improves on knowledge, reasoning, and multimodal understanding, with state-of-the-art results reported on several video understanding benchmarks. It is served via Volcano Engine as Doubao-Seed-2.1-pro.

Release date

2026-06-24

Parameter count

—

Context length

—

Supported modalities

—

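Capability radar chart

Axes: general, coding, reasoning, science (estimated), agents, multimodal.

When no dedicated science evaluation is available, the science score is estimated using reasoning ability as a proxy.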
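The fallback described for the science axis can be sketched as follows. This is a minimal illustration only: the function name `science_axis`, the dictionary keys, and the choice to reuse the reasoning score unchanged are all assumptions, since the page does not say how the proxy is computed.

```python
from typing import Mapping, Tuple


def science_axis(scores: Mapping[str, float]) -> Tuple[float, bool]:
    """Return (value, is_estimated) for the science axis.

    Hypothetical helper: if a dedicated science score exists it is used
    as-is; otherwise the reasoning score stands in as a proxy and the
    result is flagged as estimated. Reusing the reasoning score directly
    is an assumption, not the site's documented formula.
    """
    if "science" in scores:
        return scores["science"], False
    return scores["reasoning"], True


# Placeholder inputs for illustration only, not values from the page.
value, estimated = science_axis({"reasoning": 50.0})
label = "science (estimated)" if estimated else "science"
```

Returning the flag alongside the value lets a chart label the axis as estimated instead of silently presenting a proxy as a measured score.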

Leaderboard rankings

Domain	Rank	Score	Source
Agent capability leaderboard	38	60.0	LS
Multimodal leaderboard	70	70.0	LS
Reasoning	79	56.0	LS

Benchmark scores (LLM Stats)

3D

BLINK	81.4%	self-reported

Agents

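GDPval	87.9%	self-reported
BrowseComp	86.2%	self-reported
MCP Atlas	83.8%	self-reported
OSWorld	78.8%	self-reported
Web Bench	78.4%	self-reported
MobileWorld	73.1%	self-reported
OfficeQA Pro	72.2%	self-reported
Terminal-Bench 2.1	71.0%	self-reported
CyberGym	70.2%	self-reported
OneMillion Bench	68.8%	self-reported
Agent Startup Bench	68.8%	self-reported
SeedClawBench	66.6%	self-reported
Trae Error Fix	63.3%	self-reported
Trae Code Gen	62.4%	self-reported
WildClawBench	61.7%	self-reported
xDailyBench	61.0%	self-reported
Finance Agent v1.1	60.7%	self-reported
SWE-Bench Pro	57.5%	self-reported
Repo Env	55.0%	self-reported
PresentBench	54.6%	self-reported
Workspace Bench	53.0%	self-reported
Doubao Multi-Turn Bench	52.5%	self-reported
ClawEval-MM	51.0%	self-reported
Toolathlon	50.6%	self-reported
Program Bench	50.3%	self-reported
NL2Repo	47.0%	self-reported
CreativeWork	42.5%	self-reported
Agents' Last Exam	41.4%	self-reported
SWE-Atlas	35.2%	self-reported
APEX-Agents	33.8%	self-reported
DeepSWE	32.7%	self-reported
GameWorld	31.2%	self-reported
PostTrainBench	16.5%	self-reported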

Biology

SciCode	59.8%	self-reported

Chemistry

SuperGPQA	70.8%	self-reported
SuperChem	59.8%	self-reported

Code

Artifacts Bench	51.0%	self-reported
FrontierCS	46.3%	self-reported

Coding

AetherCode	65.8%	self-reported
Image2FloorPlan	48.0%	self-reported

Embodied

EmbSpatialBench	0.83 / 100	self-reported

General

MMMU-Pro	82.7%	self-reported
SimpleVQA	0.74 / 100	self-reported
MSQA	50.2%	self-reported
KINA	48.3%	self-reported

Image To Text

OCRBench_V2	63.2%	self-reported

Knowledge

VideoSimpleQA	76.4%	self-reported
WorldBench	67.6%	self-reported

Long Context

DUDE	82.8%	self-reported
LongVideoBench	80.6%	self-reported
MMLongBench-128K	78.3%	self-reported
LVBench	78.0%	self-reported

Math

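MathVision	94.5%	self-reported
MathVista	90.7%	self-reported
MathVerse	89.7%	self-reported
Beyond AIME	87.0%	self-reported
EMMA	79.3%	self-reported
FrontierScience Olympiad	75.0%	self-reported
DynaMath	73.1%	self-reported
IMO 2025	0.65 / 42	self-reported
Humanity's Last Exam	55.7%	self-reported
IMOProof-Adv	54.3%	self-reported
MathArena Apex	31.3%	self-reported
LiveMathematicianBench	20.9%	self-reported
HorizonMath	2.0%	self-reported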

Multimodal

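CharXiv-D	95.5%	self-reported
Video-MME	89.2%	self-reported
CharXiv-R	86.4%	self-reported
VLMsAreBiased	83.6%	self-reported
OVOBench	80.7%	self-reported
TVBench	80.5%	self-reported
TOMATO	79.5%	self-reported
LiveSports-3K	76.8%	self-reported
MotionBench	74.9%	self-reported
BabyVision	73.7%	self-reported
TreeBench	71.1%	self-reported
ChartQAPro	70.9%	self-reported
Minerva	70.7%	self-reported
OVBench	70.0%	self-reported
VideoHolmes	68.2%	self-reported
CrossVid	65.0%	self-reported
ContPhy	63.6%	self-reported
MeasureBench	62.9%	self-reported
ZEROBench	0.56 / 100	self-reported
VisuLogic	0.54 / 100	self-reported
WorldVQA	53.0%	self-reported
VisFactor	51.4%	self-reported
MMSIBench	35.9%	self-reported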

Physics

IPhO 2025	79.3%	self-reported

Reasoning

ERQA	72.0%	self-reported
ArcAGI2	62.5%	self-reported
FrontierScience Research	28.3%	self-reported

Spatial Reasoning

RealWorldQA	86.7%	self-reported
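The scores in this section use two notations: percentages such as 81.4% and value-over-scale pairs such as 0.83 / 100 or 0.65 / 42. The sketch below shows one way to read both forms into a common structure before comparing or sorting them. The names `Score` and `parse_score` are hypothetical. The code deliberately does not convert ratio scores to percentages, because the page does not state how the two notations relate.

```python
import re
from typing import NamedTuple, Optional


class Score(NamedTuple):
    value: float            # the number as printed on the page
    scale: Optional[float]  # denominator for "x / y" scores, None for percentages
    is_percent: bool


# Both patterns anchor at the start only, so a trailing source label
# after the number (for example "self-reported") is ignored.
_PERCENT = re.compile(r"\s*(\d+(?:\.\d+)?)\s*%")
_RATIO = re.compile(r"\s*(\d+(?:\.\d+)?)\s*/\s*(\d+(?:\.\d+)?)")


def parse_score(text: str) -> Score:
    """Parse a score cell in either notation; raise ValueError otherwise."""
    match = _PERCENT.match(text)
    if match:
        return Score(float(match.group(1)), None, True)
    match = _RATIO.match(text)
    if match:
        return Score(float(match.group(1)), float(match.group(2)), False)
    raise ValueError(f"unrecognized score format: {text!r}")


blink = parse_score("81.4%")
emb_spatial = parse_score("0.83 / 100")
```

Keeping the printed value and its scale separate avoids guessing whether 0.83 / 100 means 0.83 points or a fraction of 0.83.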

AA evaluation index

No AA evaluation data available

LLM Stats category scores

Structured Output	100

Categories listed without a displayed score: Legal, Long Context, Spatial Reasoning, Embodied, Finance, General, Image To Text, Math, Multimodal, Physics, Reasoning, Safety, Healthcare, Chemistry, Economics, Tool Calling, Video, Vision, Agents, Biology, Code, Frontend Development, Coding, Science, Systems.

Pricing

No pricing data available

Speed

No speed data available

Provider price ranking

No provider data available

External links

LLM Stats, Artificial Analysis