Seed 2.1 Pro

ByteDance · Proprietary

Description

ByteDance's flagship next-generation agent model, built for real-world productivity. A deep-thinking model with strong understanding of user requirements, long-horizon planning, and continuous self-repair, it is designed to deliver reliable end-to-end results across complex coding, long-chain agent tasks, and multi-step engineering workflows. Seed 2.1 Pro also improves on knowledge, reasoning, and multimodal understanding, and reports state-of-the-art (SOTA) results on several video understanding benchmarks. It is served via Volcano Engine as Doubao-Seed-2.1-pro.

Release date

2026-06-24

Parameter count

—

Context length

—

Supported modalities

—

Capability radar chart

Axes: general, coding, reasoning, science (estimated), agents, multimodal

When no dedicated science evaluation is available, the Science axis is estimated using reasoning ability as a proxy.
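
The fallback described in the note above can be sketched as follows. This is only an illustration: the page does not give the actual formula, so the function name `science_axis`, its signature, and the sample values are assumptions, and a plain substitution of the reasoning score stands in for whatever proxy the site really applies.

```python
from typing import Optional, Tuple


def science_axis(
    science: Optional[float],
    reasoning: Optional[float],
) -> Tuple[Optional[float], bool]:
    """Return (score, is_estimated) for the Science radar axis.

    Hypothetical helper: uses a dedicated science score when one exists,
    otherwise falls back to the reasoning score and flags it as estimated.
    """
    if science is not None:
        return science, False   # dedicated science evaluation is available
    if reasoning is not None:
        return reasoning, True  # proxy value, should be labelled "estimated"
    return None, False          # nothing to plot on this axis


# Illustrative inputs only; these are not scores taken from this page.
score, estimated = science_axis(science=None, reasoning=50.0)
label = "science (estimated)" if estimated else "science"
print(label, score)
```

Returning the flag alongside the score lets the chart label the axis as estimated, which is how the radar chart on this page marks its Science axis.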

Leaderboard rankings

Domain	Rank	Score	Source
Agent capability leaderboard	38	60.0	LS
Multimodal leaderboard	70	70.0	LS
Reasoning	79	56.0	LS

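Benchmark scores (LLM Stats)

Category	Benchmark	Score	Source
3D	BLINK	81.4%	Self-reported
Agents	GDPval	87.9%	Self-reported
Agents	BrowseComp	86.2%	Self-reported
Agents	MCP Atlas	83.8%	Self-reported
Agents	OSWorld	78.8%	Self-reported
Agents	Web Bench	78.4%	Self-reported
Agents	MobileWorld	73.1%	Self-reported
Agents	OfficeQA Pro	72.2%	Self-reported
Agents	Terminal-Bench 2.1	71.0%	Self-reported
Agents	CyberGym	70.2%	Self-reported
Agents	OneMillion Bench	68.8%	Self-reported
Agents	Agent Startup Bench	68.8%	Self-reported
Agents	SeedClawBench	66.6%	Self-reported
Agents	Trae Error Fix	63.3%	Self-reported
Agents	Trae Code Gen	62.4%	Self-reported
Agents	WildClawBench	61.7%	Self-reported
Agents	xDailyBench	61.0%	Self-reported
Agents	Finance Agent v1.1	60.7%	Self-reported
Agents	SWE-Bench Pro	57.5%	Self-reported
Agents	Repo Env	55.0%	Self-reported
Agents	PresentBench	54.6%	Self-reported
Agents	Workspace Bench	53.0%	Self-reported
Agents	Doubao Multi-Turn Bench	52.5%	Self-reported
Agents	ClawEval-MM	51.0%	Self-reported
Agents	Toolathlon	50.6%	Self-reported
Agents	Program Bench	50.3%	Self-reported
Agents	NL2Repo	47.0%	Self-reported
Agents	CreativeWork	42.5%	Self-reported
Agents	Agents' Last Exam	41.4%	Self-reported
Agents	SWE-Atlas	35.2%	Self-reported
Agents	APEX-Agents	33.8%	Self-reported
Agents	DeepSWE	32.7%	Self-reported
Agents	GameWorld	31.2%	Self-reported
Agents	PostTrainBench	16.5%	Self-reported
Biology	SciCode	59.8%	Self-reported
Chemistry	SuperGPQA	70.8%	Self-reported
Chemistry	SuperChem	59.8%	Self-reported
Code	Artifacts Bench	51.0%	Self-reported
Code	FrontierCS	46.3%	Self-reported
Coding	AetherCode	65.8%	Self-reported
Coding	Image2FloorPlan	48.0%	Self-reported
Embodied	EmbSpatialBench	0.83 / 100	Self-reported
General	MMMU-Pro	82.7%	Self-reported
General	SimpleVQA	0.74 / 100	Self-reported
General	MSQA	50.2%	Self-reported
General	KINA	48.3%	Self-reported
Image To Text	OCRBench_V2	63.2%	Self-reported
Knowledge	VideoSimpleQA	76.4%	Self-reported
Knowledge	WorldBench	67.6%	Self-reported
Long Context	DUDE	82.8%	Self-reported
Long Context	LongVideoBench	80.6%	Self-reported
Long Context	MMLongBench-128K	78.3%	Self-reported
Long Context	LVBench	78.0%	Self-reported
Math	MathVision	94.5%	Self-reported
Math	MathVista	90.7%	Self-reported
Math	MathVerse	89.7%	Self-reported
Math	Beyond AIME	87.0%	Self-reported
Math	EMMA	79.3%	Self-reported
Math	FrontierScience Olympiad	75.0%	Self-reported
Math	DynaMath	73.1%	Self-reported
Math	IMO 2025	0.65 / 42	Self-reported
Math	Humanity's Last Exam	55.7%	Self-reported
Math	IMOProof-Adv	54.3%	Self-reported
Math	MathArena Apex	31.3%	Self-reported
Math	LiveMathematicianBench	20.9%	Self-reported
Math	HorizonMath	2.0%	Self-reported
Multimodal	CharXiv-D	95.5%	Self-reported
Multimodal	Video-MME	89.2%	Self-reported
Multimodal	CharXiv-R	86.4%	Self-reported
Multimodal	VLMsAreBiased	83.6%	Self-reported
Multimodal	OVOBench	80.7%	Self-reported
Multimodal	TVBench	80.5%	Self-reported
Multimodal	TOMATO	79.5%	Self-reported
Multimodal	LiveSports-3K	76.8%	Self-reported
Multimodal	MotionBench	74.9%	Self-reported
Multimodal	BabyVision	73.7%	Self-reported
Multimodal	TreeBench	71.1%	Self-reported
Multimodal	ChartQAPro	70.9%	Self-reported
Multimodal	Minerva	70.7%	Self-reported
Multimodal	OVBench	70.0%	Self-reported
Multimodal	VideoHolmes	68.2%	Self-reported
Multimodal	CrossVid	65.0%	Self-reported
Multimodal	ContPhy	63.6%	Self-reported
Multimodal	MeasureBench	62.9%	Self-reported
Multimodal	ZEROBench	0.56 / 100	Self-reported
Multimodal	VisuLogic	0.54 / 100	Self-reported
Multimodal	WorldVQA	53.0%	Self-reported
Multimodal	VisFactor	51.4%	Self-reported
Multimodal	MMSIBench	35.9%	Self-reported
Physics	IPhO 2025	79.3%	Self-reported
Reasoning	ERQA	72.0%	Self-reported
Reasoning	ArcAGI2	62.5%	Self-reported
Reasoning	FrontierScience Research	28.3%	Self-reported
Spatial Reasoning	RealWorldQA	86.7%	Self-reported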

AA evaluation index

No AA evaluation data available yet.

LLM Stats category scores

Category score chart. Categories: Structured Output, Legal, Long Context, Spatial Reasoning, Embodied, Finance, General, Image To Text, Math, Multimodal, Physics, Reasoning, Safety, Healthcare, Chemistry, Economics, Tool Calling, Video, Vision, Agents, Biology, Code, Frontend Development, Coding, Science, Systems. The only value shown is 100, listed after Structured Output.

Pricing

No pricing data available yet.

Speed

No speed data available yet.

Provider price ranking

No provider data available yet.

External links

LLM Stats, Artificial Analysis