Seed 2.1 Pro

ByteDance (Proprietary)

Description

ByteDance's flagship next-generation agent model, built for real-world productivity. A deep-thinking model with strong requirement understanding, long-horizon planning, and continuous self-repair, it delivers reliable end-to-end results across complex coding, long-chain agent tasks, and multi-step engineering workflows. Seed 2.1 Pro also advances knowledge, reasoning, and multimodal understanding, with self-reported state-of-the-art results on several video understanding benchmarks. It is served via Volcano Engine as Doubao-Seed-2.1-pro.

Release date

2026-06-24

Parameters

—

Context length

—

Modality

—

Capability radar

Axes: general, coding, reasoning, science (estimated), agents, multimodal.

When no dedicated science benchmark is available, the Science score is estimated from a reasoning proxy.

Benchmark scores (LLM Stats)

3D

BLINK	81.4%	self-reported

Agents

GDPval	87.9%	self-reported
BrowseComp	86.2%	self-reported
MCP Atlas	83.8%	self-reported
OSWorld	78.8%	self-reported
Web Bench	78.4%	self-reported
MobileWorld	73.1%	self-reported
OfficeQA Pro	72.2%	self-reported
Terminal-Bench 2.1	71.0%	self-reported
CyberGym	70.2%	self-reported
OneMillion Bench	68.8%	self-reported
Agent Startup Bench	68.8%	self-reported
SeedClawBench	66.6%	self-reported
Trae Error Fix	63.3%	self-reported
Trae Code Gen	62.4%	self-reported
WildClawBench	61.7%	self-reported
xDailyBench	61.0%	self-reported
Finance Agent v1.1	60.7%	self-reported
SWE-Bench Pro	57.5%	self-reported
Repo Env	55.0%	self-reported
PresentBench	54.6%	self-reported
Workspace Bench	53.0%	self-reported
Doubao Multi-Turn Bench	52.5%	self-reported
ClawEval-MM	51.0%	self-reported
Toolathlon	50.6%	self-reported
Program Bench	50.3%	self-reported
NL2Repo	47.0%	self-reported
CreativeWork	42.5%	self-reported
Agents' Last Exam	41.4%	self-reported
SWE-Atlas	35.2%	self-reported
APEX-Agents	33.8%	self-reported
DeepSWE	32.7%	self-reported
GameWorld	31.2%	self-reported
PostTrainBench	16.5%	self-reported

Biology

SciCode	59.8%	self-reported

Chemistry

SuperGPQA	70.8%	self-reported
SuperChem	59.8%	self-reported

Code

Artifacts Bench	51.0%	self-reported
FrontierCS	46.3%	self-reported

Coding

AetherCode	65.8%	self-reported
Image2FloorPlan	48.0%	self-reported

Embodied

EmbSpatialBench	0.83 / 100	self-reported

General

MMMU-Pro	82.7%	self-reported
SimpleVQA	0.74 / 100	self-reported
MSQA	50.2%	self-reported
KINA	48.3%	self-reported

Image To Text

OCRBench_V2	63.2%	self-reported

Knowledge

VideoSimpleQA	76.4%	self-reported
WorldBench	67.6%	self-reported

Long Context

DUDE	82.8%	self-reported
LongVideoBench	80.6%	self-reported
MMLongBench-128K	78.3%	self-reported
LVBench	78.0%	self-reported

Math

MathVision	94.5%	self-reported
MathVista	90.7%	self-reported
MathVerse	89.7%	self-reported
Beyond AIME	87.0%	self-reported
EMMA	79.3%	self-reported
FrontierScience Olympiad	75.0%	self-reported
DynaMath	73.1%	self-reported
IMO 2025	0.65 / 42	self-reported
Humanity's Last Exam	55.7%	self-reported
IMOProof-Adv	54.3%	self-reported
MathArena Apex	31.3%	self-reported
LiveMathematicianBench	20.9%	self-reported
HorizonMath	2.0%	self-reported

Multimodal

CharXiv-D	95.5%	self-reported
Video-MME	89.2%	self-reported
CharXiv-R	86.4%	self-reported
VLMsAreBiased	83.6%	self-reported
OVOBench	80.7%	self-reported
TVBench	80.5%	self-reported
TOMATO	79.5%	self-reported
LiveSports-3K	76.8%	self-reported
MotionBench	74.9%	self-reported
BabyVision	73.7%	self-reported
TreeBench	71.1%	self-reported
ChartQAPro	70.9%	self-reported
Minerva	70.7%	self-reported
OVBench	70.0%	self-reported
VideoHolmes	68.2%	self-reported
CrossVid	65.0%	self-reported
ContPhy	63.6%	self-reported
MeasureBench	62.9%	self-reported
ZEROBench	0.56 / 100	self-reported
VisuLogic	0.54 / 100	self-reported
WorldVQA	53.0%	self-reported
VisFactor	51.4%	self-reported
MMSIBench	35.9%	self-reported

Physics

IPhO 2025	79.3%	self-reported

Reasoning

ERQA	72.0%	self-reported
ArcAGI2	62.5%	self-reported
FrontierScience Research	28.3%	self-reported

Spatial Reasoning

RealWorldQA	86.7%	self-reported

AA evaluation index

No AA evaluation data available.

LLM Stats category scores

Chart categories: Structured Output, Legal, Long Context, Spatial Reasoning, Embodied, Finance, General, Image To Text, Math, Multimodal, Physics, Reasoning, Safety, Healthcare, Chemistry, Economics, Tool Calling, Video, Vision, Agents, Biology, Code, Frontend Development, Coding, Science, Systems. The only value shown is 100, next to Structured Output.

Pricing

No pricing data available.

Speed

No speed data available.

Provider price ranking

No provider data available.

External links

LLM Stats, Artificial Analysis

Domain	Rank	Score	Source
Agent capability	38	60.0	LS
Multimodal ranking	70	70.0	LS
Reasoning	79	56.0	LS