
Seed 2.1 Pro

ByteDance · Proprietary

Description

Seed 2.1 Pro is ByteDance's flagship next-generation agent model, built for real-world productivity. A deep-thinking model with strong requirement understanding, long-horizon planning, and continuous self-repair, it delivers reliable end-to-end results across complex coding, long-chain agents, and multi-step engineering workflows. Seed 2.1 Pro also advances knowledge, reasoning, and multimodal understanding, with SOTA results across several video understanding benchmarks. Served via Volcano Engine as Doubao-Seed-2.1-pro.

Release date: 2026-06-24
Parameters
Context length
Modalities

Capability radar

general: 80
coding: 60
reasoning: 70
science (est.): 51
agents: 70
multimodal: 70

Science uses a reasoning-based proxy when dedicated science benchmarks are unavailable.

Rankings

Domain | Rank | Score | Source
Agentic capabilities | #38 | 60.0 | LS
Multimodal ranking | #70 | 70.0 | LS
Reasoning | #79 | 56.0 | LS

Benchmark scores (LLM Stats)

3d

BLINK: 81.4% (self-reported)

Agents

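GDPval: 87.9% (self-reported)
BrowseComp: 86.2% (self-reported)
MCP Atlas: 83.8% (self-reported)
OSWorld: 78.8% (self-reported)
Web Bench: 78.4% (self-reported)
MobileWorld: 73.1% (self-reported)
OfficeQA Pro: 72.2% (self-reported)
Terminal-Bench 2.1: 71.0% (self-reported)
CyberGym: 70.2% (self-reported)
OneMillion Bench: 68.8% (self-reported)
Agent Startup Bench: 68.8% (self-reported)
SeedClawBench: 66.6% (self-reported)
Trae Error Fix: 63.3% (self-reported)
Trae Code Gen: 62.4% (self-reported)
WildClawBench: 61.7% (self-reported)
xDailyBench: 61.0% (self-reported)
Finance Agent v1.1: 60.7% (self-reported)
SWE-Bench Pro: 57.5% (self-reported)
Repo Env: 55.0% (self-reported)
PresentBench: 54.6% (self-reported)
Workspace Bench: 53.0% (self-reported)
Doubao Multi-Turn Bench: 52.5% (self-reported)
ClawEval-MM: 51.0% (self-reported)
Toolathlon: 50.6% (self-reported)
Program Bench: 50.3% (self-reported)
NL2Repo: 47.0% (self-reported)
CreativeWork: 42.5% (self-reported)
Agents' Last Exam: 41.4% (self-reported)
SWE-Atlas: 35.2% (self-reported)
APEX-Agents: 33.8% (self-reported)
DeepSWE: 32.7% (self-reported)
GameWorld: 31.2% (self-reported)
PostTrainBench: 16.5% (self-reported)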

Biology

SciCode: 59.8% (self-reported)

Chemistry

SuperGPQA: 70.8% (self-reported)
SuperChem: 59.8% (self-reported)

Code

Artifacts Bench: 51.0% (self-reported)
FrontierCS: 46.3% (self-reported)

Coding

AetherCode: 65.8% (self-reported)
Image2FloorPlan: 48.0% (self-reported)

Embodied

EmbSpatialBench: 0.83 / 100 (self-reported)

General

MMMU-Pro: 82.7% (self-reported)
SimpleVQA: 0.74 / 100 (self-reported)
MSQA: 50.2% (self-reported)
KINA: 48.3% (self-reported)

Image To Text

OCRBench_V2: 63.2% (self-reported)

Knowledge

VideoSimpleQA: 76.4% (self-reported)
WorldBench: 67.6% (self-reported)

Long Context

DUDE: 82.8% (self-reported)
LongVideoBench: 80.6% (self-reported)
MMLongBench-128K: 78.3% (self-reported)
LVBench: 78.0% (self-reported)

Math

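MathVision: 94.5% (self-reported)
MathVista: 90.7% (self-reported)
MathVerse: 89.7% (self-reported)
Beyond AIME: 87.0% (self-reported)
EMMA: 79.3% (self-reported)
FrontierScience Olympiad: 75.0% (self-reported)
DynaMath: 73.1% (self-reported)
IMO 2025: 0.65 / 42 (self-reported)
Humanity's Last Exam: 55.7% (self-reported)
IMOProof-Adv: 54.3% (self-reported)
MathArena Apex: 31.3% (self-reported)
LiveMathematicianBench: 20.9% (self-reported)
HorizonMath: 2.0% (self-reported)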

Multimodal

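CharXiv-D: 95.5% (self-reported)
Video-MME: 89.2% (self-reported)
CharXiv-R: 86.4% (self-reported)
VLMsAreBiased: 83.6% (self-reported)
OVOBench: 80.7% (self-reported)
TVBench: 80.5% (self-reported)
TOMATO: 79.5% (self-reported)
LiveSports-3K: 76.8% (self-reported)
MotionBench: 74.9% (self-reported)
BabyVision: 73.7% (self-reported)
TreeBench: 71.1% (self-reported)
ChartQAPro: 70.9% (self-reported)
Minerva: 70.7% (self-reported)
OVBench: 70.0% (self-reported)
VideoHolmes: 68.2% (self-reported)
CrossVid: 65.0% (self-reported)
ContPhy: 63.6% (self-reported)
MeasureBench: 62.9% (self-reported)
ZEROBench: 0.56 / 100 (self-reported)
VisuLogic: 0.54 / 100 (self-reported)
WorldVQA: 53.0% (self-reported)
VisFactor: 51.4% (self-reported)
MMSIBench: 35.9% (self-reported)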

Physics

IPhO 2025: 79.3% (self-reported)

Reasoning

ERQA: 72.0% (self-reported)
ArcAGI2: 62.5% (self-reported)
FrontierScience Research: 28.3% (self-reported)

Spatial Reasoning

RealWorldQA: 86.7% (self-reported)

AA evaluation indices

No AA evaluation data

LLM Stats category scores

Structured Output: 100
Search: 90
Legal: 80
Long Context: 80
Spatial Reasoning: 80
Embodied: 80
Finance: 80
General: 80
3d: 80
Image To Text: 70
Math: 70
Multimodal: 70
Physics: 70
Reasoning: 70
Safety: 70
Healthcare: 70
Chemistry: 70
Economics: 70
Tool Calling: 70
Video: 70
Vision: 70
Agents: 60
Biology: 60
Code: 60
Frontend Development: 50
Coding: 50
Science: 30
Systems: 20

Pricing

No pricing data

Speed

No speed data

Provider price ranking

No provider data
