DeepSeek VL2

DeepSeek · Open Weight

Description

An advanced series of large Mixture-of-Experts (MoE) Vision-Language Models that significantly improves upon its predecessor, DeepSeek-VL. DeepSeek-VL2 demonstrates superior capabilities across various tasks, including but not limited to visual question answering, optical character recognition, document/table/chart understanding, and visual grounding.

Release Date: 2024-12-13
Parameters: 27.0B
Context Length: not listed
Modalities: image, text

Capability Radar

general: 60
coding: 0
reasoning: 60
science (est.): 43
agents: 42
multimodal: 90

Science uses a reasoning proxy when dedicated science benchmarks are unavailable.

Rankings

Domain              Rank  Score  Source
Multimodal Ranking  47    76.0   LS

Benchmark Scores (LLM Stats)

General

MMT-Bench: 63.6% (SR)
MMStar: 61.3% (SR)
MMMU: 51.1% (SR)

Image To Text

DocVQA: 93.3% (SR)
TextVQA: 84.2% (SR)
OCRBench: 81.1% (SR)

Math

MathVista: 62.8% (SR)

Multimodal

ChartQA: 86.0% (SR)
AI2D: 81.4% (SR)
MMBench: 79.6% (SR)
MMBench-V1.1: 79.2% (SR)
InfoVQA: 78.1% (SR)
MME: 22.5% (SR)

Spatial Reasoning

RealWorldQA: 68.4% (SR)

AA Evaluation Indices

No AA evaluation data available

LLM Stats Category Scores

Image To Text: 90
Multimodal: 70
Reasoning: 70
Spatial Reasoning: 70
Vision: 70
Math: 60
General: 60
Healthcare: 50
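
The page does not say how category scores are derived from the individual benchmarks. As a rough sanity check, the sketch below tests one possible aggregation: the mean of a category's listed benchmark percentages, rounded half-up to the nearest 10. That rule is an assumption, not a documented method, and `category_score` is a hypothetical helper name. The benchmark values are copied from the Benchmark Scores section. Reasoning, Vision, and Healthcare are omitted because the page does not show which benchmarks feed them.

```python
import math

# Benchmark percentages copied from the "Benchmark Scores (LLM Stats)" section.
benchmarks = {
    "Image To Text": [93.3, 84.2, 81.1],
    "Math": [62.8],
    "General": [63.6, 61.3, 51.1],
    "Multimodal": [86.0, 81.4, 79.6, 79.2, 78.1, 22.5],
    "Spatial Reasoning": [68.4],
}


def category_score(scores):
    """Assumed aggregation (not documented by the page):
    arithmetic mean, rounded half-up to the nearest 10."""
    mean = sum(scores) / len(scores)
    return int(math.floor(mean / 10 + 0.5)) * 10


for name, scores in benchmarks.items():
    print(f"{name}: {category_score(scores)}")
```

Under this assumed rule, the five categories above come out equal to the category scores listed on the page. That is consistent with a mean-then-round aggregation but does not confirm it.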

Pricing

No pricing data available

Speed

No speed data available

Provider Price Ranking


2 providers

Cheapest: SiliconFlow (China)
Most Expensive: SiliconFlow

Rank  Provider             Input  Output
1     SiliconFlow (China)  $0.15  $0.15
2     SiliconFlow          $0.15  $0.15

Compare pricing across different API providers for this model.

External Sources