DeepSeek VL2 Small

DeepSeek | Open Weight

Description

An advanced series of large Mixture-of-Experts (MoE) Vision-Language Models that significantly improves upon its predecessor, DeepSeek-VL. DeepSeek-VL2 demonstrates superior capabilities across various tasks, including but not limited to visual question answering, optical character recognition, document/table/chart understanding, and visual grounding.

Release Date: 2024-12-13
Parameters: 16.0B
Context Length: 164K
Modalities: text

Capability Radar

general: 60
coding: 0
reasoning: 60
science (est.): 43
agents: 0
multimodal: 0

Science uses a reasoning proxy when dedicated science benchmarks are unavailable.

Rankings

Domain               # Rank   Score   Source
Multimodal Ranking   48       75.0    LS

Benchmark Scores (LLM Stats)

General

MMT-Bench: 62.9% (SR)
MMStar: 57.0% (SR)
MMMU: 48.0% (SR)

Image To Text

DocVQA: 92.3% (SR)
TextVQA: 83.4% (SR)
OCRBench: 83.4% (SR)

Math

MathVista: 60.7% (SR)

Multimodal

ChartQA: 84.5% (SR)
MMBench: 80.3% (SR)
AI2D: 80.0% (SR)
MMBench-V1.1: 79.3% (SR)
InfoVQA: 75.8% (SR)
MME: 21.2% (SR)

Spatial Reasoning

RealWorldQA: 65.4% (SR)

AA Evaluation Indices

No AA evaluation data available

LLM Stats Category Scores

Image To Text: 90
Spatial Reasoning: 70
Vision: 70
Multimodal: 70
General: 60
Math: 60
Reasoning: 60
Healthcare: 50

Pricing

Input Price: $0.32 / 1M tokens
Output Price: $0.89 / 1M tokens
Blended Price (3:1): $0.4625 / 1M tokens
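
The blended figure is consistent with a weighted average of three parts input price to one part output price. The page does not state its formula, so the sketch below is an assumption that happens to reproduce the listed value; the function name `blended_price` and its parameters are hypothetical and chosen for illustration only.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    Assumption: "3:1" means three input tokens for every output token.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Prices per 1M tokens as listed on this page.
price = blended_price(0.32, 0.89)
print(f"${price:.4f} / 1M tokens")
```

With the listed prices this works out to (3 × 0.32 + 0.89) / 4 = 0.4625, matching the blended price shown. A workload with a different input-to-output ratio can pass its own weights.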

Speed

No speed data available

Available Providers

(LS internal units)

No provider data available

External Sources