DeepSeek VL2
DeepSeek | Open Weight
Description
DeepSeek-VL2 is a series of large Mixture-of-Experts (MoE) vision-language models that significantly improves on its predecessor, DeepSeek-VL. It shows strong capabilities across tasks including visual question answering, optical character recognition, document/table/chart understanding, and visual grounding.
Release Date
2024-12-13
Parameters
27.0B
Context Length
—
Modalities
image, text
Capability Radar
| Capability | Score |
|---|---|
| General | 60 |
| Coding | 0 |
| Reasoning | 60 |
| Science (est.) | 43 |
| Agents | 42 |
| Multimodal | 90 |
Science uses a reasoning proxy when dedicated science benchmarks are unavailable.
Rankings
| Domain | #Rank | Score | Source |
|---|---|---|---|
| Multimodal Ranking | 47 | 76.0 | LS |
Benchmark Scores (LLM Stats)
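| Category | Benchmark | Score |
|---|---|---|
| General | MMT-Bench | 63.6% SR |
| General | MMStar | 61.3% SR |
| General | MMMU | 51.1% SR |
| Image To Text | DocVQA | 93.3% SR |
| Image To Text | TextVQA | 84.2% SR |
| Image To Text | OCRBench | 81.1% SR |
| Math | MathVista | 62.8% SR |
| Multimodal | ChartQA | 86.0% SR |
| Multimodal | AI2D | 81.4% SR |
| Multimodal | MMBench | 79.6% SR |
| Multimodal | MMBench-V1.1 | 79.2% SR |
| Multimodal | InfoVQA | 78.1% SR |
| Multimodal | MME | 22.5% SR |
| Spatial Reasoning | RealWorldQA | 68.4% SR |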
AA Evaluation Indices
No AA evaluation data available
LLM Stats Category Scores
| Category | Score |
|---|---|
| Image To Text | 90 |
| Multimodal | 70 |
| Reasoning | 70 |
| Spatial Reasoning | 70 |
| Vision | 70 |
| Math | 60 |
| General | 60 |
| Healthcare | 50 |
Pricing
No pricing data available
Speed
No speed data available
Provider Price Ranking
2 providers
Cheapest: SiliconFlow (China)
Most Expensive: SiliconFlow
| # | Provider | Input | Output |
|---|---|---|---|
| 1 | SiliconFlow (China) (Cheapest) | $0.15 | $0.15 |
| 2 | SiliconFlow | $0.15 | $0.15 |