Llama 3.2 Instruct 90B (Vision)

Meta · Llama · Open Weight · Llama 3.2 · Commercial OK

Description

Llama 3.2 90B is a large multimodal language model optimized for visual recognition, image reasoning, and captioning tasks. It accepts image and text input and supports a context length of 128,000 tokens. At 90B parameters it is aimed at server-class deployment; the edge and mobile models in the Llama 3.2 family are the much smaller text-only variants. It scores strongly on image-understanding benchmarks such as AI2D (92.3%) and DocVQA (90.1%).

Release Date
2024-09-25
Parameters
90.0B
Context Length
128,000 tokens
Modalities
image, text

Capability Radar

general: 24
coding: 22
reasoning: 30
science (est.): 29
agents: 28
multimodal: 85

Science uses a reasoning proxy when dedicated science benchmarks are unavailable.

Rankings

| Domain | Rank | Score | Source |
| --- | --- | --- | --- |
| Code Ranking | 338 | 23.0 | AA |
| General Ranking | 360 | 30.0 | AA |
| Math Reasoning | 254 | 33.0 | AA |
| Multimodal Ranking | 31 | 81.0 | LS |
| Science | 379 | 29.0 | AA |

Benchmark Scores (LLM Stats)

Biology

GPQA: 46.7% (SR)

Finance

MMLU: 86.0% (SR)

General

MMMU: 60.3% (SR)
MMMU-Pro: 45.2% (SR)

Image To Text

DocVQA: 90.1% (SR)
VQAv2: 78.1% (SR)
TextVQA: 73.5% (SR)

Math

MGSM: 86.9% (SR)
MATH: 68.0% (SR)
MathVista: 57.3% (SR)

Multimodal

AI2D: 92.3% (SR)
ChartQA: 85.5% (SR)
InfographicsQA: 56.8% (SR)

AA Evaluation Indices

Intelligence Index: 6.2
MMLU-Pro: 0.7
MATH-500: 0.6
GPQA: 0.4
SciCode: 0.2
LiveCodeBench: 0.2
AIME: 0.1
HLE: 0.0

LLM Stats Category Scores

Language: 90
Legal: 90
Finance: 90
Image To Text: 80
Math: 70
Multimodal: 70
Reasoning: 70
Healthcare: 70
Vision: 70
General: 60
Physics: 50
Biology: 50
Chemistry: 50

Pricing

Input Price: $1.38 / 1M tokens
Output Price: $1.38 / 1M tokens
Blended Price (3:1): $1.38 / 1M tokens
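
The blended figure can be reproduced with a short sketch. It assumes that "3:1" means three input tokens for every output token and that the blend is a weighted average of the two prices; the page does not spell out the formula, and the function name `blended_price` is hypothetical.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens.

    Assumption: the 3:1 ratio is input tokens to output tokens.
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


# Prices listed for this model, in USD per 1M tokens.
primary = blended_price(1.38, 1.38)
print(f"Blended (3:1): ${primary:.2f} / 1M tokens")
```

Because the input and output prices are identical here, any ratio gives the same blended price; the ratio only matters for providers that charge differently for input and output.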

Speed

Tokens/sec: 55.0
Time to First Token: 0.60s
Time to Answer: 0.60s
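
As a rough illustration of what these figures mean for a full response, the sketch below estimates total generation time as time to first token plus output length divided by throughput. This is a simplified linear model assuming constant throughput, not how the page computes its metrics; `estimated_response_seconds` is a hypothetical helper.

```python
def estimated_response_seconds(output_tokens: int,
                               tokens_per_sec: float = 55.0,
                               time_to_first_token: float = 0.60) -> float:
    """Estimate wall-clock time for a response of `output_tokens` tokens.

    Assumption: throughput stays constant after the first token arrives.
    """
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return time_to_first_token + output_tokens / tokens_per_sec


for n in (100, 500, 1000):
    print(f"{n} tokens: about {estimated_response_seconds(n):.1f}s")
```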

Provider Price Ranking

5 providers

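Cheapest: IO.NET · Most Expensive: Azure

| Rank | Provider | Input | Output |
| --- | --- | --- | --- |
| 1 | IO.NET (cheapest) | $0.35 | $0.40 |
| 2 | Vercel AI Gateway | $0.72 | $0.72 |
| 3 | Meta (primary) | $1.38 | $1.38 |
| 4 | Azure Cognitive Services | $2.04 | $2.04 |
| 5 | Azure | $2.04 | $2.04 |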

Compare pricing across different API providers for this model.
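
One way to compare the listed providers for a concrete workload is sketched below. It assumes the provider prices are in USD per 1M tokens, which the ranking does not state explicitly but which matches the Meta row against the Pricing section; the names `PROVIDERS`, `workload_cost`, and `rank_providers` are hypothetical.

```python
# (input price, output price) per provider, copied from the ranking.
# Assumption: USD per 1M tokens.
PROVIDERS = {
    "IO.NET": (0.35, 0.40),
    "Vercel AI Gateway": (0.72, 0.72),
    "Meta": (1.38, 1.38),
    "Azure Cognitive Services": (2.04, 2.04),
    "Azure": (2.04, 2.04),
}


def workload_cost(input_tokens: int, output_tokens: int,
                  input_price: float, output_price: float) -> float:
    """Cost in USD for a workload, given per-1M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rank_providers(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Return (provider, cost) pairs sorted from cheapest to most expensive."""
    costs = {
        name: workload_cost(input_tokens, output_tokens, in_price, out_price)
        for name, (in_price, out_price) in PROVIDERS.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


# Example workload: 3M input tokens and 1M output tokens.
for name, cost in rank_providers(3_000_000, 1_000_000):
    print(f"{name}: ${cost:.2f}")
```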

External Sources