Grok-1.5V

xAI, Grok, Proprietary

Description

A multimodal model capable of processing text and visual information, including documents, diagrams, charts, screenshots, and photographs. Notable for strong real-world spatial understanding capabilities.

Release date

2024-04-12

Parameters

—

Context length

—

Modalities

—

Capability radar

Axes: general, coding, reasoning, science (est.), agents, multimodal

The science axis uses a reasoning proxy when dedicated science benchmarks are not available.

Rankings

Domain	Rank	Score	Source
Multimodal Ranking	26	82.0	LS

Benchmark scores (LLM Stats)

Category	Benchmark	Score
General	MMMU	53.6% (Aut.)
Image To Text	DocVQA	85.6% (Aut.)
Image To Text	TextVQA	78.1% (Aut.)
Math	MathVista	52.8% (Aut.)
Multimodal	AI2D	88.3% (Aut.)
Multimodal	ChartQA	76.1% (Aut.)
Spatial Reasoning	RealWorldQA	68.7% (Aut.)

AA evaluation indices

No AA evaluation data available

LLM Stats scores by category

Categories: Image To Text, Spatial Reasoning, Vision, Multimodal, Reasoning, General, Healthcare, Math

Pricing

No pricing data available

Speed

No speed data available

Available providers

(LS internal units)

No provider data available

External sources

LLM Stats