Grok-1.5V

xAI | Grok | Proprietary

Description

A multimodal model that processes text and visual information, including documents, diagrams, charts, screenshots, and photographs. It is notable for strong real-world spatial understanding.

Release date

2024-04-12

Parameters

—

Context length

—

Modality

—

Capability radar

Axes: general, coding, reasoning, science (estimated), agents, multimodal

When no specialized science benchmark is available, the Science score is estimated from a reasoning proxy.

Benchmark scores (LLM Stats)

General

MMMU: 53.6% (self-reported)

Image To Text

DocVQA: 85.6% (self-reported)

TextVQA: 78.1% (self-reported)

Math

MathVista: 52.8% (self-reported)

Multimodal

AI2D: 88.3% (self-reported)

ChartQA: 76.1% (self-reported)

Spatial Reasoning

RealWorldQA: 68.7% (self-reported)

AA evaluation index

No AA evaluation data available

LLM Stats category scores

Categories: Image To Text, Spatial Reasoning, Vision, Multimodal, Reasoning, General, Healthcare, Math

Pricing

No pricing data available

Speed

No speed data available

Available providers

(LS internal units)

No provider data available

External links

LLM Stats
