Gemini 1.5 Pro (May '24)

Google / Gemini

Description

Gemini 1.5 Pro is a mid-size multimodal model optimized for a wide range of reasoning tasks. It can process large amounts of data at once, including 2 hours of video, 19 hours of audio, codebases with 60,000 lines of code, or 2,000 pages of text.

Release date

2024-05-15

Parameter count

—

Context length

—

Supported modalities

—

Capability radar chart

[Radar chart; axes: general, coding, reasoning, science (estimated), agents, multimodal]

When no dedicated science evaluation is available, the science score is estimated using reasoning ability as a proxy.

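Leaderboard rankings

Domain	Rank	Score	Source
Coding	322	25.0	AA
General	369	30.0	AA
Math reasoning	238	37.0	AA
Multimodal	37	79.0	LS
Reasoning	4	93.0	LS
Science	393	28.0	AA

Sources: AA = Artificial Analysis, LS = LLM Stats.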

Benchmark scores (LLM Stats)

All scores in this table are self-reported.

Category	Benchmark	Score
Biology	GPQA	59.1%
Code	HumanEval	84.1%
Finance	MMLU	85.9%
Finance	MMLU-Pro	75.8%
General	Natural2Code	85.4%
General	MRCR	82.6%
General	MMMU	65.9%
General	Vibe-Eval	53.9%
Healthcare	WMT23	75.1%
Language	FLEURS	93.3%
Language	BIG-Bench Hard	89.2%
Math	GSM8k	90.8%
Math	MGSM	87.5%
Math	MATH	86.5%
Math	DROP	74.9%
Math	MathVista	68.1%
Math	FunctionalMATH	64.6%
Math	PhysicsFinals	63.9%
Math	HiddenMath	52.0%
Math	AMC_2022_23	46.4%
Multimodal	Video-MME	78.6%
Reasoning	HellaSwag	93.3%
Safety	XSTest	98.8%

Artificial Analysis evaluation indices

Metric	Score
Coding Index	19.8
Intelligence Index	6.3
MATH-500	0.7
MMLU-Pro	0.7
GPQA	0.4
SciCode	0.3
LiveCodeBench	0.2
AIME	0.1
HLE	0.0

LLM Stats category scores

Safety: 100

Other categories (scores not shown): Speech To Text, Language, Legal, Long Context, Math, Reasoning, Finance, Healthcare, Code, Multimodal, General, Vision, Physics, Biology, Chemistry

Pricing

Input price: Free

Output price: Free

Blended price (3:1): Free
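The page does not define the 3:1 blended price. A minimal sketch, assuming it is a weighted average of three parts input price to one part output price; the function name `blended_price` and the non-zero prices below are hypothetical, chosen only to illustrate the arithmetic:

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of input and output prices.

    Assumption: "3:1" means three parts input to one part output.
    Prices are in any consistent unit (for example USD per million tokens).
    """
    total_weight = input_weight + output_weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Both prices are listed as free on this page, so the blend is also zero.
free_blend = blended_price(0.0, 0.0)

# Hypothetical prices, not taken from this page: (3 * 1.0 + 1 * 5.0) / 4
example_blend = blended_price(1.0, 5.0)

print(free_blend, example_blend)
```

Under this assumption, a model that is free for both input and output is also free at the blended rate, which matches the three values listed above.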

Speed

Tokens per second: 0.0

Time to first token: 0.00s

Time to first answer: 0.00s

Provider price ranking

No provider data available

External links

Artificial Analysis