valueai · Model Intelligence

AI Models


BENCHMARK COMPARISON

MODEL              PROVIDER   MMLU    HUMANEVAL  GPQA    CONTEXT
GPT-4o             OpenAI     88.7%   90.2%      53.6%   128K
GPT-4o mini        OpenAI     82%     87.2%      n/a     128K
o1                 OpenAI     92.3%   92.4%      75.7%   200K
Claude 3.5 Sonnet  Anthropic  88.3%   93.7%      65%     200K
Claude 3.5 Haiku   Anthropic  82%     87.1%      n/a     200K
Gemini 1.5 Pro     Google     85.9%   84.1%      n/a     2000K
Gemini 1.5 Flash   Google     78.9%   74.4%      n/a     1000K
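The comparison above can also be queried programmatically. A minimal sketch, with the benchmark figures hand-copied from the table (the `MODELS` dict and `best_on` helper are illustrative, not part of any vendor API; `None` marks benchmarks the source does not report):

```python
# Benchmark figures copied from the comparison table above; None = not reported.
MODELS = {
    "GPT-4o":            {"mmlu": 88.7, "humaneval": 90.2, "gpqa": 53.6, "context_k": 128},
    "GPT-4o mini":       {"mmlu": 82.0, "humaneval": 87.2, "gpqa": None, "context_k": 128},
    "o1":                {"mmlu": 92.3, "humaneval": 92.4, "gpqa": 75.7, "context_k": 200},
    "Claude 3.5 Sonnet": {"mmlu": 88.3, "humaneval": 93.7, "gpqa": 65.0, "context_k": 200},
    "Claude 3.5 Haiku":  {"mmlu": 82.0, "humaneval": 87.1, "gpqa": None, "context_k": 200},
    "Gemini 1.5 Pro":    {"mmlu": 85.9, "humaneval": 84.1, "gpqa": None, "context_k": 2000},
    "Gemini 1.5 Flash":  {"mmlu": 78.9, "humaneval": 74.4, "gpqa": None, "context_k": 1000},
}

def best_on(benchmark: str) -> str:
    """Return the model with the highest reported score on `benchmark`,
    skipping models that do not report it."""
    scored = {m: v[benchmark] for m, v in MODELS.items() if v[benchmark] is not None}
    return max(scored, key=scored.get)

print(best_on("humaneval"))  # Claude 3.5 Sonnet, per the table above
print(best_on("mmlu"))       # o1
```

Note that leaders differ by benchmark: o1 tops MMLU and GPQA, while Claude 3.5 Sonnet tops HumanEval.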

MODEL DETAILS

GPT-4o (OpenAI)

Context: 128K
MMLU 88.7% · HumanEval 90.2% · GPQA 53.6%
Capabilities: Vision · Code · Function calling · JSON mode

API: $2.50/1M in · $10.00/1M out

GPT-4o mini (OpenAI)

Context: 128K
MMLU 82% · HumanEval 87.2%
Capabilities: Vision · Code · Function calling

API: $0.15/1M in · $0.60/1M out

o1 (OpenAI)

Context: 200K
MMLU 92.3% · HumanEval 92.4% · GPQA 75.7%
Capabilities: Reasoning · Code · Math · Science

API: $15.00/1M in · $60.00/1M out

Claude 3.5 Sonnet (Anthropic)

Context: 200K
MMLU 88.3% · HumanEval 93.7% · GPQA 65%
Capabilities: Vision · Code · Function calling · Long context

API: $3.00/1M in · $15.00/1M out

Claude 3.5 Haiku (Anthropic)

Context: 200K
MMLU 82% · HumanEval 87.1%
Capabilities: Vision · Code · Function calling

API: $0.80/1M in · $4.00/1M out

Gemini 1.5 Pro (Google)

Context: 2000K
MMLU 85.9% · HumanEval 84.1%
Capabilities: Vision · Code · Long context · Multimodal

API: $3.50/1M in · $10.50/1M out

Gemini 1.5 Flash (Google)

Context: 1000K
MMLU 78.9% · HumanEval 74.4%
Capabilities: Vision · Code · Long context

API: $0.075/1M in · $0.30/1M out
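The per-million-token prices listed above translate into a per-request cost as tokens_in / 1,000,000 × price_in + tokens_out / 1,000,000 × price_out. A minimal sketch of that arithmetic (the `request_cost` helper is illustrative, not a vendor SDK function; rates shown are GPT-4o's from the listing above):

```python
def request_cost(tokens_in: int, tokens_out: int,
                 price_in: float, price_out: float) -> float:
    """Estimate the USD cost of one API call from per-1M-token prices."""
    return tokens_in / 1_000_000 * price_in + tokens_out / 1_000_000 * price_out

# GPT-4o rates from the listing above: $2.50/1M input, $10.00/1M output.
cost = request_cost(tokens_in=10_000, tokens_out=2_000,
                    price_in=2.50, price_out=10.00)
print(f"${cost:.3f}")  # $0.045
```

At these rates a 10K-in / 2K-out call costs about 4.5 cents on GPT-4o, versus roughly a tenth of a cent on Gemini 1.5 Flash ($0.075/1M in, $0.30/1M out).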