| Model | MMLU | HumanEval | GPQA | Context (tokens) |
|---|---|---|---|---|
| GPT-4o (OpenAI) | 88.7% | 90.2% | 53.6% | 128K |
| GPT-4o mini (OpenAI) | 82% | 87.2% | — | 128K |
| o1 (OpenAI) | 92.3% | 92.4% | 75.7% | 200K |
| Claude 3.5 Sonnet (Anthropic) | 88.3% | 93.7% | 65% | 200K |
| Claude 3.5 Haiku (Anthropic) | 82% | 87.1% | — | 200K |
| Gemini 1.5 Pro | 85.9% | 84.1% | — | 2000K |
| Gemini 1.5 Flash | 78.9% | 74.4% | — | 1000K |

| Model | API input ($/1M tokens) | API output ($/1M tokens) |
|---|---|---|
| GPT-4o (OpenAI) | $2.5 | $10 |
| GPT-4o mini (OpenAI) | $0.15 | $0.6 |
| o1 (OpenAI) | $15 | $60 |
| Claude 3.5 Sonnet (Anthropic) | $3 | $15 |
| Claude 3.5 Haiku (Anthropic) | $0.8 | $4 |
| Gemini 1.5 Pro | $3.5 | $10.5 |
| Gemini 1.5 Flash | $0.075 | $0.3 |
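
The per-1M prices above can be turned into a rough per-request cost. The following is a minimal sketch, not an official calculator: the `PRICES_PER_1M` dictionary and the `estimate_cost` helper are hypothetical names introduced here, the "1M" unit is assumed to mean one million tokens, and each price is paired with a model on the assumption that the pricing entries follow the same order as the benchmark rows.

```python
# USD per 1M tokens as (input, output), copied from the pricing listed above.
# Assumption: prices are matched to models by listing order.
PRICES_PER_1M = {
    "GPT-4o": (2.5, 10.0),
    "GPT-4o mini": (0.15, 0.6),
    "o1": (15.0, 60.0),
    "Claude 3.5 Sonnet": (3.0, 15.0),
    "Claude 3.5 Haiku": (0.8, 4.0),
    "Gemini 1.5 Pro": (3.5, 10.5),
    "Gemini 1.5 Flash": (0.075, 0.3),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request (hypothetical helper)."""
    price_in, price_out = PRICES_PER_1M[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Example: a 10,000-token prompt with a 2,000-token reply.
    for name in PRICES_PER_1M:
        print(f"{name}: ${estimate_cost(name, 10_000, 2_000):.4f}")
```

Because output tokens are priced several times higher than input tokens for every model listed, the reply length tends to dominate the estimate for generation-heavy workloads.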