AI Leaderboard

AI Models & Tools Benchmark

Real-time tracking of global AI model benchmarks and rankings.

LLM Benchmarks

Rank  Model                                 Provider                   Ctx   Code Arena  Chat Arena  GPQA  Pricing (In/Out)
----  ------------------------------------  -------------------------  ----  ----------  ----------  ----  ----------------
#1    Claude Opus 4.6                       Anthropic                  200k  2002        1476        91.3  $5 / $25
#2    Gemini 3.1 Pro                        Google                     1.0M  1855        1222        94.3  $2.5 / $15
#3    Claude Opus 4.5                       Anthropic                  200k  1582        1345        87    $5 / $25
#4    Gemini 3 Flash                        Google                     1.0M  1581        1172        90.4  $0.5 / $3
#5    Gemini 3 Pro                          Google                     N/A   1579        1045        91.9  N/A
#6    GPT-5.2                               OpenAI                     400k  1505        1172        92.4  $1.75 / $14
#7    Claude Sonnet 4.6                     Anthropic                  200k  1371        941         89.9  $3 / $15
#8    Qwen3.5-397B-A17B                     Alibaba Cloud / Qwen Team  262k  1214        1067        88.4  $0.6 / $3.6
#9    Qwen3.5-122B-A10B                     Alibaba Cloud / Qwen Team  262k  1136        -           86.6  $0.4 / $3.2
#10   Claude Sonnet 4.5                     Anthropic                  200k  1103        1294        83.4  $3 / $15
#11   Claude Opus 4.1                       Anthropic                  200k  1026        1183        80.9  $15 / $75
#12   Gemini 3.1 Flash-Lite                 Google                     1.0M  979         328         86.9  $0.25 / $1.5
#13   Claude Opus 4                         Anthropic                  200k  932         1088        79.6  $15 / $75
#14   GPT-4.1 mini                          OpenAI                     1.0M  917         528         65    $0.4 / $1.6
#15   Claude Sonnet 4                       Anthropic                  200k  889         856         75.4  $3 / $15
#16   Claude Haiku 4.5                      Anthropic                  200k  797         1176        73    $1 / $5
#17   GPT-4.1                               OpenAI                     1.0M  656         1219        66.3  $2 / $8
#18   Claude 3.7 Sonnet                     Anthropic                  200k  632         892         84.8  $3 / $15
#19   Mistral Large 3 (675B Instruct 2512)  Mistral AI                 262k  625         1078        43.9  $0.5 / $1.5
#20   GPT OSS 120B High                     OpenAI                     131k  528         974         80.9  $0.1 / $0.5