AI Leaderboard

AI Models & Tools Benchmark

Real-time tracking of global AI model benchmarks and rankings.

LLM Benchmarks

Rank | Model                 | Provider                  | Context | Code Arena | Chat Arena | GPQA | Pricing (In/Out)
-----|-----------------------|---------------------------|---------|------------|------------|------|-----------------
#1   | Claude Opus 4.6       | Anthropic                 | 1.0M    | 1994       | 1491       | 91.3 | $5 / $25
#2   | Gemini 3.1 Pro        | Google                    | 1.0M    | 1918       | 1222       | 94.3 | $2.5 / $15
#3   | Claude Opus 4.5       | Anthropic                 | 200k    | 1590       | 1345       | 87   | $5 / $25
#4   | Gemini 3 Pro          | Google                    | N/A     | 1579       | 1045       | 91.9 | N/A
#5   | Gemini 3 Flash        | Google                    | 1.0M    | 1576       | 1172       | 90.4 | $0.5 / $3
#6   | GPT-5.2               | OpenAI                    | 400k    | 1506       | 1180       | 92.4 | $1.75 / $14
#7   | Claude Sonnet 4.6     | Anthropic                 | 200k    | 1382       | 941        | 89.9 | $3 / $15
#8   | Qwen3.5-397B-A17B     | Alibaba Cloud / Qwen Team | 262k    | 1217       | 1067       | 88.4 | $0.6 / $3.6
#9   | Qwen3.5-122B-A10B     | Alibaba Cloud / Qwen Team | 262k    | 1181       | 1965       | 86.6 | $0.4 / $3.2
#10  | Claude Sonnet 4.5     | Anthropic                 | 200k    | 1103       | 1294       | 83.4 | $3 / $15
#11  | Gemma 4 26B-A4B       | Google                    | 262k    | 1056       | -          | 82.3 | $0.13 / $0.4
#12  | Claude Opus 4.1       | Anthropic                 | 200k    | 1043       | 1183       | 80.9 | $15 / $75
#13  | Gemini 3.1 Flash-Lite | Google                    | 1.0M    | 972        | 328        | 86.9 | $0.25 / $1.5
#14  | Claude Opus 4         | Anthropic                 | 200k    | 932        | 1088       | 79.6 | $15 / $75
#15  | GPT-4.1 mini          | OpenAI                    | 1.0M    | 917        | 528        | 65   | $0.4 / $1.6
#16  | Claude Sonnet 4       | Anthropic                 | 200k    | 882        | 856        | 75.4 | $3 / $15
#17  | Claude Haiku 4.5      | Anthropic                 | 200k    | 807        | 1176       | 73   | $1 / $5
#18  | Qwen3.5-27B           | Alibaba Cloud / Qwen Team | 262k    | 675        | 1288       | 85.5 | $0.3 / $2.4
#19  | GPT-4.1               | OpenAI                    | 1.0M    | 653        | 1219       | 66.3 | $2 / $8
#20  | Claude 3.7 Sonnet     | Anthropic                 | 200k    | 632        | 892        | 84.8 | $3 / $15
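
The Pricing (In/Out) entries can be turned into a rough per-request cost estimate. The sketch below is illustrative only: the page does not state pricing units, so it assumes the figures are USD per 1M input and output tokens, and the names `Pricing`, `parse_pricing`, and `request_cost` are hypothetical helpers, not part of any leaderboard API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pricing:
    # ASSUMPTION: prices are USD per 1,000,000 tokens (not stated on the page).
    input_per_million: float
    output_per_million: float


def parse_pricing(cell: str) -> Optional[Pricing]:
    """Parse a pricing cell such as '$5 / $25'; return None for 'N/A'."""
    if cell.strip().upper() == "N/A":
        return None
    input_part, output_part = (part.strip().lstrip("$") for part in cell.split("/"))
    return Pricing(float(input_part), float(output_part))


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one request under the per-1M-token assumption."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


if __name__ == "__main__":
    # Pricing cells copied from the leaderboard rows above.
    cells = {
        "Claude Opus 4.6": "$5 / $25",
        "Gemini 3 Flash": "$0.5 / $3",
        "Gemini 3 Pro": "N/A",
    }
    for model, cell in cells.items():
        pricing = parse_pricing(cell)
        if pricing is None:
            print(f"{model}: pricing not listed")
        else:
            cost = request_cost(pricing, input_tokens=10_000, output_tokens=2_000)
            print(f"{model}: ${cost:.4f} per request")
```

Under that assumption, a request with 10,000 input tokens and 2,000 output tokens costs the input share plus the output share, so models with the same input price can still differ noticeably when responses are long.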