Real-time tracking of global AI model benchmarks and rankings.

| Rank | Model | Vendor | Context | Code Arena | Chat Arena | GPQA | Pricing (In/Out) |
|---|---|---|---|---|---|---|---|
| #1 | Claude Opus 4.6 | Anthropic | 1.0M | 2101 | 1491 | 91.3 | $5 / $25 |
| #2 | Gemini 3.1 Pro | Google | 1.0M | 2076 | 1222 | 94.3 | $2.5 / $15 |
| #3 | Claude Opus 4.7 | Anthropic | 1.0M | 1923 | 358 | 94.2 | $5 / $25 |
| #4 | Gemini 3 Flash | Google | 1.0M | 1703 | 1143 | 90.4 | $0.5 / $3 |
| #5 | Claude Sonnet 4.6 | Anthropic | 200k | 1697 | 956 | 89.9 | $3 / $15 |
| #6 | Claude Opus 4.5 | Anthropic | N/A | 1614 | 1342 | 87 | N/A |
| #7 | Gemini 3 Pro | Google | N/A | 1579 | 1045 | 91.9 | N/A |
| #8 | GPT-5.2 | OpenAI | 400k | 1519 | 1170 | 92.4 | $1.75 / $14 |
| #9 | Qwen3.5-397B-A17B | Alibaba Cloud / Qwen Team | 262k | 1289 | 963 | 88.4 | $0.6 / $3.6 |
| #10 | Gemma 4 26B-A4B | Google | 262k | 1251 | 594 | 82.3 | $0.13 / $0.4 |
| #11 | Claude Sonnet 4.5 | Anthropic | 200k | 1248 | 1308 | 83.4 | $3 / $15 |
| #12 | Claude Opus 4.1 | Anthropic | N/A | 1189 | 1180 | 80.9 | N/A |
| #13 | Qwen3.6 Plus | Alibaba Cloud / Qwen Team | 1.0M | 1162 | 750 | 90.4 | $0.5 / $3 |
| #14 | Gemma 4 31B | Google | 262k | 1134 | 881 | 84.3 | $0.14 / $0.4 |
| #15 | GPT-4.1 mini | OpenAI | 1.0M | 1043 | 528 | 65 | $0.4 / $1.6 |
| #16 | Gemini 3.1 Flash-Lite | Google | 1.0M | 977 | 756 | 86.9 | $0.25 / $1.5 |
| #17 | Claude Haiku 4.5 | Anthropic | 200k | 948 | 1188 | 73 | $1 / $5 |
| #18 | Claude Opus 4 | Anthropic | N/A | 932 | 1088 | 79.6 | N/A |
| #19 | Claude Sonnet 4 | Anthropic | N/A | 882 | 856 | 75.4 | N/A |
| #20 | GPT-4.1 | OpenAI | 1.0M | 842 | 1237 | 66.3 | $2 / $8 |

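
To compare models on cost, the Pricing (In/Out) column can be turned into a per-request estimate. The sketch below is illustrative only: the table does not state its pricing unit, so the code assumes US dollars per one million tokens (exposed as the `tokens_per_unit` parameter), and the helper names `parse_pricing` and `estimate_cost` are hypothetical.

```python
from typing import Optional, Tuple


def parse_pricing(cell: str) -> Optional[Tuple[float, float]]:
    """Parse a pricing cell such as '$5 / $25' into (input, output) prices.

    Returns None when the cell is 'N/A'.
    """
    cell = cell.strip()
    if cell == "N/A":
        return None
    input_part, output_part = cell.split("/")
    return (
        float(input_part.strip().lstrip("$")),
        float(output_part.strip().lstrip("$")),
    )


def estimate_cost(
    cell: str,
    input_tokens: int,
    output_tokens: int,
    tokens_per_unit: int = 1_000_000,  # ASSUMPTION: prices are per 1M tokens
) -> Optional[float]:
    """Estimate the cost of one request, or None if pricing is unavailable."""
    prices = parse_pricing(cell)
    if prices is None:
        return None
    input_price, output_price = prices
    return (input_tokens * input_price + output_tokens * output_price) / tokens_per_unit


# Example: 200k input tokens and 50k output tokens at '$2.5 / $15'
print(estimate_cost("$2.5 / $15", 200_000, 50_000))  # prints 1.25
```

If the unit assumption is wrong (for example, prices per 1k tokens), only `tokens_per_unit` needs to change.
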
Anthropic's terminal-based agent tool. Excels at understanding complex logic, debugging, and executing shell commands. Aimed at power users.
AI-native editor known for deep context understanding and Agent mode. Supports multi-model switching (Claude 3.5/GPT-4).
Google's Agent-first IDE with built-in Manager & Editor views. Capable of autonomously completing complex engineering tasks via AI Agents.
Core model powering Copilot, offering raw API access. Excels at translating natural language to code, with multi-language support.
Amazon's AI IDE built around spec-driven development. It generates requirement specs before implementing code and excels at structured development for complex projects.
Industry-standard AI assistant, deeply integrated into VS Code and JetBrains IDEs. Its new Agent mode handles complex tasks autonomously.
Next-gen AI IDE focusing on Flow mode. Cascade engine supports multi-model collaboration and deep code understanding.