A composite coding capability score by Artificial Analysis that aggregates LiveCodeBench, SWE-bench, Aider, and other real-world coding benchmarks.
ChatGPT
GPT-5.4
Google
Gemini 3.1 Pro
Anthropic
Claude Opus 4.6
Claude Opus 4.5
Meta AI
Muse Spark
Z.ai
GLM-5
Claude Sonnet 4.6
Qwen
Qwen3.6 Plus
Gemini 3 Flash
MiniMax
MiniMax M2.7
Xiaomi
MiMo-V2-Pro
Grok
Grok 4.20 (Reasoning)
Moonshot AI
Kimi K2.5
Gemma 4 31B
Claude Sonnet 4.5
GPT-5.4 Mini
MiniMax M2.5
Qwen3.5 397B A17B
DeepSeek
DeepSeek V3.2
Claude Opus 4.1
GPT-5.4 Nano
Claude Sonnet 4
Claude Opus 4
Claude Haiku 4.5
Gemini 2.5 Pro
NVIDIA
Nemotron 3 Super
Grok 4.1 Fast (Reasoning)
Gemini 3.1 Flash Lite
Baidu
ERNIE 5.0 Thinking
GPT-5 Nano
Gemini 2.5 Flash
Grok 4.20
GPT-5 Mini
GPT-4.1
GPT-5
Grok 4.1 Fast
Meituan
Longcat Flash Chat
Llama 4 Maverick
GPT OSS 120B
ERNIE 4.5 300B A47B
Gemini 2.5 Flash Lite
Amazon
Nova 2 Lite
Llama 4 Scout