| Rank | Model | Score |
|---|---|---|
| #1 | OpenAI GPT-5.4 | 79.3 |
| #2 | Gemini 3.1 Pro | 78.5 |
| #3 | Anthropic Claude Sonnet 4.6 | 74.6 |
| #4 | Qwen Qwen3.6 Plus | 69.9 |
| #5 | Anthropic Claude Opus 4.6 | 69.9 |
| #6 | Z.ai GLM-5 | 67.9 |
| #7 | Z.ai GLM-5.1 | 63.2 |
| #8 | Moonshot AI Kimi K2.5 | 61.4 |
| #9 | Gemma 4 31B | 58.8 |
| #10 | MiniMax MiniMax M2.7 | 56.3 |
| #11 | Gemini 3.1 Flash Lite | 54.9 |
| #12 | Gemini 2.5 Pro | 51.6 |
| #13 | OpenAI GPT-5 Mini | 49.6 |
| #14 | MiniMax MiniMax M2.5 | 49.6 |
| #15 | Xiaomi MiMo-V2-Pro | 49.2 |
| #16 | Gemini 3 Flash | 48.3 |
| #17 | OpenAI GPT-5.4 Mini | 47.4 |
| #18 | Gemini 2.5 Flash | 47.3 |
| #19 | Gemini 2.5 Flash Lite | 47.0 |
| #20 | Anthropic Claude Sonnet 4.5 | 47.0 |
| #21 | Anthropic Claude Haiku 4.5 | 45.1 |
| #22 | DeepSeek DeepSeek V3.2 | 45.0 |
| #23 | OpenAI GPT-5 Nano | 44.3 |
| #24 | Anthropic Claude Opus 4.5 | 44.2 |
| #25 | OpenAI GPT-5.4 Nano | 39.1 |
| #26 | OpenAI GPT OSS 120B | 38.8 |