AI Comparison

Claude Opus 4 was benchmarked as the world's best coding model at the time of its release, with sustained performance on complex, long-running tasks and agent workflows. It achieved leading results on the software engineering benchmarks SWE-bench (72.5%) and Terminal-bench (43.2%). Opus 4 supports extended agentic workflows, handling thousands of task steps continuously for hours without degradation.

Author
Anthropic
Release Date
2025-05-22
Knowledge Cutoff
2025-05-01
License
Proprietary
I/O Format
Context Length
1M / 128K
API Price, Input / Output (USD per 1M tokens)
$15 / $75
How to Use
API Access
Output Speed
34 tok/s
Arena Overall
1424
Intelligence Index
39.0
Coding Index
34.0
Math Index
73.3
LiveBench
ForecastBench
60.6
GPQA Diamond
79.6%
HLE
11.7%
MMLU-Pro
87.3%
AIME 2025
73.3%
MATH-500
98.2%
LB Reasoning
LB Math
LB Data Analysis
LiveCodeBench
63.6%
LB Coding
LB Agentic
TAU2
73.4%
TerminalBench
31.1%
SciCode
39.8%
IFBench
53.7%
AA-LCR
0.3
Hallucination (HHEM)
12.0%
Factual Consistency (HHEM)
88.0%
LB Language
LB Instruction Following
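
The pricing and output-speed figures on this card can be turned into a rough per-request estimate. The sketch below is illustrative only. It assumes the $15 / $75 figures are USD per 1M input and output tokens respectively, and that the 34 tok/s output speed is constant. It ignores time to first token, caching, and batch discounts. The function names are hypothetical and not part of any API.

```python
# Rates copied from the card above; treat them as assumptions, not guarantees.
INPUT_USD_PER_MTOK = 15.0      # assumed: USD per 1M input tokens
OUTPUT_USD_PER_MTOK = 75.0     # assumed: USD per 1M output tokens
OUTPUT_TOKENS_PER_SEC = 34.0   # assumed: steady-state output speed


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Approximate request cost in USD from token counts (hypothetical helper)."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


def estimate_generation_seconds(output_tokens: int) -> float:
    """Approximate time to stream the output, ignoring time to first token."""
    return output_tokens / OUTPUT_TOKENS_PER_SEC


if __name__ == "__main__":
    # Example request: 20,000 input tokens and 2,000 output tokens.
    print(f"cost: ${estimate_cost_usd(20_000, 2_000):.2f}")
    print(f"time: {estimate_generation_seconds(2_000):.1f} s")
```

Under these assumptions, a request with 20,000 input tokens and 2,000 output tokens costs about $0.45 and takes roughly a minute to stream. Output tokens cost five times as much as input tokens, so long generations dominate the bill.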