Claude

Claude Opus 4

Model ID: claude-opus-4-20250514
Release date: 2025-05-22
Proprietary Model
API
Overall: No. 17
Popularity: No. 56

At the time of its release, Claude Opus 4 was benchmarked as the world’s best coding model, with sustained performance on complex, long-running tasks and agent workflows. It posted leading software engineering results on SWE-bench (72.5%) and Terminal-bench (43.2%). Opus 4 supports extended agentic workflows, handling thousands of task steps continuously for hours without degradation.

Knowledge Cutoff
2025-01-31

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory
200K in / 32K out

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.

Cost/1M Tokens
$15 in / $75 out

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).
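As a rough illustration of how the per-million-token prices above translate into the cost of a single request, here is a minimal sketch. The function name and the example token counts are hypothetical; only the $15 and $75 rates come from this page, and the sketch assumes billing is strictly linear in token counts.

```python
# Listed prices from this page, in USD per 1 million tokens.
INPUT_USD_PER_MTOK = 15.0
OUTPUT_USD_PER_MTOK = 75.0


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request (hypothetical helper, linear pricing assumed)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Example: a 10,000-token prompt that produces a 2,000-token reply.
cost = request_cost_usd(10_000, 2_000)
print(f"${cost:.2f}")  # prints "$0.30"
```

Because output tokens cost five times as much as input tokens here, a short reply to a long prompt can cost about the same as the prompt itself, as in the example.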

AI Performance Evaluation

Arena Overall Score: 1424 ±4 (as of 2026-04-02)
Overall Rank: No. 56 (37,191 votes)
Arena by Ability
Hard Prompts: 1456 ±6 (No. 44)
Expert Knowledge: 1447 ±14 (No. 50)
Instruction Following: 1442 ±7 (No. 27)
Conversation Memory: 1437 ±8 (No. 46)
Creative: 1431 ±9 (No. 26)
Coding: 1498 ±8 (No. 30)
Math: 1418 ±12 (No. 61)
Arena by Occupation
Creative Writing: 1429 ±7 (No. 30)
Social Sciences: 1440 ±8 (No. 61)
Media: 1420 ±8 (No. 31)
Business: 1412 ±8 (No. 71)
Healthcare: 1447 ±13 (No. 56)
Legal: 1435 ±12 (No. 56)
Software: 1467 ±6 (No. 44)
Mathematics: 1423 ±13 (No. 63)
Reasoning Ability
AA Intelligence Index: 39% (↑0%)
MMLU-Pro: 87% (↑5%)
GPQA Diamond: 80% (↓2%)
HLE: 12% (↓5%)
Math
AA Math Index: 73% (↓1%)
MATH-500: 98% (↑4%)
AIME 2024: 76% (↑16%)
AIME 2025: 73% (↓1%)
Coding Ability
AA Coding Index: 34% (↓2%)
LiveCodeBench: 64% (↓2%)
SciCode: 40% (↓2%)
TerminalBench: 31% (↓3%)
Instruction Following
IFBench: 54% (↓4%)
Hallucination Rate (HHEM): 12% (↑1%)
Factual Consistency (HHEM): 88% (↓1%)
Long Context
AA-LCR: 34% (↓30%)
Agentic AI Ability
TAU2: 73% (↑2%)
Speed
Standard Mode: 34 tok/sec (↓44), first output 1.33s (source: Artificial Analysis)
Reasoning Mode: 36 tok/sec (↓37), first output 7.11s (source: Artificial Analysis)
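To get a feel for what the speed figures above mean in practice, the sketch below estimates end-to-end response time for a reply of a given length. It assumes, as a simplification not stated on this page, that total time is the first-output delay plus output tokens divided by a constant throughput. The function name and the 1,000-token reply length are hypothetical.

```python
def estimated_seconds(output_tokens: int, first_output_s: float, tokens_per_s: float) -> float:
    """Rough response time: first-output delay plus steady-state generation time.

    Assumes constant throughput after the first token (a simplification).
    """
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return first_output_s + output_tokens / tokens_per_s


# Figures listed on this page for each mode.
standard = estimated_seconds(1_000, first_output_s=1.33, tokens_per_s=34)
reasoning = estimated_seconds(1_000, first_output_s=7.11, tokens_per_s=36)

print(f"standard:  {standard:.1f}s")   # prints "standard:  30.7s"
print(f"reasoning: {reasoning:.1f}s")  # prints "reasoning: 34.9s"
```

Under these assumptions, the longer first-output delay in reasoning mode outweighs its slightly higher throughput for a reply of this length.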