Claude

Claude Opus 4

Model ID: claude-opus-4-20250514

2025-05-22, Proprietary Model

API

Overall: No. 17

Popularity: No. 56

Claude Opus 4 was benchmarked as the world's best coding model at the time of its release, sustaining performance on complex, long-running tasks and agent workflows. It posted leading software engineering results, scoring 72.5% on SWE-bench and 43.2% on Terminal-bench. Opus 4 supports extended agentic workflows, working continuously for hours across thousands of task steps without degradation.

Knowledge Cutoff

2025-01-31

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory

200K in / 32K out

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.
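As a rough illustration of how these limits might be used, the sketch below checks a request against the listed figures before it is sent. It reads "200K" and "32K" as 200,000 and 32,000 tokens, which is an assumption, and the function name `check_request` is hypothetical. Token counts are expected to come from whatever tokenizer or counting method you already use.

```python
# Limits as listed above; reading "200K"/"32K" as 200,000 and 32,000 tokens is an assumption.
MAX_INPUT_TOKENS = 200_000
MAX_OUTPUT_TOKENS = 32_000


def check_request(input_tokens: int, requested_output_tokens: int) -> list[str]:
    """Return a list of problems; an empty list means the request fits the listed limits."""
    problems = []
    if input_tokens > MAX_INPUT_TOKENS:
        over = input_tokens - MAX_INPUT_TOKENS
        problems.append(f"input exceeds limit by {over} tokens")
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        over = requested_output_tokens - MAX_OUTPUT_TOKENS
        problems.append(f"requested output exceeds limit by {over} tokens")
    return problems


print(check_request(150_000, 8_000))   # fits both limits
print(check_request(250_000, 40_000))  # exceeds both limits
```

The listing does not say whether input and output share a single window, so the sketch checks the two limits independently.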

Cost/1M Tokens

$15 in / $75 out

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).
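To make the pricing concrete, here is a minimal sketch that estimates the cost of a single request from the listed prices. The function name `request_cost_usd` and the example token counts are illustrative assumptions. Actual billing may differ (for example through caching or batch discounts, which are not covered here).

```python
# Listed prices in USD per 1 million tokens.
INPUT_PRICE_PER_M = 15.0
OUTPUT_PRICE_PER_M = 75.0


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its input and output token counts."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: 10,000 input tokens ($0.15) plus 2,000 output tokens ($0.15).
print(f"${request_cost_usd(10_000, 2_000):.2f}")
```

Because output tokens cost five times as much as input tokens, long generated responses dominate the bill even when the prompt is large.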

Source: Official Docs, OpenRouter

AI Performance Evaluation

Arena Overall Score

1424 ±4

As of 2026-04-02

Overall Rank

No. 56

37,191 Votes

Arena by Ability

Hard Prompts: 1456 ±6, No. 44
Expert Knowledge: 1447 ±14, No. 50
Instruction Following: 1442 ±7, No. 27
Conversation Memory: 1437 ±8, No. 46
Creative: 1431 ±9, No. 26
Coding: 1498 ±8, No. 30
Math: 1418 ±12, No. 61

Arena by Occupation

Creative Writing: 1429 ±7, No. 30
Social Sciences: 1440 ±8, No. 61
Media: 1420 ±8, No. 31
Business: 1412 ±8, No. 71
Healthcare: 1447 ±13, No. 56
Legal: 1435 ±12, No. 56
Software: 1467 ±6, No. 44
Mathematics: 1423 ±13, No. 63

Source: Arena Intelligence

Reasoning Ability

AA Intelligence Index: 39% (↑0%)
MMLU-Pro: 87% (↑5%)
GPQA Diamond: 80% (↓2%)
HLE: 12% (↓5%)

Math

AA Math Index: 73% (↓1%)
MATH-500: 98% (↑4%)
AIME 2024: 76% (↑16%)
AIME 2025: 73% (↓1%)

Coding Ability

AA Coding Index: 34% (↓2%)
LiveCodeBench: 64% (↓2%)
SciCode: 40% (↓2%)
TerminalBench: 31% (↓3%)

Instruction Following

IFBench: 54% (↓4%)
Hallucination Rate (HHEM): 12% (↑1%)
Factual Consistency (HHEM): 88% (↓1%)

Long Context

AA-LCR: 34% (↓30%)

Agentic AI Ability

TAU2: 73% (↑2%)

Speed

Standard Mode: 34 tok/sec (↓44), First Output 1.33s (Artificial Analysis)
Reasoning Mode: 36 tok/sec (↓37), First Output 7.11s (Artificial Analysis)
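The two speed figures can be combined into a rough end-to-end estimate for a response of a given length. The sketch below assumes "First Output" is the delay before the first token arrives and that generation then proceeds at the listed steady rate. Both are simplifying assumptions, and the function name `estimated_response_seconds` is hypothetical.

```python
def estimated_response_seconds(
    output_tokens: int, tokens_per_sec: float, first_output_sec: float
) -> float:
    """Approximate total time: delay to first output plus steady-rate generation time."""
    return first_output_sec + output_tokens / tokens_per_sec


# Listed figures: Standard Mode 34 tok/sec, 1.33s; Reasoning Mode 36 tok/sec, 7.11s.
standard = estimated_response_seconds(1_000, tokens_per_sec=34, first_output_sec=1.33)
reasoning = estimated_response_seconds(1_000, tokens_per_sec=36, first_output_sec=7.11)

print(f"Standard Mode, 1,000 tokens: about {standard:.1f}s")
print(f"Reasoning Mode, 1,000 tokens: about {reasoning:.1f}s")
```

Under these assumptions, Reasoning Mode's slightly higher throughput does not offset its longer initial delay for a 1,000-token response. Measured latency will vary with load and prompt size.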

Source: Artificial Analysis, Vectara HHEM
