Claude

Claude Sonnet 4

Model ID: claude-sonnet-4-20250514
2025-05-22
Proprietary Model
API
Overall: No. 18
Popularity: No. 94

Claude Sonnet 4 significantly improves on its predecessor, Sonnet 3.7, in both coding and reasoning tasks, with greater precision and controllability. It achieves state-of-the-art performance on SWE-bench (72.7%) while balancing capability against computational efficiency, which makes it suitable for applications ranging from routine coding tasks to complex software development projects. Key improvements include better autonomous codebase navigation, lower error rates in agent-driven workflows, and more reliable adherence to intricate instructions. Sonnet 4 is optimized for practical everyday use, offering advanced reasoning while remaining efficient and responsive across internal and external use cases. Read more in the [blog post](https://www.anthropic.com/news/claude-4).

Knowledge Cutoff
2025-01-31

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory
200K in / 64K out

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.
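
As a rough illustration of how the listed limits might be used, the sketch below checks a request against them before sending it. It assumes that "K" means 1,000 tokens and that the input and output figures are independent caps; the listing does not say whether output counts against the input figure, and `fits_limits` is a hypothetical helper, not part of any SDK.

```python
# Limits as listed above. Assumption: "K" means 1,000 tokens.
MAX_INPUT_TOKENS = 200_000
MAX_OUTPUT_TOKENS = 64_000


def fits_limits(input_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if both counts are within the listed limits.

    Hypothetical helper. It treats the two figures as independent caps,
    which is an assumption about how the listing should be read.
    """
    if input_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens <= MAX_INPUT_TOKENS
            and requested_output_tokens <= MAX_OUTPUT_TOKENS)


if __name__ == "__main__":
    print(fits_limits(150_000, 8_000))   # within both limits
    print(fits_limits(250_000, 1_000))   # input exceeds the listed limit
```

Token counts depend on the tokenizer, so a check like this is only as accurate as the count passed in.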

Cost/1M Tokens
$3 in / $15 out

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).
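
To make the pricing concrete, here is a minimal sketch that estimates the cost of a single request from its token counts, using the $3 input and $15 output prices per 1 million tokens listed above. The function name `estimate_cost_usd` and the example token counts are illustrative assumptions; actual billing may include factors this listing does not cover.

```python
# Prices from the listing above, in USD per 1 million tokens.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MTOK
             + output_tokens * OUTPUT_USD_PER_MTOK)
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 10,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost_usd(10_000, 2_000):.4f}")
```

Because output tokens cost five times as much as input tokens at these prices, long replies tend to dominate the bill even when prompts are large.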

AI Performance Evaluation

Arena Overall Score: 1399 ±4 (as of 2026-04-02)
Overall Rank: No. 94
Votes: 35,417
Arena by Ability
Hard Prompts: 1431 ±6, No. 75
Expert Knowledge: 1435 ±14, No. 66
Instruction Following: 1414 ±7, No. 58
Conversation Memory: 1419 ±8, No. 66
Creative: 1395 ±9, No. 59
Coding: 1473 ±8, No. 50
Math: 1402 ±13, No. 88
Arena by Occupation
Creative Writing: 1397 ±7, No. 68
Social Sciences: 1418 ±8, No. 87
Media: 1388 ±8, No. 66
Business: 1385 ±8, No. 107
Healthcare: 1420 ±13, No. 95
Legal: 1409 ±13, No. 87
Software: 1443 ±6, No. 77
Mathematics: 1409 ±13, No. 91
Reasoning Ability
AA Intelligence Index: 39% (↑0%)
MMLU-Pro: 84% (↑2%)
GPQA Diamond: 78% (↓4%)
HLE: 9.6% (↓7%)
Math
AA Math Index: 74% (↑0%)
MATH-500: 99% (↑5%)
AIME 2024: 77% (↑18%)
AIME 2025: 74% (↑0%)
Coding Ability
AA Coding Index: 34% (↓2%)
LiveCodeBench: 66% (↑0%)
SciCode: 40% (↓2%)
TerminalBench: 31% (↓3%)
Instruction Following
IFBench: 55% (↓3%)
Hallucination Rate (HHEM): 10% (↑0%)
Factual Consistency (HHEM): 90% (↑0%)
Long Context
AA-LCR: 65% (↑1%)
Agentic AI Ability
TAU2: 65% (↓7%)
Speed
Standard Mode: 45 tok/sec (↓33), first output 0.80 s (source: Artificial Analysis)
Reasoning Mode: 46 tok/sec (↓27), first output 8.30 s (source: Artificial Analysis)
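
The speed figures above can be combined into a rough end-to-end estimate: the delay before the first output plus the time to stream the remaining tokens at the listed rate. The sketch below assumes a simple linear model (constant throughput after the first output) and reads "First Output" as the wait before any output arrives; `SpeedProfile` and `estimate_response_seconds` are hypothetical names, and real latency varies with load and prompt length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeedProfile:
    tokens_per_sec: float    # listed output speed
    first_output_sec: float  # listed delay before the first output


# Figures as listed above (attributed to Artificial Analysis).
STANDARD = SpeedProfile(tokens_per_sec=45.0, first_output_sec=0.80)
REASONING = SpeedProfile(tokens_per_sec=46.0, first_output_sec=8.30)


def estimate_response_seconds(profile: SpeedProfile, output_tokens: int) -> float:
    """Estimate total response time under a linear model (an assumption)."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return profile.first_output_sec + output_tokens / profile.tokens_per_sec


if __name__ == "__main__":
    for name, profile in (("standard", STANDARD), ("reasoning", REASONING)):
        seconds = estimate_response_seconds(profile, 900)
        print(f"{name}: about {seconds:.1f} s for 900 output tokens")
```

Under this model the two modes stream at nearly the same rate, so the difference in total time comes almost entirely from the longer wait before the first output in reasoning mode.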