Claude

Claude Sonnet 4

Model ID: claude-sonnet-4-20250514

2025-05-22 Proprietary Model

API

Overall: No. 18

Popularity: No. 94

Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the-art performance on SWE-bench (72.7%), Sonnet 4 balances capability and computational efficiency, making it suitable for a broad range of applications, from routine coding tasks to complex software development projects. Key enhancements include improved autonomous codebase navigation, reduced error rates in agent-driven workflows, and increased reliability in following intricate instructions. Sonnet 4 is optimized for practical everyday use, providing advanced reasoning capabilities while maintaining efficiency and responsiveness in diverse internal and external scenarios. Read more in the [blog post](https://www.anthropic.com/news/claude-4).

Knowledge Cutoff

2025-01-31

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory

200K in / 64K out

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.

Cost/1M Tokens

$3 in / $15 out

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).
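The listed prices can be turned into a per-request estimate with simple arithmetic. The sketch below is illustrative only: the function name `estimate_cost_usd` and the example token counts are hypothetical, and it assumes flat pricing at the rates shown on this page, with no discounts, caching, or batch pricing.

```python
# Listed prices from this page, in USD per 1 million tokens.
INPUT_USD_PER_MTOK = 3.0
OUTPUT_USD_PER_MTOK = 15.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the listed flat per-token prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical request: 10,000 input tokens and 2,000 output tokens.
print(f"${estimate_cost_usd(10_000, 2_000):.4f}")  # prints $0.0600
```

Because output tokens cost five times as much as input tokens at these rates, long responses tend to dominate the bill even when the prompt is much larger.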

Source: Official Docs, OpenRouter

AI Performance Evaluation

Arena Overall Score

1399

±4

As of 2026-04-02

Overall Rank

No. 94

35,417 Votes

Arena by Ability

Hard Prompts

1431 ±6, No. 75

Expert Knowledge

1435 ±14, No. 66

Instruction Following

1414 ±7, No. 58

Conversation Memory

1419 ±8, No. 66

Creative

1395 ±9, No. 59

Coding

1473 ±8, No. 50

Math

1402 ±13, No. 88

Arena by Occupation

Creative Writing

1397 ±7, No. 68

Social Sciences

1418 ±8, No. 87

Media

1388 ±8, No. 66

Business

1385 ±8, No. 107

Healthcare

1420 ±13, No. 95

Legal

1409 ±13, No. 87

Software

1443 ±6, No. 77

Mathematics

1409 ±13, No. 91

Source: Arena Intelligence

Reasoning Ability

AA Intelligence Index

39% ↑0%

MMLU-Pro

84% ↑2%

GPQA Diamond

78% ↓4%

HLE

9.6% ↓7%

Math

AA Math Index

74% ↑0%

MATH-500

99% ↑5%

AIME 2024

77% ↑18%

AIME 2025

74% ↑0%

Coding Ability

AA Coding Index

34% ↓2%

LiveCodeBench

66% ↑0%

SciCode

40% ↓2%

TerminalBench

31% ↓3%

Instruction Following

IFBench

55% ↓3%

Hallucination Rate (HHEM)

10% ↑0%

Factual Consistency (HHEM)

90% ↑0%

Long Context

AA-LCR

65% ↑1%

Agentic AI Ability

TAU2

65% ↓7%

Speed

Standard Mode

45 tok/sec ↓33

First Output 0.80s

Artificial Analysis

Reasoning Mode

46 tok/sec ↓27

First Output 8.30s

Artificial Analysis
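The two speed figures can be combined into a rough end-to-end response time. The sketch below assumes that the tok/sec figure is the output rate after the first token arrives and that it stays constant for the whole response; both are simplifying assumptions, and the function name `estimate_response_seconds` and the 1,000-token response length are hypothetical.

```python
def estimate_response_seconds(
    output_tokens: int, first_output_s: float, tokens_per_s: float
) -> float:
    """Rough total time: wait for the first output, then stream at a constant rate."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return first_output_s + output_tokens / tokens_per_s


# Figures listed on this page; the 1,000-token response length is illustrative.
standard = estimate_response_seconds(1_000, first_output_s=0.80, tokens_per_s=45)
reasoning = estimate_response_seconds(1_000, first_output_s=8.30, tokens_per_s=46)
print(f"standard: {standard:.1f}s, reasoning: {reasoning:.1f}s")
```

Under these assumptions the streaming rates are nearly identical, so the gap between the two modes comes almost entirely from the longer wait for the first output in reasoning mode.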

Source: Artificial Analysis, Vectara HHEM
