AI Comparison

Claude Sonnet 4 improves substantially on its predecessor, Sonnet 3.7, in both coding and reasoning, and follows instructions with greater precision and controllability. With a reported state-of-the-art score of 72.7% on SWE-bench, Sonnet 4 balances capability against computational efficiency, which makes it suitable for work ranging from routine coding tasks to complex software development projects. Key improvements include better autonomous codebase navigation, lower error rates in agent-driven workflows, and more reliable handling of intricate instructions. Sonnet 4 is optimized for everyday practical use: it provides advanced reasoning while remaining efficient and responsive across internal and external use cases.

Author

Anthropic

Release Date

2025-05-22

Knowledge Cutoff

2025-01-31

License

Proprietary

I/O Format

Context Length

200K / 64K

API I/O (1M)

$3 / $15

How to Use

API Access

Output Speed

45 tok/s

Arena Overall

1399

Intelligence Index

38.7

Coding Index

34.1

Math Index

74.3

LiveBench

—

ForecastBench

58.6

GPQA Diamond

77.7%

HLE

9.6%

MMLU-Pro

84.2%

AIME 2025

74.3%

MATH-500

99.1%

LB Reasoning

—

LB Math

—

LB Data Analysis

—

LiveCodeBench

65.5%

LB Coding

—

LB Agentic

—

TAU2

64.6%

TerminalBench

31.1%

SciCode

40.0%

IFBench

54.7%

AA-LCR

0.6

Hallucination (HHEM)

10.3%

Factual Consistency (HHEM)

89.7%

LB Language

—

LB Instruction Following

—
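The pricing, context-length, and output-speed figures listed above can be combined into a rough per-request estimate. The sketch below is illustrative only. It assumes that "API I/O (1M)" means US dollars per one million input and output tokens, that "200K / 64K" means the maximum input and output token counts, and that the listed 45 tok/s holds for the whole response. The function name `estimate_request` is hypothetical, and the sketch ignores any discounts, caching, or latency before the first token.

```python
# Figures copied from the card above; how they are interpreted is an assumption.
INPUT_USD_PER_MTOK = 3.0       # "API I/O (1M)": $3 per 1M input tokens (assumed)
OUTPUT_USD_PER_MTOK = 15.0     # "API I/O (1M)": $15 per 1M output tokens (assumed)
MAX_INPUT_TOKENS = 200_000     # "Context Length": 200K (assumed input limit)
MAX_OUTPUT_TOKENS = 64_000     # "Context Length": 64K (assumed output limit)
OUTPUT_TOKENS_PER_SEC = 45.0   # "Output Speed": 45 tok/s


def estimate_request(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (estimated cost in USD, estimated generation time in seconds)."""
    if not 0 <= input_tokens <= MAX_INPUT_TOKENS:
        raise ValueError("input_tokens is outside the listed 200K limit")
    if not 0 <= output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError("output_tokens is outside the listed 64K limit")

    # Flat per-token rates: tokens * (USD per million tokens) / 1,000,000.
    cost_usd = (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000

    # Time to stream the output at the listed average speed.
    seconds = output_tokens / OUTPUT_TOKENS_PER_SEC
    return cost_usd, seconds


if __name__ == "__main__":
    cost, seconds = estimate_request(input_tokens=50_000, output_tokens=2_000)
    print(f"cost: ${cost:.2f}, generation time: {seconds:.1f} s")
```

Under these assumptions, a request with 50,000 input tokens and 2,000 output tokens costs about $0.18 and takes roughly 44 seconds to generate.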
