ChatGPT

GPT-4.1

Model ID: gpt-4.1-2025-04-14

2025-04-14

Proprietary Model

API

Overall: No. 26

Popularity: No. 197

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1-million-token context window and outperforms GPT-4o and GPT-4.5 on benchmarks for coding (54.6% on SWE-bench Verified), instruction compliance (87.4% on IFEval), and multimodal understanding. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, which suits it to agents, IDE tooling, and enterprise knowledge retrieval.

Knowledge Cutoff

2024-06-30

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory

1.0M tokens in / 33K tokens out

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.
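As a minimal sketch of how these limits might be checked before sending a request, the snippet below compares token counts against the rounded figures shown above. The function name `fits_limits` and the constants are hypothetical, the limits are the page's rounded values rather than exact API limits, and the two limits are treated independently because this page does not say whether output tokens count against the input window.

```python
# Rounded limits as displayed on this page (assumed, not exact API values).
MAX_INPUT_TOKENS = 1_000_000   # "1.0M" in
MAX_OUTPUT_TOKENS = 33_000     # "33K" out


def fits_limits(input_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if both token counts are within the listed limits."""
    if input_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        input_tokens <= MAX_INPUT_TOKENS
        and requested_output_tokens <= MAX_OUTPUT_TOKENS
    )


if __name__ == "__main__":
    # Hypothetical requests for illustration only.
    print(fits_limits(250_000, 4_000))
    print(fits_limits(1_200_000, 4_000))
```

Token counts depend on the tokenizer, so in practice they would come from a token-counting tool rather than from a word count.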

Cost/1M Tokens

$2 in / $8 out

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).
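To make the per-token pricing concrete, here is a small sketch that estimates the cost of a single request from the prices listed on this page ($2 per 1M input tokens, $8 per 1M output tokens). The function name `estimate_cost_usd` and the example token counts are hypothetical, and the sketch ignores any discounts or surcharges a provider may apply.

```python
# Prices as listed on this page, in USD per 1 million tokens.
INPUT_USD_PER_MILLION = 2.00
OUTPUT_USD_PER_MILLION = 8.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 50,000 tokens in, 2,000 tokens out.
    print(f"${estimate_cost_usd(50_000, 2_000):.4f}")
```

For this hypothetical request the input side contributes $0.10 and the output side $0.016, so output tokens cost four times as much per token but usually make up a smaller share of the total.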

Source: Official Docs, OpenRouter

AI Performance Evaluation

Arena Overall Score

1312 ±4

As of 2026-04-02

Overall Rank

No. 197

100,105 Votes

Arena by Ability

Category                Score       Rank
Hard Prompts            1311 ±6     No. 204
Expert Knowledge        1285 ±12    No. 196
Instruction Following   1293 ±6     No. 195
Conversation Memory     1297 ±8     No. 196
Creative                1286 ±8     No. 185
Coding                  1338 ±7     No. 204
Math                    1302 ±8     No. 174

Arena by Occupation

Occupation              Score       Rank
Creative Writing        1306 ±6     No. 177
Social Sciences         1322 ±8     No. 202
Media                   1289 ±8     No. 172
Business                1282 ±9     No. 216
Healthcare              1307 ±12    No. 202
Legal                   1316 ±11    No. 205
Software                1324 ±6     No. 211
Mathematics             1308 ±8     No. 174

Source: Arena Intelligence

Reasoning Ability

AA Intelligence Index: 26% (↓13%)
MMLU-Pro: 81% (↓2%)
GPQA Diamond: 67% (↓15%)
HLE: 4.6% (↓12%)

Math

AA Math Index: 35% (↓40%)
MATH-500: 91% (↓3%)
AIME 2024: 44% (↓16%)
AIME 2025: 35% (↓40%)

Coding Ability

AA Coding Index: 22% (↓15%)
LiveCodeBench: 46% (↓20%)
SciCode: 38% (↓4%)
TerminalBench: 14% (↓20%)

Instruction Following

IFBench: 43% (↓14%)

Long Context

AA-LCR: 61% (↓3%)

Agentic AI Ability

TAU2: 47% (↓24%)

Speed

Standard Mode

128 tok/sec (↑50)

First Output: 0.56 s

Source: Artificial Analysis
