ChatGPT

GPT OSS 120B

Model ID: openai/gpt-oss-120b

2025-08-05 · Open Model

API

Overall No.28

Popularity No.138

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B parameters per forward pass and is optimized to run on a single H100 GPU with native MXFP4 quantization. The model supports configurable reasoning depth, full chain-of-thought access, and native tool use, including function calling, browsing, and structured output generation.
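As a rough sanity check on the single-H100 claim, the sketch below estimates weight memory under MXFP4. It assumes the OCP Microscaling layout (4-bit elements plus one shared 8-bit scale per block of 32, so 4.25 bits per parameter), assumes every parameter is stored in MXFP4, and assumes an 80 GB H100. None of these details come from this page, and a real deployment keeps some tensors in higher precision and needs extra room for activations and the KV cache.

```python
# Back-of-the-envelope weight-memory estimate for gpt-oss-120b.
# Parameter counts are taken from the description above; everything else
# is an assumption made for illustration.

TOTAL_PARAMS = 117e9    # total parameters (from the model description)
ACTIVE_PARAMS = 5.1e9   # parameters activated per forward pass (from the description)

# Assumption: MXFP4 = 4-bit elements + one 8-bit shared scale per 32 elements.
MXFP4_BITS_PER_PARAM = 4 + 8 / 32   # 4.25 bits

# Assumption: the 80 GB H100 variant.
H100_MEMORY_GB = 80


def weight_memory_gb(num_params, bits_per_param=MXFP4_BITS_PER_PARAM):
    """Approximate weight storage in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


def active_fraction(active=ACTIVE_PARAMS, total=TOTAL_PARAMS):
    """Share of the parameters used on each forward pass."""
    return active / total


if __name__ == "__main__":
    weights = weight_memory_gb(TOTAL_PARAMS)
    print(f"Estimated weights: {weights:.1f} GB of {H100_MEMORY_GB} GB")
    print(f"Active per forward pass: {active_fraction():.1%}")
```

Under these assumptions the weights come to roughly 62 GB, which leaves headroom on an 80 GB card, and only about 4% of the parameters are active for any given token.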

Knowledge Cutoff

2024-06-30

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory

131K in / 131K out

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.

Cost/1M Tokens

$0.039 in / $0.19 out

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).
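To make the pricing concrete, here is a minimal sketch that turns the listed rates into a per-request cost. It reads the prices as USD per 1 million tokens, as the description above states; the function name and the example token counts are hypothetical, and actual billing may differ by provider.

```python
# Per-request cost estimate from the listed rates (USD per 1M tokens).

INPUT_USD_PER_M = 0.039   # listed input price
OUTPUT_USD_PER_M = 0.19   # listed output price


def request_cost_usd(input_tokens, output_tokens):
    """Estimated cost in USD for one request (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 10,000-token prompt with a 2,000-token reply.
    print(f"${request_cost_usd(10_000, 2_000):.5f}")
```

For the example request, the input and output portions cost about the same ($0.00039 and $0.00038), because output tokens are priced roughly five times higher than input tokens.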

Source:Official Docs OpenRouter

AI Performance Evaluation

Arena Overall Score

1354 ±4

As of 2026-04-02

Overall Rank

No.138

30,916 Votes
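One way to read an arena score is as an expected head-to-head win rate. The sketch below assumes the score sits on an Elo-like scale (base-10 logistic with a 400-point scale); this page does not state the formula, so treat the conversion as an assumption. The opponent score of 1454 is hypothetical.

```python
# Convert a score gap into an expected win rate, assuming an Elo-like scale.
# Assumption: base-10 logistic curve with a 400-point scale factor.


def expected_win_rate(score_a, score_b, scale=400.0):
    """Probability that model A is preferred over model B (ties ignored)."""
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / scale))


if __name__ == "__main__":
    this_model = 1354     # overall arena score listed above
    hypothetical = 1454   # made-up opponent, 100 points higher
    print(f"{expected_win_rate(this_model, hypothetical):.1%}")
```

Under this assumption a 100-point deficit corresponds to being preferred in roughly 36% of comparisons, while the listed ±4 interval shifts the estimate by well under one percentage point.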

Arena by Ability

Hard Prompts

1363 ±6 (No.146)

Expert Knowledge

1362 ±16 (No.135)

Instruction Following

1326 ±7 (No.153)

Conversation Memory

1328 ±9 (No.161)

Creative

1280 ±10 (No.192)

Coding

1391 ±8 (No.145)

Math

1384 ±14 (No.113)

Arena by Occupation

Creative Writing

1311 ±8 (No.166)

Social Sciences

1362 ±9 (No.150)

Media

1287 ±8 (No.175)

Business

1349 ±8 (No.144)

Healthcare

1368 ±15 (No.142)

Legal

1344 ±14 (No.161)

Software

1386 ±6 (No.143)

Mathematics

1385 ±15 (No.114)

Source: Arena Intelligence

Reasoning Ability

AA Intelligence Index

25% ↓14%

MMLU-Pro

78% ↓5%

GPQA Diamond

67% ↓15%

HLE

5.2% ↓11%

Math

AA Math Index

67% ↓8%

AIME 2025

67% ↓8%

Coding Ability

AA Coding Index

16% ↓21%

LiveCodeBench

71% ↑5%

SciCode

36% ↓6%

TerminalBench

5.3% ↓29%

Instruction Following

IFBench

58% ↑1%

Hallucination Rate (HHEM)

14% ↑3%

Factual Consistency (HHEM)

86% ↓3%

Long Context

AA-LCR

44% ↓20%

Agentic AI Ability

TAU2

45% ↓26%

Speed

Standard Mode

86 tok/sec ↑8

First Output 0.48s

OpenRouter

Reasoning Mode

234 tok/sec ↑161

First Output 9.07s

Artificial Analysis
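The throughput and first-output figures can be combined into a rough end-to-end response time. The sketch below assumes that "First Output" is the delay before the first output token and that tokens then arrive at the listed steady rate; the 500-token reply length is hypothetical. The two modes are attributed to different sources (OpenRouter and Artificial Analysis), so the comparison is indicative only.

```python
# Rough end-to-end latency: time to first output plus generation time.
# Assumption: constant throughput once the first token has arrived.


def estimated_response_seconds(output_tokens, first_output_s, tokens_per_sec):
    """Estimated wall-clock seconds to receive a full reply."""
    return first_output_s + output_tokens / tokens_per_sec


if __name__ == "__main__":
    reply_tokens = 500  # hypothetical reply length
    standard = estimated_response_seconds(reply_tokens, 0.48, 86)
    reasoning = estimated_response_seconds(reply_tokens, 9.07, 234)
    print(f"Standard mode:  {standard:.1f} s")
    print(f"Reasoning mode: {reasoning:.1f} s")
```

For short replies the longer first-output delay dominates in reasoning mode, even though its listed token rate is higher; the gap narrows as replies get longer.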

Source: Artificial Analysis, Vectara HHEM, OpenRouter
