ChatGPT

GPT OSS 120B

Model ID: openai/gpt-oss-120b
2025-08-05
Open Model
API
Overall: No. 28
Popularity: No. 138

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B parameters per forward pass and is optimized to run on a single H100 GPU with native MXFP4 quantization. The model supports configurable reasoning depth, full chain-of-thought access, and native tool use, including function calling, browsing, and structured output generation.
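The figures in the description can be checked with some quick arithmetic. The sketch below computes the share of parameters active per forward pass and a rough weight footprint. It makes three assumptions that are not stated on this page: MXFP4 costs about 4.25 bits per parameter (4-bit values plus one 8-bit scale shared by each block of 32), every weight is stored in that format, and the H100 has 80 GB of memory. Treat the result as a ballpark estimate, not a measured checkpoint size.

```python
# Figures quoted in the model description.
TOTAL_PARAMS = 117e9   # total parameters
ACTIVE_PARAMS = 5.1e9  # parameters activated per forward pass

# Assumption: MXFP4 = 4-bit values + one 8-bit scale per block of 32 values.
MXFP4_BITS_PER_PARAM = 4 + 8 / 32  # 4.25 bits

# Assumption: an 80 GB H100; the page only says "a single H100 GPU".
H100_MEMORY_GB = 80.0


def active_fraction(total_params: float, active_params: float) -> float:
    """Share of the model's parameters used for one forward pass."""
    return active_params / total_params


def weight_gigabytes(params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (10^9 bytes), ignoring activations and KV cache."""
    return params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    size_gb = weight_gigabytes(TOTAL_PARAMS, MXFP4_BITS_PER_PARAM)
    print(f"Active share per forward pass: {frac:.1%}")
    print(f"Estimated weight size: {size_gb:.1f} GB")
    print(f"Fits in {H100_MEMORY_GB:.0f} GB: {size_gb < H100_MEMORY_GB}")
```

Under these assumptions the weights leave some headroom on one card, but the estimate ignores the KV cache and activations, which grow with context length.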

Knowledge Cutoff
2024-06-30

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory
131K in / 131K out

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.

Cost/1M Tokens
$0.039 in / $0.19 out

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).
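To turn the listed prices into a per-request figure, multiply each token count by its rate. The sketch below assumes the prices are per 1 million tokens, as the description line states. The function name and the example token counts are illustrative only.

```python
# Prices listed on this page, in USD per 1 million tokens.
INPUT_USD_PER_M = 0.039
OUTPUT_USD_PER_M = 0.19


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one request at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens * INPUT_USD_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_M / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 10,000 prompt tokens and 2,000 generated tokens.
    print(f"${request_cost_usd(10_000, 2_000):.5f}")
```

Output tokens cost roughly five times as much as input tokens at these rates, so long generations (including reasoning tokens, if the provider bills them as output) dominate the bill.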

AI Performance Evaluation

Arena Overall Score
1354 ±4
As of 2026-04-02
Overall Rank
No. 138
30,916 Votes
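One way to read an Arena score is as an expected head-to-head win rate. The sketch below assumes the score sits on an Elo-style scale (a base-10 logistic with a 400-point divisor). This page does not state that, so treat the formula as an assumption. The opponent score of 1400 is hypothetical.

```python
def expected_win_rate(score_a: float, score_b: float) -> float:
    """Expected share of votes for model A over model B.

    Assumption: scores follow an Elo-style logistic with a 400-point scale.
    """
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / 400.0))


if __name__ == "__main__":
    this_model = 1354   # Arena Overall Score listed on this page
    other_model = 1400  # hypothetical opponent, for illustration only
    print(f"{expected_win_rate(this_model, other_model):.1%}")
```

On this reading, a gap of a few points (such as the ±4 interval shown here) corresponds to a win rate very close to 50%, so small differences in rank should not be over-interpreted.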
Arena by Ability
Hard Prompts
1363 ±6, No. 146
Expert Knowledge
1362 ±16, No. 135
Instruction Following
1326 ±7, No. 153
Conversation Memory
1328 ±9, No. 161
Creative
1280 ±10, No. 192
Coding
1391 ±8, No. 145
Math
1384 ±14, No. 113
Arena by Occupation
Creative Writing
1311 ±8, No. 166
Social Sciences
1362 ±9, No. 150
Media
1287 ±8, No. 175
Business
1349 ±8, No. 144
Healthcare
1368 ±15, No. 142
Legal
1344 ±14, No. 161
Software
1386 ±6, No. 143
Mathematics
1385 ±15, No. 114
Reasoning Ability
AA Intelligence Index
25% ↓14%
MMLU-Pro
78% ↓5%
GPQA Diamond
67% ↓15%
HLE
5.2% ↓11%
Math
AA Math Index
67% ↓8%
AIME 2025
67% ↓8%
Coding Ability
AA Coding Index
16% ↓21%
LiveCodeBench
71% ↑5%
SciCode
36% ↓6%
TerminalBench
5.3% ↓29%
Instruction Following
IFBench
58% ↑1%
Hallucination Rate (HHEM)
14% ↑3%
Factual Consistency (HHEM)
86% ↓3%
Long Context
AA-LCR
44% ↓20%
Agentic AI Ability
TAU2
45% ↓26%
Speed
Standard Mode
86 tok/sec ↑8
First Output 0.48s
OpenRouter
Reasoning Mode
234 tok/sec ↑161
First Output 9.07s
Artificial Analysis
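The speed figures can be combined into a rough end-to-end response time: time to first output plus output length divided by throughput. The sketch below assumes throughput stays constant over the whole response, which is a simplification. The 500-token response length is hypothetical. The two modes are reported by different sources (OpenRouter and Artificial Analysis), so the comparison is only indicative.

```python
def estimated_seconds(first_output_s: float, tok_per_sec: float, output_tokens: int) -> float:
    """Rough total response time: latency to first output plus generation time.

    Assumption: throughput stays constant for the whole response.
    """
    if tok_per_sec <= 0:
        raise ValueError("tok_per_sec must be positive")
    return first_output_s + output_tokens / tok_per_sec


if __name__ == "__main__":
    n = 500  # hypothetical response length in tokens
    standard = estimated_seconds(0.48, 86, n)    # Standard Mode figures from this page
    reasoning = estimated_seconds(9.07, 234, n)  # Reasoning Mode figures from this page
    print(f"Standard mode:  {standard:.1f} s")
    print(f"Reasoning mode: {reasoning:.1f} s")
```

With these numbers, the higher throughput in Reasoning Mode only offsets its longer wait for first output once the response is long enough, so short answers arrive sooner in Standard Mode.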