Google

Gemini 2.5 Flash Lite

Model ID:gemini-2.5-flash-lite

2025-09-25Proprietary Model

API

PopularityNo.116

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the [Reasoning API parameter](https://openrouter.ai/docs/use-cases/reasoning-tokens) to selectively trade off cost for intelligence.

Knowledge Cutoff

2025-01-31

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory

1.0MIN66KOUT

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.

Cost/1M Words

$0.1IN$0.4OUT

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).

Source:Official Docs OpenRouter

AI Performance Evaluation

Arena Overall Score

1380

±3

As of 2026-04-02

Overall Rank

No.116

47,593 Votes

Arena by Ability

Hard Prompts

1391±5No.123

Expert Knowledge

1385±12No.119

Instruction Following

1365±6No.122

Conversation Memory

1373±7No.119

Creative

1361±8No.99

Coding

1398±7No.138

Math

1365±11No.130

Arena by Occupation

Creative Writing

1371±6No.102

Social Sciences

1403±7No.108

Media

1346±7No.113

Business

1378±7No.114

Healthcare

1401±12No.112

Legal

1399±11No.103

Software

1400±5No.129

Mathematics

1370±13No.124

Source:Arena Intelligence

Speed

Standard Mode

105tok/sec↑27

First Output 0.53s

Artificial Analysis

Source:Artificial Analysis

← Back to AI Models