Overview
On August 7, 2025, OpenAI released GPT-5 simultaneously across ChatGPT (free and paid tiers), the API, and GitHub Models. All ChatGPT users received access at no cost, and Pro subscribers received unlimited access.
Sam Altman described GPT-5 as “a significant step along the path to AGI.” It was the first time OpenAI’s CEO had used AGI framing in a product announcement, a milestone in how the company publicly positioned its own technology.
Benchmark Performance
GPT-5 represented the largest capability jump in a single model release since GPT-4 (2023). Reported scores, compared against GPT-4o:
| Benchmark | GPT-4o | GPT-5 |
|---|---|---|
| AIME 2025 | ~49% | 94.6% |
| SWE-bench Verified | ~49% | 74.9% |
| MMMU (multimodal) | 69.1% | 84.2% |
| Factual accuracy (web search) | baseline | ~45% fewer errors |
The AIME (American Invitational Mathematics Examination) is designed for elite high school competitors, yet its problems are difficult enough that a near-perfect score is rare even among trained mathematicians. GPT-5’s 94.6% put it at that near-perfect level.
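To make the size of the gains in the table concrete, the sketch below computes three views of each improvement: the absolute gain in percentage points, the gain relative to the GPT-4o score, and the share of GPT-4o's remaining errors that GPT-5 eliminated. The function and variable names are illustrative, and the GPT-4o values for AIME 2025 and SWE-bench Verified are taken at face value from the approximate "~49%" figures in the table. The factual-accuracy row is left out because the table gives no numeric baseline for it.

```python
# Scores copied from the benchmark table: (GPT-4o, GPT-5), in percent.
# The two 49.0 entries are approximations ("~49%" in the table).
SCORES = {
    "AIME 2025": (49.0, 94.6),
    "SWE-bench Verified": (49.0, 74.9),
    "MMMU (multimodal)": (69.1, 84.2),
}


def gains(old: float, new: float) -> dict:
    """Compare two percentage scores on the same benchmark.

    absolute_pts    : difference in percentage points
    relative        : gain as a fraction of the old score
    error_reduction : fraction of the old model's errors that disappeared
    """
    absolute = new - old
    return {
        "absolute_pts": absolute,
        "relative": absolute / old,
        "error_reduction": absolute / (100.0 - old),
    }


def summarize(scores: dict) -> dict:
    """Apply gains() to every benchmark in the mapping."""
    return {name: gains(old, new) for name, (old, new) in scores.items()}


if __name__ == "__main__":
    for name, g in summarize(SCORES).items():
        print(
            f"{name}: +{g['absolute_pts']:.1f} pts, "
            f"{g['relative']:.0%} relative gain, "
            f"{g['error_reduction']:.0%} of errors removed"
        )
```

The error-reduction view is often the most informative near the top of a benchmark's range, where a few percentage points can correspond to a large share of the remaining mistakes.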
Architecture and Capabilities
GPT-5 integrated multiple capabilities that had previously been separate:
- Unified reasoning and chat: Users no longer switch between “thinking mode” and “standard mode”; the model dynamically allocates reasoning compute based on task complexity
- Native multimodality: Text, image, audio, and video understanding in a single architecture
- Real-time web access: Significantly improved factual accuracy through tight integration with search
- Long-context understanding: Extended handling of long documents, codebases, and conversations
- Agentic capability: Deeper integration with tools and multi-step task execution
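The idea of allocating reasoning compute by task complexity, from the first item in the list above, can be illustrated with a toy router. This is only a sketch under heavy assumptions: OpenAI has not published its mechanism in this form, and every name here (`Route`, `estimate_complexity`, `route`, the keyword hints, the thresholds, and the token budgets) is hypothetical. A production system would presumably use a learned signal rather than keyword matching.

```python
from dataclasses import dataclass

# Hypothetical keyword hints. Substring matching is deliberately crude
# (for example, "prove" also matches "improve"); this is a toy heuristic.
REASONING_HINTS = ("prove", "debug", "derive", "optimize", "step by step")


@dataclass(frozen=True)
class Route:
    mode: str              # "fast", "reasoning", or "deep" (illustrative labels)
    reasoning_budget: int  # hypothetical cap on intermediate "thinking" tokens


def estimate_complexity(prompt: str) -> float:
    """Return a toy complexity score in [0, 1] from length and keyword hints."""
    text = prompt.lower()
    length_score = min(len(text.split()) / 200.0, 1.0)
    hints_found = sum(1 for hint in REASONING_HINTS if hint in text)
    hint_score = min(hints_found / 2.0, 1.0)
    return 0.4 * length_score + 0.6 * hint_score


def route(prompt: str) -> Route:
    """Pick a mode and reasoning budget from the estimated complexity."""
    score = estimate_complexity(prompt)
    if score < 0.3:
        return Route("fast", 0)
    if score < 0.7:
        return Route("reasoning", 2048)
    return Route("deep", 8192)


if __name__ == "__main__":
    for prompt in (
        "What is the capital of France?",
        "Prove that the sum of two even numbers is even, step by step.",
    ):
        print(prompt, "->", route(prompt))
```

The point of the sketch is the shape of the decision, not the heuristic: a single entry point estimates how hard a request is and spends more intermediate computation only when the estimate is high.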
Context: A Year of Competitive Pressure
GPT-5 arrived after a year in which OpenAI’s dominance had been challenged:
- DeepSeek R1 (Jan 2025): proved frontier reasoning could be replicated cheaply
- Gemini 2.5 Pro (Mar 2025): led the LMArena leaderboard for weeks
- Claude 4 (May 2025): achieved 72.5% on SWE-bench Verified, the highest score on that coding benchmark to that point
- Internal delays: originally expected in early 2025, GPT-5 was pushed back multiple times for capability and safety refinement
GPT-5’s launch re-established OpenAI’s position at the frontier of publicly available models.
The AGI Question
Altman’s framing, “a significant step along the path to AGI,” reignited a debate that had been building in the field for some time. Key positions:
Those who agreed the framing was appropriate:
- GPT-5’s performance on cognitive tests designed to resist AI (like ARC-AGI sub-tasks) suggested capabilities qualitatively beyond previous models
- The combination of reasoning, multimodality, and agentic action in a single system approached narrow definitions of AGI
Those who pushed back:
- “Step toward AGI” is unfalsifiable without a clear definition of AGI
- The model still fails on tasks trivial for humans (novel physical manipulation, true open-world common sense)
- The framing serves a commercial purpose — it raises stakes, justifies pricing, attracts talent
The debate itself was significant: it indicated that AI capability had crossed a threshold at which discussion of AGI timelines had moved from the fringe into the mainstream.