Overview
On June 20, 2024, Anthropic released Claude 3.5 Sonnet — a model positioned between the lightweight Claude 3 Haiku and the flagship Claude 3 Opus, but which outperformed both on nearly every metric that mattered to developers.
What set Claude 3.5 Sonnet apart was not a single benchmark headline but practical coding performance. On SWE-bench (Software Engineering, a benchmark testing a model’s ability to resolve real GitHub issues), Claude 3.5 Sonnet scored 49% — nearly doubling the 15% achieved by its predecessor and leaving GPT-4 (≈15%) far behind.
The “Medium Tier” That Beat the Flagships
The most striking thing about Claude 3.5 Sonnet was its efficiency: a model that cost less and ran faster than Claude 3 Opus, yet consistently surpassed it. This directly challenged the assumption that “bigger = better coding.”
Developers reported:
- Claude 3.5 Sonnet could read and modify large codebases (10,000+ lines) with minimal hallucination
- It could explain legacy code in seconds — a task that previously required hours of human archaeology
- Agentic coding workflows (plan → edit → test → fix) became genuinely viable for the first time
Impact
Claude 3.5 Sonnet’s release catalyzed the “AI coding agent” arms race. Within months:
- GitHub Copilot was upgraded with Claude-powered features
- Cursor AI (built around Claude) gained significant market share
- Devin (Cognition AI’s autonomous coder) launched as a direct competitor
- Amazon invested $4B in Anthropic, partially motivated by Claude’s coding capabilities
Significance
Claude 3.5 Sonnet established a key principle: for coding tasks, specialized fine-tuning and context handling mattered more than raw parameter count. This became the foundation for the agentic AI wave of 2024-2025.