Claude 3.7 Sonnet: Anthropic's Extended Thinking Model

Overview

On February 24, 2025, Anthropic released Claude 3.7 Sonnet — the first production model to make “extended thinking” a first-class, user-facing feature.

Unlike o1/o3’s internal chain-of-thought reasoning (hidden from users), Claude 3.7 Sonnet’s thinking process was visible and configurable. Users could set a “thinking budget” — from 1,000 to 64,000 tokens — and watch the model reason through complex problems in real time.

The “10x Developer” Moment

Claude 3.7 Sonnet’s standout capability was agentic coding at human scale:

Could plan and execute a multi-step refactoring across a 50,000-line codebase
Could write, run, and debug tests in a single conversation
Maintained context across dramatically longer interactions than previous models

On SWE-bench Verified (the authoritative version of the coding benchmark), Claude 3.7 Sonnet scored 62.3% — a significant jump from Claude 3.5 Sonnet’s 49%.

Significance

Claude 3.7 Sonnet’s release had two lasting impacts:

“Thinking budget” became a standard feature across AI products — users learned to allocate compute for complex tasks just as they allocated memory for running programs
It established Anthropic’s identity as “the coding model” — even as OpenAI competed on benchmarks, Anthropic competed on developer experience and code quality