Overview
In October 2015, DeepMind’s AlphaGo defeated Fan Hui, the reigning European Go champion, 5-0 in a formal match; the result was announced publicly in January 2016. It was the first time a computer program had beaten a professional Go player on a full-sized board without a handicap.
The Go community, which had long regarded the game as uniquely immune to computational brute force, was stunned. Many experts had expected this milestone to be at least a decade away.
Why Go Was Different
Chess had fallen to Deep Blue in 1997 largely through computational power: search enough positions fast enough with a handcrafted evaluation function, and chess becomes tractable. Go is different in kind, not just degree:
- The board is larger: 19x19 versus 8x8 in chess, which allows roughly 10^170 legal board positions (far more than the roughly 10^80 atoms in the observable universe)
- Evaluation is harder: In chess, material advantage (counting pieces) is a reasonable proxy for position strength. In Go, evaluating a position requires holistic pattern recognition that experts struggle to articulate
- Brute force fails: Even at one trillion positions per second, a computer could examine only a vanishing fraction of Go’s game tree
This is why Go had resisted all previous AI approaches. The breakthrough required something qualitatively different.
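The scale argument above can be checked with rough arithmetic. The sketch below works in base-10 logarithms to keep the numbers manageable. The function names are illustrative, and the branching factors and game lengths (about 250 moves over 150 turns for Go, about 35 over 80 for chess) are commonly cited rough estimates rather than figures from this article.

```python
import math

def log10_position_upper_bound(side: int) -> float:
    """log10 of 3**(side*side): each point is empty, black, or white.

    This counts illegal arrangements too, so it is only an upper bound.
    """
    return side * side * math.log10(3)

def log10_game_tree(branching: float, depth: int) -> float:
    """log10 of branching**depth, a crude estimate of game-tree size."""
    return depth * math.log10(branching)

def log10_years_to_visit(log10_count: float, per_second: float) -> float:
    """log10 of the years needed to visit 10**log10_count items."""
    seconds_per_year = 365.25 * 24 * 3600
    return log10_count - math.log10(per_second) - math.log10(seconds_per_year)

if __name__ == "__main__":
    # Assumed rough estimates: Go b=250, d=150; chess b=35, d=80.
    print(f"3^361 upper bound:  10^{log10_position_upper_bound(19):.1f}")
    print(f"Go tree (250^150):  10^{log10_game_tree(250, 150):.1f}")
    print(f"Chess tree (35^80): 10^{log10_game_tree(35, 80):.1f}")
    # One trillion positions per second against ~10^170 positions.
    print(f"Years needed:       10^{log10_years_to_visit(170, 1e12):.1f}")
```

Under these assumptions, visiting every position at a trillion per second would take on the order of 10^150 years, which is why faster hardware alone could not close the gap.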
How AlphaGo Worked
AlphaGo combined several innovations:
1. Deep neural networks: Two networks, a policy network (which moves are worth considering?) and a value network (how good is this position?). The policy network was first trained to predict expert moves from roughly 30 million positions taken from human games
2. Reinforcement learning: AlphaGo then played a very large number of games against itself, using the outcomes to strengthen the policy network and to generate the training data for the value network
3. Monte Carlo Tree Search: Rather than searching exhaustively, AlphaGo sampled promising lines of play, using the policy network to narrow which moves to try and the value network to judge the resulting positions
The combination was elegant: human games provided the initial training signal, self-play refined it to superhuman levels, and tree search integrated everything at decision time.
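To make the interplay between the networks and the tree search concrete, here is a minimal sketch of a search loop guided by a policy prior and a value estimate. It is not AlphaGo’s implementation. It assumes a toy game (a pile of stones, each player removes 1 to 3, whoever takes the last stone wins), and `policy_stub` and `value_stub` are hypothetical stand-ins for trained networks. The selection rule is a PUCT-style formula with an arbitrary constant `c_puct`. AlphaGo’s actual search also blended in fast rollouts, which this sketch omits.

```python
import math

MAX_TAKE = 3  # toy rule: remove 1 to 3 stones; taking the last stone wins

class Node:
    def __init__(self, prior: float):
        self.prior = prior      # prior probability from the policy stand-in
        self.visits = 0         # how many simulations passed through here
        self.value_sum = 0.0    # backed-up values, from the view of the player to move here
        self.children = {}      # move -> Node

    def q(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0

def policy_stub(stones: int) -> dict:
    """Hypothetical policy network: uniform prior over legal moves."""
    moves = range(1, min(MAX_TAKE, stones) + 1)
    return {m: 1.0 / len(moves) for m in moves}

def value_stub(stones: int) -> float:
    """Hypothetical value network: knows nothing, so it always returns 0."""
    return 0.0

def select_child(node: Node, c_puct: float):
    """Pick the (move, child) pair with the best value-plus-exploration score."""
    sqrt_total = math.sqrt(node.visits)
    def score(item):
        _, child = item
        exploration = c_puct * child.prior * sqrt_total / (1 + child.visits)
        # The child's value belongs to the opponent, so negate it.
        return -child.q() + exploration
    return max(node.children.items(), key=score)

def simulate(root: Node, stones: int, c_puct: float) -> None:
    path, node = [root], root
    while node.children:                    # 1. selection: walk down the tree
        move, node = select_child(node, c_puct)
        stones -= move
        path.append(node)
    if stones == 0:                         # terminal: the player to move has lost
        value = -1.0
    else:                                   # 2. expansion and evaluation
        for move, prior in policy_stub(stones).items():
            node.children[move] = Node(prior)
        value = value_stub(stones)
    for visited in reversed(path):          # 3. backup, flipping the sign each ply
        visited.visits += 1
        visited.value_sum += value
        value = -value

def best_move(stones: int, simulations: int = 800, c_puct: float = 1.5) -> int:
    """Run the search from a pile of `stones` (>= 1) and return the most-visited move."""
    root = Node(prior=1.0)
    for _ in range(simulations):
        simulate(root, stones, c_puct)
    return max(root.children, key=lambda m: root.children[m].visits)

if __name__ == "__main__":
    print(best_move(5))
```

Even with an uninformative prior and value estimate, the visit counts should concentrate on the winning move in this small game, because terminal results propagate back up the tree. Better priors and value estimates, which is what the trained networks supply, let the same loop reach good decisions with far fewer simulations in a game as large as Go.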
The Match and Its Aftermath
Fan Hui described the experience as disorienting — AlphaGo played moves that felt “wrong” by human intuition but turned out to be deeply correct. This alien quality of machine play would become a recurring theme.
In March 2016, AlphaGo defeated Lee Sedol, an 18-time world champion and one of the strongest players of his generation, 4-1 in a match streamed worldwide and reportedly watched by tens of millions of people. The one game Lee won (Game 4) turned on his move 78, now often called a “divine move”, which AlphaGo misjudged and never recovered from. It is considered one of the most remarkable moves in Go history.
In 2017, AlphaGo Zero trained from scratch with no human data — only the rules — and surpassed all previous versions in 40 days. The insight was profound: in some domains, human data is a ceiling, not a floor.
Significance
AlphaGo marked several transitions simultaneously:
- From narrow to general: Unlike Deep Blue’s chess-specific engineering, AlphaGo’s techniques (deep neural networks combined with reinforcement learning) carried over to other domains
- Reinforcement learning at scale: Showed that self-play could push performance to superhuman levels and, with AlphaGo Zero, that it could do so without any human game data
- The creative machine: AlphaGo’s “Move 37” in Game 2 of the Lee Sedol match, a move that AlphaGo itself rated as roughly a 1-in-10,000 choice for a human player, was recognized by professional players as genuinely creative
DeepMind would later carry related deep learning methods into protein folding (AlphaFold), drug discovery, and energy optimization, suggesting that AlphaGo was not a dead end but a template.