Agentic AI: From Chatbots to Autonomous Actors

Overview

In early 2025, AI systems made a qualitative leap: from conversational assistants that answer questions to autonomous agents that take actions. Three landmark releases defined this transition:

October 28, 2024: Anthropic released Claude Computer Use in public beta — the first commercially available API allowing AI to see a screen and control a keyboard and mouse
January 23, 2025: OpenAI launched Operator — an AI agent that could autonomously browse the web, fill forms, place orders, and complete multi-step tasks
February 2, 2025: OpenAI released Deep Research — an agent that conducts multi-hour autonomous research tasks, synthesizing hundreds of sources into analyst-grade reports

Together, these marked the emergence of agentic AI as a mainstream product category.

Claude Computer Use

On October 28, 2024, Anthropic made Claude’s computer use capability available via API in public beta. This enabled AI to:

Take screenshots and interpret what’s on screen
Move a cursor and click on elements
Type into text fields
Navigate applications and websites
Execute sequences of actions to complete goals

Unlike browser-automation tools (Selenium, Playwright), Claude Computer Use operated at the visual interface level — the same way a human would interact with a computer — making it generalizable to any application without custom integration.

Early demonstrations included: filling in forms, writing and running code in a terminal, navigating file systems, and completing multi-step tasks across multiple applications.

OpenAI Operator

On January 23, 2025, OpenAI launched Operator for US Pro subscribers. Powered by a new model called Computer-Using Agent (CUA) — combining GPT-4o vision with reinforcement learning — Operator could:

Browse any website autonomously
Handle login flows, shopping carts, form submissions
Book restaurants, order groceries, fill out applications
Recover from errors and try alternative approaches

Key benchmarks: OSWorld score of 38.1% (human baseline: ~72%); WebArena: 58.1%.

Operator represented the first time a major AI company shipped an autonomous web agent as a consumer product. Its limitations were also instructive: it struggled with CAPTCHAs, complex multi-page workflows, and tasks requiring real-world judgment. Operator was eventually merged into a unified “ChatGPT agent” on July 17, 2025.

OpenAI Deep Research

On February 2, 2025, OpenAI released Deep Research — an agentic tool designed for long-horizon knowledge tasks. Given a research question, it would:

Decompose the question into sub-queries
Autonomously browse and read dozens to hundreds of web sources
Synthesize findings into a structured, cited report
Complete tasks in 5–30 minutes

Powered by a version of o3 fine-tuned for browsing, Deep Research represented a new category: AI as research analyst. It produced outputs comparable to what a skilled human researcher might take hours or days to produce.

The MCP Infrastructure Layer

Underpinning the agentic ecosystem was the Model Context Protocol (MCP), released by Anthropic on November 25, 2024. MCP is an open standard that allows AI models to:

Connect to any data source (databases, file systems, APIs) through standardized connectors (“tools”)
Maintain state across multi-step task sequences
Compose multiple tools in a single workflow

By March 2026, MCP had crossed 97 million installs. The Linux Foundation announced it would take MCP under open governance — signaling its transition from proprietary protocol to foundational AI infrastructure, analogous to HTTP for the web.

Why This Matters

The shift from conversational to agentic AI represents the most significant change in how humans interact with AI systems since the public launch of ChatGPT. Key implications:

New failure modes: Agentic AI systems can cause real-world consequences — sending emails, making purchases, executing code — that may be difficult to reverse. Safety research shifted from “preventing harmful outputs” to “preventing harmful actions.”

Economic disruption acceleration: Copilot could help a developer; an agent could be a developer, lawyer, researcher, or analyst. The economic displacement potential expanded from augmentation to substitution.

Trust architecture: Agentic AI requires new frameworks: which agents have what permissions, how actions are audited, when humans stay in the loop. Enterprise adoption required solving problems of authorization, auditability, and scope limitation.

The prompt injection threat: Agents that browse the web are vulnerable to adversarial web content designed to redirect their behavior — a new attack surface with no pre-existing defense framework.