Introducing ChatGPT Agent: From Research to Real‑World Action

July 18, 2025
Provided by Utku Ege Tuluk

OpenAI has launched ChatGPT Agent, a new feature that bridges the gap between research and execution by letting the model browse the internet and carry out tasks on its own virtual computer. It builds on two earlier OpenAI tools, Operator and Deep Research, combining and expanding their capabilities into a unified, agentic experience (OpenAI).

🚀 Capabilities at a Glance

Smart multi-tool agent:
Capable of browsing websites visually or via text, running terminal commands, analyzing documents, and even crafting PowerPoint slides or Excel sheets (OpenAI, WIRED).
Real tasks, real autonomy:
The agent can check your calendar, book appointments, shop online, conduct competitor research, synthesize data into presentations, and more—all while you stay in control (The Verge).
Unified intelligence:
It merges the browsing and form-filling strength of Operator with Deep Research’s analytical power, letting it switch fluidly between tasks within a single prompt (OpenAI).

Safety & User Control

OpenAI emphasizes that users remain in control:

Explicit confirmation is required before any irreversible action, such as a booking or purchase (The Guardian).
Watch Mode ensures supervision for high-risk tasks; if users navigate away, the Agent pauses (The Verge).
Security safeguards are in place, including anti-prompt-injection systems, enhanced privacy protection, and limits on sensitive actions (e.g., financial operations are blocked for now).
The model is treated under “High Biological and Chemical capability” protocols to mitigate misuse (OpenAI).

Benchmark Breakthroughs

On several benchmarks, as reported by OpenAI, ChatGPT Agent shows cutting-edge performance:

Humanity’s Last Exam: 41.6% pass@1 (44.4% with parallel attempts) (OpenAI).
FrontierMath: Achieves 27.4% accuracy on challenging math problems when given access to a terminal tool, outperforming earlier models (OpenAI).
SpreadsheetBench: Reaches 45.5% on real-world spreadsheet editing tasks, more than double the 20% scored by Copilot in Excel (OpenAI).
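
For readers unfamiliar with the pass@1 notation used above, here is a minimal Python sketch of how such a score is commonly computed. The article does not say how OpenAI calculated its figures, and the "parallel attempts" number may come from a different procedure entirely. The sketch assumes the widely used unbiased pass@k estimator, and the function names and toy data are hypothetical, chosen only for illustration. The last lines check the "more than double" claim using the two SpreadsheetBench percentages quoted above.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n = attempts sampled, c = attempts that were correct, k = attempt budget.
    Returns the probability that at least one of k attempts, drawn without
    replacement from the n samples, is correct.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k):
    """Average pass@k over a list of (n, c) pairs, one pair per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical toy data, NOT from the article: (attempts, correct) per problem.
toy = [(4, 4), (4, 2), (4, 0), (4, 1)]

# With k = 1 the estimator reduces to the mean of c / n across problems.
print(mean_pass_at_k(toy, 1))  # 0.4375

# Ratio of the two SpreadsheetBench figures quoted in the article.
spreadsheet_ratio = 45.5 / 20.0
print(round(spreadsheet_ratio, 3))  # 2.275, i.e. more than double
```

With k = 1 the score is simply the share of first attempts that succeed, which is why pass@1 is usually read as single-try accuracy.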

Real‑World Uses

Here’s how users can leverage ChatGPT Agent:

Professional workflows: Create polished slides, analyze financial data, automate report generation, work with APIs, and update dashboards.
Personal tasks: Plan travel, book events, order groceries or ingredients, manage schedules, and prepare research reports (The Verge).

Availability & Usage

Who gets it now: Pro, Plus, and Team users can enable Agent Mode via the tools menu or by typing /agent (OpenAI).
Message caps: Pro users have 400 agent messages/month; Plus and Team users get 40/month, with extra usage via credits (OpenAI).
Coming soon: Enterprise and Education tiers will gain access later this summer. EU and Swiss availability is still pending (OpenAI).

Looking Ahead

OpenAI plans continuous iterations, aiming to reduce latency, improve output quality (e.g., slide polish), and roll out to broader audiences. ChatGPT Agent signals a shift to more capable, autonomous AI, one that can reason, act, and collaborate, much like a digital assistant.