Anthropic released Claude Opus 4.6 on February 5, 2026, upgrading its flagship model with a 1 million token context window, significantly stronger agentic coding performance, and a new adaptive reasoning system — all at unchanged pricing. The release marks Anthropic’s fastest iteration yet on the Opus line, arriving just three months after Opus 4.5.
The headline upgrade is the 1 million token context window, now available in beta. One million tokens is enough to hold an entire enterprise codebase spanning thousands of files, roughly 750,000 words of prose (several long novels), or a full legal discovery set, all in a single prompt. For perspective, Claude Opus 4.5’s context window was 200k tokens; Opus 4.6 expands that fivefold.
Equally important is how the model handles those long contexts. On MRCR v2, a needle-in-a-haystack retrieval test that buries specific facts inside massive prompts, Opus 4.6 scores 76% — compared to just 18.5% for Claude Sonnet 4.5. The model doesn’t just hold more information; it can reliably find and reason over what matters within it.
Anthropic also introduced context compaction, an automatic summarization mechanism that condenses older context when an agent approaches its limit. This allows long-running agentic tasks to continue indefinitely without hitting context walls — a persistent pain point for developers building multi-step AI pipelines.
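The compaction pattern can be sketched in a few lines. This is an illustrative sketch of the general technique, not Anthropic's actual implementation: once the running transcript nears the context limit, older turns are condensed into a single summary message while recent turns stay verbatim. The token heuristic and the `keep_recent` cutoff are assumptions for the example.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token (illustrative only).
    return max(1, len(text) // 4)

def compact(messages, limit_tokens, keep_recent=4, summarize=None):
    """Condense all but the most recent turns once the limit nears.

    `messages` is a list of {"role": ..., "content": ...} dicts;
    `summarize` is an optional callable that turns the older turns
    into a summary string (a placeholder is used if absent).
    """
    total = sum(estimate_tokens(m["content"]) for m in messages)
    if total <= limit_tokens or len(messages) <= keep_recent:
        return messages  # still within budget: nothing to do
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary_text = (summarize(old) if summarize
                    else f"[Summary of {len(old)} earlier turns]")
    # Replace the old turns with one summary message, keep the rest.
    return [{"role": "user", "content": summary_text}] + recent
```

In a real agent loop, `summarize` would itself be a model call; the key design point is that compaction is triggered by a budget check, so the loop can run indefinitely without ever exceeding the window.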
On the reasoning side, Opus 4.6 features adaptive thinking: the model detects how much extended reasoning each task actually requires and applies effort accordingly. Developers can also override this with four explicit effort levels (low, medium, high, max) for more precise control over speed and cost.
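One plausible way to wire the four effort levels into a request is to map each level to an extended-thinking token budget. The level names come from the release; the budget numbers below and the config shape are assumptions for illustration, so check the API reference before relying on them.

```python
# Hypothetical effort-to-budget mapping; the four level names are from
# the release notes, but these specific budget values are illustrative.
EFFORT_BUDGETS = {"low": 1024, "medium": 8192, "high": 32768, "max": 65536}

def thinking_config(effort: str) -> dict:
    """Build an extended-thinking config dict for a given effort level."""
    if effort not in EFFORT_BUDGETS:
        raise ValueError(f"unknown effort level: {effort!r}")
    return {"type": "enabled", "budget_tokens": EFFORT_BUDGETS[effort]}
```

A caller would pass the resulting dict as the request's thinking configuration, trading latency and cost for reasoning depth per task.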
Opus 4.6 posts notable improvements across the major AI benchmarks, positioning it as the strongest publicly available model across reasoning, coding, and long-context tasks as of its release date.
Much of Opus 4.6’s engineering investment targets agentic use cases — scenarios where the model autonomously executes multi-step tasks over extended periods. The model plans more carefully, sustains tasks longer, and operates more reliably inside large codebases. It has also improved code review and self-debugging, catching its own mistakes before they propagate.
A standout new feature is agent teams in Claude Code, Anthropic’s terminal-based coding assistant. Developers can now spin up parallel subagents that work simultaneously on different parts of a codebase, then reconcile their outputs. For large refactors or multi-file feature implementations, this dramatically reduces end-to-end turnaround.
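The agent-teams pattern itself is simple to sketch. The following is a minimal illustration of the fan-out/reconcile structure, not Claude Code's internals: independent subagents each take a disjoint slice of the work in parallel, and a coordinator merges the results. `run_subagent` here is a stub standing in for a scoped model call.

```python
from concurrent.futures import ThreadPoolExecutor

def run_subagent(task: str) -> str:
    # Stand-in for a real subagent: in practice this would be a model
    # call scoped to one part of the codebase.
    return f"done: {task}"

def run_team(tasks):
    """Fan tasks out to parallel subagents, then reconcile the outputs."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        results = list(pool.map(run_subagent, tasks))
    # Reconciliation step: here just a merge keyed by task; a real
    # coordinator would also resolve conflicts between edits.
    return dict(zip(tasks, results))
```

The win comes from the tasks being largely independent (different files or modules), which is why the feature pays off most on large refactors and multi-file changes.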
Additional developer-facing additions include 128k output tokens (enabling generation of very long documents and codebases in a single call), a research preview of Claude in PowerPoint, and expanded availability across AWS Bedrock, Google Cloud Vertex AI, and Microsoft Azure AI Foundry.
Pricing is unchanged: $5 per million input tokens / $25 per million output tokens. For prompts exceeding 200k tokens (the extended context range), premium rates of $10 input / $37.50 output per million tokens apply. Given the capabilities added, this keeps Opus 4.6 competitively priced against GPT-5.2.
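The two-tier pricing is easy to work through with the stated rates. A small calculator, using the article's figures ($5/$25 standard, $10/$37.50 once the prompt exceeds 200k tokens):

```python
def request_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of one Opus 4.6 request at the published rates."""
    if input_tokens > 200_000:           # extended-context premium tier
        in_rate, out_rate = 10.00, 37.50
    else:                                # standard tier
        in_rate, out_rate = 5.00, 25.00
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
```

For example, a 100k-token prompt with a 10k-token reply costs $0.75, while a 500k-token prompt with the same reply costs $5.375 because the whole request moves to the premium tier.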
On safety, Anthropic reports that Opus 4.6 maintains equivalent or superior safety standards to competing frontier models. Evaluations included interpretability-based methods and cybersecurity probes, with low misaligned behavior rates and minimal over-refusal issues.
Early enterprise partners have been positive. Notion’s AI Lead described Opus 4.6 as feeling “less like a tool and more like a capable collaborator.” Cognition’s CEO highlighted its ability to catch edge cases and bugs that previous models missed.
Claude Opus 4.6 represents a meaningful step-change in what’s practical for AI-assisted software development and knowledge work. The 1M token window, combined with context compaction and adaptive reasoning, removes some of the most frustrating limits of current agentic pipelines. For researchers managing large document sets, engineers working in sprawling codebases, or legal teams processing discovery, the practical headroom is substantial.
The ARC-AGI-2 result is especially notable: a benchmark explicitly designed to measure novel problem-solving (not memorization) saw a 31-point jump from one Opus version to the next. That’s the kind of gain that tends to signal genuine capability improvement rather than benchmark gaming.
Developers can access Opus 4.6 today through the Anthropic API, claude.ai, and all major cloud platforms.
