Claude Opus 4.6
Hybrid reasoning model that pushes the frontier for coding and AI agents, featuring a 1M context window
Announcements
- NEW
Claude Opus 4.6
Feb 5, 2026
Claude Opus 4.6 is our most capable model to date. Building on the intelligence of Opus 4.5, it brings new levels of reliability and precision to coding, agents, and enterprise workflows.
Read more
Claude Opus 4.5
Nov 24, 2025
Claude Opus 4.5 is our most intelligent model to date. It sets a new standard across coding, agents, computer use, and enterprise workflows. Opus 4.5 is a meaningful step forward in what AI systems can do.
Read more
Claude Opus 4.1
Aug 5, 2025
Claude Opus 4.1 is a drop-in replacement for Opus 4 that delivers superior performance and precision for real-world coding and agentic tasks. It handles complex, multi-step problems with more rigor and attention to detail.
Read more
Claude Opus 4
May 22, 2025
Claude Opus 4 pushes the frontier in coding, agentic search, and creative writing. We’ve also made it possible to run Claude Code in the background, enabling developers to assign long-running coding tasks for Opus to handle independently.
Read more
Availability and pricing
For business users and consumers who want to collaborate with our most powerful model on complex tasks, Opus 4.6 is available on Claude for Pro, Max, Team, and Enterprise users.
For developers interested in building AI solutions that demand frontier intelligence, Opus 4.6 is available on the Claude Developer Platform natively, and in Amazon Bedrock, Google Cloud’s Vertex AI, and Microsoft Foundry.
Pricing for Opus 4.6 starts at $5 per million input tokens and $25 per million output tokens, with up to 90% cost savings with prompt caching and 50% savings with batch processing. To learn more, check out our pricing page. To get started, use claude-opus-4-6 via the Claude API.
For workloads that need to run in the US, US-only inference is available at 1.1x pricing for input and output tokens. Learn more.
Use cases
Opus 4.6 is a premium model that works best for tasks no prior model could handle and where performance matters most. It’s built for professional software engineering, complex agentic workflows, and high-stakes enterprise tasks.
Opus 4.6 offers hybrid reasoning that allows for instant responses or extended thinking. API users have fine-grained controls for adjusting the overall effort applied to a response, balancing performance with latency and cost. Popular use cases include:
Advanced coding
Opus 4.6 can confidently deliver production-ready code with minimal oversight. It plans carefully, runs for longer with sustained effort, and operates reliably in larger codebases. Strong code review and debugging skills means it catches its own mistakes. Senior engineers can delegate complex tasks with confidence.
AI agents
Opus 4.6 makes agents meaningfully more useful. It handles longer, more complex task chains with fewer errors and less hand-holding, adapting its approach as conditions change. It is ideal for complex, multi-step agentic workflows where reliability and autonomy matter the most.
Enterprise workflows
Opus 4.6 brings a level of consistency that makes AI practical for sustained, high-stakes work. It maintains context and quality across large projects and shows strong performance on everyday tasks like working with documents, spreadsheets, and presentations, running financial analyses, reading charts and diagrams, and doing research. It delivers the precision and consistency that enterprise work demands.
Benchmarks
Claude Opus 4.6 is state-of-the-art across a wide range of coding and agentic capabilities.
Opus 4.6 demonstrates strong performance across many domains. It achieves industry-leading results with 65.4% on Terminal-Bench 2.0. It is also our best computer-using model, reaching 72.7% on OSWorld.

Trust and safety
Extensive testing and evaluation—conducted in partnership with external experts—ensures the release of Opus 4.6 meets Anthropic’s standards for safety, security, and reliability. The accompanying model card covers safety results in depth.
Hear from our customers
Claude Opus 4.6 is a huge leap for agentic planning. It breaks complex tasks into independent subtasks, runs tools and subagents in parallel, and identifies blockers with real precision.
Claude Opus 4.6 is the best model we've tested yet. Its reasoning and planning capabilities have been exceptional at powering our AI Teammates. It's also a fantastic coding model – its ability to navigate a large codebase and identify the right changes to make is state of the art.
Claude Opus 4.6 is the strongest model Anthropic has shipped. It takes complicated requests and actually follows through; breaking them into concrete steps, executing, and producing polished work even when the task is ambitious. For Notion users, it feels less like a tool and more like a capable collaborator.
Claude Opus 4.6 reasons through complex problems at a level we haven't seen before. It considers edge cases that other models miss and consistently lands on more elegant, well-considered solutions. We're particularly impressed with Opus 4.6 in Devin Review, where it's increased our bug catching rates.
Across 40 cybersecurity investigations, Claude Opus 4.6 produced the best results 38 of 40 times in blind ranking against Claude 4.5 models. Each model ran end-to-end on the same agentic harness with up to 9 subagents and 100+ tool calls.
Claude Opus 4.6 is the new frontier on long-running tasks from our internal benchmarks and testing. It's also been highly effective at reviewing code.
Claude Opus 4.6 achieved the highest BigLaw Bench score of any Claude model at 90.2%. With 40% perfect scores and 84% above 0.8, it's remarkably capable for legal reasoning.
Claude Opus 4.6 autonomously closed 13 issues and assigned 12 issues to the right team members in a single day, managing a ~50-person organization across 6 repositories. It handled both product and organizational decisions while synthesizing context across multiple domains, and knew when to escalate to a human.
Claude Opus 4.6 is an uplift in design quality. It works beautifully with our design systems and it's more autonomous, which is core to Lovable's values. People should be creating things that matter, not micromanaging AI.
Both hands-on testing and evals show Claude Opus 4.6 is a meaningful improvement for design systems and large codebases, use cases that drive enormous enterprise value. It also one-shotted a fully functional physics engine, handling a large multi-scope task in a single pass.
Claude Opus 4.6 is the biggest leap I've seen in months. I'm more comfortable giving it a sequence of tasks across the stack and letting it run. It's smart enough to use subagents for the individual pieces.
Claude Opus 4.6 handled a multi-million-line codebase migration like a senior engineer. It planned upfront, adapted its strategy as it learned, and finished in half the time.
Global enterprise clients bring us their hardest problems. Claude Opus 4.6 sets a new bar: reasoning sustains at depth, it self-catches errors, and produces stronger outputs faster. Our 1,300 people can spend less time correcting and more time solving.
We only ship models in v0 when developers will genuinely feel the difference. Claude Opus 4.6 passed that bar with ease. Its frontier-level reasoning, especially with edge cases, helps v0 to deliver on our number one aim: to let anyone elevate their ideas from prototype to production.
Claude Opus 4.6 achieved 85% recall on our biopharma competitive intelligence benchmark—a 12-point lift over baseline (p<0.02; 100% Bayesian probability of improvement)—through autonomous 15-minute discovery loops with zero prompt tuning. On the hardest tasks, the improvement exceeded 30 points. For users who need to find every competitor, not just the obvious ones, this lift makes a critical difference.
Claude Opus 4.6 scores 69% on Terminal Bench 2 in Droid, a clear jump from Opus 4.5. For autonomous software engineering, that's a meaningful step forward.
Our hardest benchmark contains 200 analytical reasoning problems. Claude Opus 4.6 beat every model we've had in production. It's a clear candidate for production traffic.
Claude Opus 4.6 is the best orchestration model we've used for complex multi-agent work. It tracks how sub-agents are doing, proactively steers them, and terminates when needed. That kind of active management is new.
The performance jump with Claude Opus 4.6 feels almost unbelievable. Real-world tasks that were challenging for Opus suddenly became easy. This feels like a watershed moment for spreadsheet agents on Shortcut.
Claude Opus 4.6 is showing gains on solubility editing where previous models couldn't. It's the first improvement we've seen on one of the most challenging tasks in molecular design.
With Claude Opus 4.6, creating financial PowerPoints that used to take hours now takes minutes. We're seeing tangible improvements in attention to detail, spatial layout, and content structuring.
Claude Opus 4.6 generates complex, interactive apps and prototypes in Figma Make with an impressive creative range. The model translates detailed designs and multi-layered tasks into code on the first try, making it a powerful starting point for teams to explore and build ideas.
"Early testing shows Claude Opus 4.6 delivering on the complex, multi-step coding work developers face every day—especially agentic workflows that demand planning and tool calling. This starts unlocking long horizon task at the frontier."
Claude Opus 4.6 just keeps working through problems without needing to be nudged. I ran it headlessly for much longer than any model we've used before. It's significantly more persistent and agentic.
Claude Opus 4.6 represents a meaningful leap in long-context performance. In our testing, we saw it handle much larger bodies of information with a level of consistency that strengthens how we design and deploy complex research workflows. Progress in this area gives us more powerful building blocks to deliver truly expert-grade systems professionals can trust.
Claude in Excel powered by Opus 4.6 represents a significant leap forward. From due diligence to financial modeling, it’s proving to be a remarkably powerful tool for our team - taking unstructured data and intelligently working with minimal prompting to meaningfully automate complex analysis. It’s an excellent example of AI augmenting investment professionals’ capabilities in tangible, time-saving ways.
As one of Canada’s largest institutional investors, we’re constantly innovating and see AI at the forefront of shaping our future. Claude Opus 4.6's enhanced speed, precision, and capacity for complex tasks, like multi-tab analysis in Claude for Excel, unlock exciting possibilities for how we work.”
Claude Opus 4.6 feels noticeably better than Opus 4.5 in Windsurf, especially on tasks that require careful exploration like debugging and understanding unfamiliar codebases. We've noticed Claude Opus 4.6 thinks longer, which pays off when deeper reasoning is needed.
Claude Opus 4.6 is now our default model. It outperforms other models on real workloads, especially data retrieval and tool use.
Opus 4.6 is the best Anthropic model we’ve tested. It understands intent with minimal prompting and went above and beyond, exploring & creating details I didn't even know I wanted until I saw them. It felt like I was working with the model, not waiting on it.
Claude Opus 4.6 excels in high-reasoning tasks, like multi-source analysis, across legal, financial, and technical content. Box’s eval showed a 10% lift in performance, reaching 68% vs. a 58% baseline, and near-perfect scores in technical domains.
Greptile pushes the frontier on long-horizon coding and reasoning tasks. Claude Opus 4.6 marks a large step forward in this space. We are excited to use it.
Anthropic already had the best coding model in the world and Opus 4.6 continues that trajectory. In our internal Auggie bench eval, this is the first time we've consistently seen the model’s coding output truly compare to expert human quality.
Claude Opus 4.6 delivers the depth and structure our users need on complex research queries. It gives thorough, evidence-backed responses that consistently outperform what we've seen from any other model.
Frequently asked questions
We offer Claude models across the spectrum of speed, price, and performance. Opus 4.6 is our most capable model to date. We recommend Opus 4.6 for your most demanding use cases where you need frontier intelligence—particularly production-ready code, sophisticated AI agents, and complex document creation.
Pricing depends on how you want to use Opus 4.6. To learn more, check out our pricing page.
Opus 4.6 is both a standard model and a hybrid reasoning model in one. You can pick when you want the model to answer normally and when you want it to use extended thinking.
Extended thinking mode is best for use cases where performance and accuracy matter more than latency. It significantly improves response quality for complex reasoning tasks, extended agentic work, multi-step coding projects, and deep research, and the thinking summaries help you understand key aspects of the model’s reasoning process.