Claude Opus 4 Released
Narrative
Strongest Claude model yet. Extended thinking for complex reasoning. 200K-token context window retained. Constitutional AI v3 for improved safety. Agentic task completion.
Reality
Benchmarks excellent: 88.5% on GPQA Diamond, 96.4% on HumanEval. Extended thinking adds 3-10 s of latency per response. Agentic capabilities solid but require careful scaffolding. Safety improvements measurable.
Implication
Reinforced Anthropic's quality positioning. Extended thinking set it apart from instant-response reasoning. But pricing of $15/$75 per million input/output tokens limited adoption against cheaper alternatives. Tension between quality and cost heightened.
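
To make the cost side of that tension concrete, here is a minimal sketch of per-request cost arithmetic. It reads "$15/$75" as USD per million input and output tokens, which is an assumption about the shorthand above. The function name and the token counts are hypothetical, chosen only for illustration.

```python
# Assumption: "$15/$75" means USD per million input / output tokens.
INPUT_USD_PER_MTOK = 15.0
OUTPUT_USD_PER_MTOK = 75.0


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a single request in USD (hypothetical helper)."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical workload: 10,000 input tokens and 2,000 output tokens.
cost = request_cost_usd(10_000, 2_000)
print(f"${cost:.2f} per request")  # prints "$0.30 per request"
```

Under these assumptions, output tokens cost five times as much as input tokens, so workloads that generate long responses feel the pricing most.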