When you say Claude Code, what model do you refer to? CC with Opus still outperforms Codex (gpt-5-codex) for me for anything I do (Rust, computer graphics-related).
However, Anthropic severely restricted Opus use for Max plan users about 10 days ago (a 12-fold cut, from 40h/week down to 5h/week) [1].
Sonnet is a vastly inferior model for my use cases (though it still frequently writes better Rust code than Codex). So now I use Codex for planning and Sonnet for writing the code. However, I usually need about 3--5 loops of Codex reviewing and Sonnet fixing, rinse & repeat.
Before, I could one-shot with Opus, review the result myself directly, and do one polish run after my review (also via Opus). That was possible from June to mid-October, but no more.
Agreed that Opus is stronger than Sonnet 4.5 and GPT-5 High. It's the bitter pill: bigger, more expensive models are simply "smarter", even if that doesn't always show in synthetic benchmarks. Similarly with o1-pro (now almost a year old, an eternity in this space) vs GPT-5 High. There's also GPT-5 Pro now, which comes at an API cost of $120/M output tokens, and it is also noticeably smarter, just like Opus.
They all like to push synthetic benchmarks for marketing, but to me there's zero doubt that both Anthropic and OpenAI are well aware that they're not representative of logical thinking and creativity.
[1] https://github.com/anthropics/claude-code/issues/8449