Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I disagree, Codex always gets stuck and wants to double check and clarify things, its like "dammit just execute the plan and don't tell me until its completely finished"

The output of codex is also not as great. Codex is great at the planning and investigation portion but sucks at execution and code quality.



I've been dealing with this on Codex a lot lately. It confidently wraps up a task, I go to check it's work... and it's not even close.

Then I do a double take and re-read the summary message and realize that it pulled a "and then draw the rest of the owl", seemingly arbitrarily picking and choosing what it felt like doing in that session and what it punted over to "next steps to actually get it running".

Claude is more prone to occasional "cheating" with mocked data or "tbd: make this an actual conditional instead of hardcoded If True" stuff when it gets overwhelmed which is annoying and bad. But it at least has strong task adherence for the user's prompt and doesn't make me write a lawyer-esque contract to avoid any loopholes Codex will use to avoid doing work.


Are you using something like spec-kit?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: