I'm not saying this is a paid endorsement but the internet is dead and I wonder w...

neya · 2025-10-21T02:51:41 1761015101

For what it's worth, I'm not affiliated with OpenAI (you can verify by my comment history [1] and account age) and I agree with the top comment. I do Elixir consulting primarily and nothing beats OpenAI's model at the moment for Elixir. Previously, their o3 models were quite decent. But GPT-5 is really damn good. Claude Code will unnecessarily try to complicate a problem's solution.

[1] https://news.ycombinator.com/item?id=45491842

dns_snek · 2025-10-21T06:00:59 1761026459

This is hilarious because for me Cursor with GPT-5 often generates Elixir that isn't even syntactically correct. It needs to be told not to use return statements, and not to try to index linked lists as arrays. Code is painfully non-idiomatic to the point of being borderline useless even in the simpler cases. Claude Sonnet 4.5 is marginally better, but not by much. Any ambitious overhaul, refactoring or large feature ends in tears and regret.

Neither tool is worth paying even $20 a month for when it comes to Elixir, that's how little value I get out of them, and it's not because I can't afford it.

neya · 2025-10-21T06:13:47 1761027227

Gemini is also good, I recommend you try it as well. Usually my workflow is GPT-5 as the primary, but yes, as you mentioned, it is not perfect. But Gemini surprisingly complements GPT-5, for my use cases at least. It's good at LiveView-related stuff, whereas GPT-5 is stronger on the architecting side.

Both LLMs suck if you let it do everything without architecting the solution first. So, I always specify the high-level architecture of how I want something, specifically around how the data should flow and be consumed and what I really want to avoid. With these constraints and a bit of prompt engineering, they are actually quite good.

dns_snek · 2025-10-21T10:27:49 1761042469

> Both LLMs suck if you let it do everything without architecting the solution first.

I always do that. Last time I spent an hour planning, going through the requirements, having it ask questions, only for it to completely botch the implementation.

Sure, I can treat it like a junior and spend 2-3 hours planning everything down to the individual function level and it's going to implement it alright. The code will work but it won't be idiomatic. Or I can just do it myself in 3 hours total to a much higher standard of quality, without gambling on a successful outcome, while simultaneously improving my own knowledge, understanding, and abilities.

No matter how I try to use them, agentic coding is always a net negative on my productivity (disposable one-off scripts excluded).

cpursley · 2025-10-21T18:05:05 1761069905

Try tidewave.ai, Jose made it (mcp thingy). Works well with GPT-5.

fragmede · 2025-10-21T21:10:49 1761081049

btw your website doesn't load

cpursley · 2025-10-21T22:03:36 1761084216

It's not my website, but I do use the free mcp with CC.

https://tidewave.ai

fragmede · 2025-10-21T22:20:25 1761085225

no i mean https://chasepursley.com

cpursley · 2025-10-22T08:50:02 1761123002

Ah, thanks!

johnisgood · 2025-10-21T07:28:18 1761031698

Personally I found Claude to be relatively OK at Elixir, with a lot of hand-holding. My main problem when it comes to Elixir and Erlang is the sheer number of files. For that kind of boilerplate, it is good. Otherwise just use "erlang-skels.el" with Emacs. :D

Palmik · 2025-10-21T07:42:32 1761032552

I'm not saying this was a paid comment, but if we're going to speculate, we could just as easily ask what Anthropic would pay, if they could, to drown out a strongly pro-OpenAI take sitting at the top of their own promotional HN thread.

That said, you're right that the broader internet (Reddit especially) is heavily astroturfed. It's not unusual to see "What's the best X?" threads seeded by marketers, followed by a horde of suspiciously aligned comments.

But without actual evidence, these kinds of meta comments like yours (and mine) are just cynical noise.

vietvu · 2025-10-21T03:32:18 1761017538

I've heard this opinion a lot recently. Codex is getting better and Claude is getting worse, so it had to happen sooner or later. Well, it's a competition, so I'm waiting for Claude to catch up. The web Claude Code is good, but they really need to fix their quota. It's unusable. I would choose a worse model (maybe at 90%) that has a better quota and is usable. Not to mention GPT-5 and GPT-5-codex seem to have caught up or are even better now.

hluska · 2025-10-21T06:24:01 1761027841

Are you really going to call someone a shill? I’d argue that you’re why the internet is dying - a million options and you had to choose the most offensive?

brigandish · 2025-10-21T05:35:05 1761024905

The only way to tell human from AI now is disagreeableness, it’s the one thing the GPTs refuse to do. I can’t stand their cloying sycophancy but at least it means that serial complainers will gain some trust, at least for as long as Americans are leading the hunt and deciding to baby us.

dr_dshiv · 2025-10-21T09:25:00 1761038700

On the other hand, formulaic disagreement underpins most of modern media; made by humans or not, it ends up as dehumanizing as a train wreck.

visiondude · 2025-10-21T01:46:21 1761011181

I completely agree with this. The amount of unprompted “I used to love Claude Code but now…” content that follows the exact same pattern feels really off. All of these people post without any prompts for comparison, and OP even refused to share specifics so we have to take his claim as ‘trust me bro’

loveparade · 2025-10-21T02:05:48 1761012348

It doesn't feel off to me because that's the exact experience I've had as well. So it's unsurprising to me that many other people share that experience. I'm sure there is a bunch of paid promotion going on for all kinds of stuff on HN (especially what gets onto the front page), but I don't think this is one of those cases.

visiondude · 2025-10-21T02:46:42 1761014802

Oh cool, can you share concrete examples of times Codex outperformed Claude Code? In my experience both tools need to be carefully massaged with context to fulfill complex tasks.

typpilol · 2025-10-21T03:49:01 1761018541

In my experience, Claude wants to try and finish everything as quickly as possible, whereas Codex is happy to take 5x as long.

The best answer is each has its uses. Using codex to do bulk edits is dumb because it takes forever, etc etc

loveparade · 2025-10-21T08:52:59 1761036779

I don't really see how examples are useful because you're not going to understand the context. My prompt may be something like "We recently added a new transcription backend api (see recent git commits), integrate it into the service worker. Before implementing, create a detailed plan, ask clarifying questions, and ask for approval before writing code"

Does that help you? I doubt it. But there you go.

hluska · 2025-10-21T06:38:32 1761028712

Nobody has to give you examples. People can express opinions. If you disagree, that’s fine but requesting entire prompt and response sets is quite demanding. Who are you to be that demanding?

dns_snek · 2025-10-21T10:08:30 1761041310

> Who are you to be that demanding?

Let's call it the skeptical public? We've been listening to a group of people rave about how revolutionary these tools are, how they're able to perform senior level developer work, how good their code is, and how they're able to work autonomously through the use of sub-agents (i.e. vibe coding), without ever providing evidence that would support any of those grandiose claims.

But then I use these tools myself[1] and I speak to real developers who have used them, and our evaluations center around lukewarm: good at straightforward, junior-level tasks, or good for prototyping, or good for initially generating tests, or good for answering certain types of questions, or good for one-off scripts. But approximately none of them would trust these LLMs to implement a more complex feature like a mid-level or senior developer would without very extensive guidance and hand-holding that takes longer than just doing it ourselves.

Given the overwhelming absence of evidence, the most charitable conclusion I can come to is that the vast majority of people making these claims have simply gone from being 0.2X developers to being 0.3X developers who happen to generate 5X more code per unit of time.

[1] e.g. my reply to https://news.ycombinator.com/item?id=45651948

ssk42 · 2025-10-21T14:21:46 1761056506

Context engineering is a critical part of being able to use the tool. And it's ok to not understand how to use a new tool. The different models combined with different stacks require different ways of grappling with the technology. And it all changes! It sucks that you've tried it for your stack (Elixir, whatever that is) in your way and it was disappointing.

To me, the tool inherently makes sense and vibes with my own personality. It allows me to write code that I would otherwise procrastinate on. It allows me to turn ideas into reality, so much faster.

Maybe you're just hyper-focused on metrics? Productivity, especially when dealing with code, is hard to quantify. This is a new paradigm and so it's also hard to compare apples to oranges. Does this help?

dns_snek · 2025-10-21T15:48:15 1761061695

So your take is that every real software developer I know is simply bad at using this magical tool that performs on the level of a mid-to-senior level software engineer in the hands of a few chosen ones? But the chosen ones never build anything in public where it can be observed, evaluated, and critiqued. How unfortunate is that?

The people I talked to use a wide variety of environments and their experience is similar across the board, whether they're working in Node.js, React, Vue, Ruby, PHP, Java, Elixir, or Python.

> Productivity, especially when dealing with code, is hard to quantify.

Indeed, that's why I think most people claiming these obscene benefits are really bad at evaluating their own performance and/or started from a really low baseline.

I always think back to a study I read a while ago where people without ADHD were given stimulant medication and reported massive improvements in productivity but objective measurements showed that their real-world performance was equal to, or slightly lower than their baseline.

I think it's very relevant to the psychology behind this AI worship. Some people are being elevated from a low baseline whilst others are imagining the benefits.

ssk42 · 2025-10-21T16:44:29 1761065069

People do build in public from vibe-coding, absolutely. This tells me that you have not done your research and have just gone off general guesses or pessimism/frustration from not knowing how to use the tool. The easiest way to find this on GitHub is to look for where Claude is a contributor. Claude will tag itself in the PR or pushes. Another easy way that I've seen come up is the whole "BuildInPublic" tag in the Threads app, which has been inundated with vibe coding. While these might not be in your algorithm, they do exist. You'll be able to see that while there is a lot of crud, there are also products being made are actually versatile, complex, and completely vibe-coded. Most people are not making up these stories. It's very real.

dns_snek · 2025-10-21T17:50:37 1761069037

Of course people vibe-code in public - I was clear that I wanted to see evidence of these amazing productivity improvements. If people are building something decent but it takes them 3 or 4 times as long as it would take me, I don't care. That's great for them but it's worthless to me because it's not evidence of a productivity increase.

> there are also products being made are actually versatile, complex, and completely vibe-coded.

Which ones? I'm looking for repositories that are at least partially video-documented to see the author's process in action.

hattmall · 2025-10-21T02:14:46 1761012886

I'm not saying it is, but if ANYTHING was the exact combination of prerequisites to be considered paid promotion on HN, this is the type of comment it would be.

hluska · 2025-10-21T06:42:14 1761028934

So, let’s see if I get this straight. A highly identifiable person whose company sells a security product is the ideal shill? That doesn’t make any sense whatsoever. On the other hand, someone with a different opinion makes complete sense.

hattmall · 2025-10-29T04:37:05 1761712625

LeBron James endorses Kia. Multi-billion dollar companies can afford and benefit from highly identifiable people, so I don't really think that argument makes it any less likely to be an endorsement.

dbbk · 2025-10-21T13:24:06 1761053046

You're absolutely right!

a_victorp · 2025-10-21T00:58:19 1761008299

This is an underrated comment

h34t · 2025-10-21T10:03:58 1761041038

to be fair, they spent a lot on compute.