They're OK at it. I usually get more thorough scenario coverage than I would from a mediocre human engineer (which is great!) but less thorough validation and output checking than I would from a good human engineer (which is less great).

But if you have a lot of unit tests and need to make a cross-cutting refactor, you run into the same problem you always have when all your coverage is at the unit level: the unit boundaries are now fundamentally different, and you need to know how to lift and shift all the relevant tests to their new places.

And so far I've been less impressed by the attempts of "agents" at cross-cutting integration testing, since this usually requires selective and clever interface setup and refactoring.

LLMs have a habit of creating one-off things for particular unit test scenarios, and that habit doesn't scale well to that problem.
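
To make the contrast concrete, here's a rough sketch of the kind of shared setup I mean, as opposed to a fresh hand-rolled mock per test. Every name in it (`FakeOrderStore`, `Checkout`, `make_checkout`) is made up for illustration and not taken from any real codebase; it only assumes the code under test talks to its dependencies through a narrow interface.

```python
from dataclasses import dataclass, field


@dataclass
class FakeOrderStore:
    """Hypothetical in-memory stand-in for a real persistence layer."""
    orders: dict = field(default_factory=dict)

    def save(self, order_id: str, total_cents: int) -> None:
        self.orders[order_id] = total_cents

    def total_for(self, order_id: str) -> int:
        return self.orders[order_id]


class Checkout:
    """Hypothetical code under test; it only knows the store's interface."""

    def __init__(self, store):
        self.store = store

    def place(self, order_id: str, prices_cents: list[int]) -> int:
        total = sum(prices_cents)
        self.store.save(order_id, total)
        return total


def make_checkout():
    """Single shared setup point: every test builds its world through here."""
    store = FakeOrderStore()
    return Checkout(store), store


def test_place_persists_total():
    checkout, store = make_checkout()
    assert checkout.place("o-1", [250, 750]) == 1000
    assert store.total_for("o-1") == 1000


def test_orders_are_isolated():
    checkout, store = make_checkout()
    checkout.place("o-1", [100])
    checkout.place("o-2", [200, 300])
    assert store.total_for("o-1") == 100
    assert store.total_for("o-2") == 500
```

The point of the single `make_checkout` seam is that when a refactor moves the boundary (say `Checkout` gets split in two), you change the wiring in one place and the tests that assert on observable results mostly survive, instead of rewriting a pile of per-test mocks.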