These types of tests are fundamentally flawed. I was able to create perfect cloc...

Drew_ · 2025-11-14T19:25:47 1763148347

The website is regenerating the clocks every minute. When I opened it, Gemini 2.5 was the only working one. Now, they are all broken.

Also, your example is not showing the current time.

system2 · 2025-11-14T19:42:06 1763149326

It wouldn't be hard to tell it to pick up the browser time as the default starting point. That's just one more line in the prompt.

dwringer · 2025-11-14T19:29:28 1763148568

Even Gemini Flash did really well for me[0] using two prompts - the initial query and one to fix the only error I could identify.

> Please generate an analog clock widget, synchronized to actual system time, with hands that update in real time and a second hand that ticks at least once per second. Make sure all the hour markings are visible and put some effort into making a modern, stylish clock face.

Followed by:

> Currently the hands are working perfectly but they're translated incorrectly making them uncentered. Can you ensure that each one is translated to the correct position on the clock face?

[0] https://aistudio.google.com/app/prompts?state=%7B%22ids%22:%...
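For reference, the two things these prompts ask for (hands tied to the real system time, and hands pivoting around the center of the face) come down to a little arithmetic. Below is a rough sketch of that arithmetic, not the code any of the models produced. The names `hand_angles` and `hand_tip`, and the assumption of screen coordinates with y growing downward, are illustrative only.

```python
import math
from datetime import datetime


def hand_angles(now: datetime) -> tuple[float, float, float]:
    """Return (hour, minute, second) hand angles in degrees, clockwise from 12."""
    # The second hand covers 360 degrees in 60 seconds: 6 degrees per second.
    second_angle = now.second * 6.0
    # The minute hand moves 6 degrees per minute plus 0.1 degree per second.
    minute_angle = now.minute * 6.0 + now.second * 0.1
    # The hour hand moves 30 degrees per hour and 0.5 degree per minute,
    # plus a small per-second drift.
    hour_angle = (now.hour % 12) * 30.0 + now.minute * 0.5 + now.second * (30.0 / 3600.0)
    return hour_angle, minute_angle, second_angle


def hand_tip(angle_deg: float, cx: float, cy: float, length: float) -> tuple[float, float]:
    """Return the tip of a hand that pivots around the clock center (cx, cy).

    Assumes screen coordinates (y grows downward), so 0 degrees points up.
    """
    rad = math.radians(angle_deg)
    return cx + length * math.sin(rad), cy - length * math.cos(rad)


if __name__ == "__main__":
    # Read the actual system time instead of starting at midnight/noon.
    hour_a, minute_a, second_a = hand_angles(datetime.now())
    print(f"hour={hour_a:.2f} minute={minute_a:.2f} second={second_a:.2f}")
```

Anchoring every hand at the same `(cx, cy)` and only rotating it is what keeps the hands centered; translating each hand separately is the kind of thing that produces the off-center result described in the second prompt.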

allenu · 2025-11-14T19:26:34 1763148394

I don't think this is a serious test. It's just an art piece to contrast different LLMs taking on the same task, and against themselves since it updates every minute. One minute one of the results was really good for me and the next minute it was very, very bad.

jmdeon · 2025-11-14T19:21:43 1763148103

Aren't they attempting to also display current time though? Your share is a clock starting at midnight/noon. Kimi K2 seems to be the best on each refresh.

sinak · 2025-11-14T19:17:58 1763147878

How are they flawed?

earthnail · 2025-11-14T19:19:55 1763147995

The results are not reproducible, as evidenced by the parent poster.

micromacrofoot · 2025-11-14T19:27:44 1763148464

isn't that kind of the point of non-determinism?

earthnail · 2025-11-14T22:49:04 1763160544

No. Good nondeterministic models reproducibly generate equally desirable output - not identical output, but interchangeable.
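One way to make "reproducibly desirable, not identical" concrete is to sample the same prompt many times and measure how often the result passes an acceptance check, rather than comparing outputs to each other. The sketch below assumes a hypothetical `generate` callable (standing in for a model call) and a hypothetical `is_acceptable` check; neither is part of the site being discussed.

```python
import random
from typing import Callable


def pass_rate(
    generate: Callable[[], str],
    is_acceptable: Callable[[str], bool],
    trials: int = 20,
) -> float:
    """Return the fraction of independent generations that pass the check."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    passes = sum(1 for _ in range(trials) if is_acceptable(generate()))
    return passes / trials


if __name__ == "__main__":
    # Stand-in for a nondeterministic model: outputs differ from run to run,
    # and only some of them are usable. Purely illustrative.
    rng = random.Random(0)

    def fake_model() -> str:
        return rng.choice(["working clock A", "working clock B", "broken clock"])

    rate = pass_rate(fake_model, lambda out: out.startswith("working"), trials=100)
    print(f"pass rate: {rate:.2f}")
```

Under this framing, two different but working clocks both count as passes, while a model that alternates between a good clock and a broken one scores poorly even though each individual run is "valid" nondeterminism.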

micromacrofoot · 2025-11-14T23:33:45 1763163225

oh I see, thank you for clarifying