Majromax's comments | Hacker News

> people trolled just as hard as anyone does now,

Trolling had (has) a different character in smaller, more private forums: it tends towards more effort. A low-effort troll just gets banned and loses their platform, so the troll needs to at least ride the line of legitimacy. Drawing the line back to Usenet, the sheer effort that went into some trolling garnered respect if not necessarily acceptance.

Drive-by interactions reward volume since the 'game' isn't repeated. Curated social media feeds like Twitter are even worse; the troll has their own audience predisposed towards acceptance and the victim is just set-dressing.

I analogize this to in-person interactions: ostracization is mutually costly. A small group loses a member who was at least making a 'warm body' contribution, but the ostracized person loses a whole set of social benefits.


> We already live in the world where hackers are pwning refrigerators, I can't wait for prompt injection attacks on animatronic cartoon characters.

It's not necessarily AI controlling the communication. Disney has long had 'puppet' characters whose communication is controlled by a human behind the scenes.


They're already using similar tech for the Mickey meet and greets and the Galaxy's Edge stormtroopers. The details aren't public, but it seems to be a mix of complex dialogue trees with interrupts or context switches, controlled in real time by the actor or operator.

It's not even complex, just some pre-recorded lines that the character can trigger via finger movements. You can watch them do it, and it becomes very obvious.

That's interesting; if you're doing human in the loop, I would have thought it'd be easier to just do voice swapping. Or did the technology not quite line up?

Someone in this thread linked the Defunctland video essay on these characters; I highly recommend watching it, since it goes into this in detail.

But the main reason is that there's a lot of brand image on the line with these interactions: someone putting on a voice or using a voice changer could make a mistake. Disney instead has a conversation tree of pre-recorded voice lines that a remote operator can control. Much harder to mess up.


And possibly more importantly, much easier to keep doing for hours on end. There's no need for a highly trained actor.

AFAIK even the stormtroopers use prerecorded stuff when they speak; they make a specific hand gesture which triggers the voice.

Yep, in this case everything is controlled through a Steam Deck.
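As a toy sketch of the general idea, here's what an operator-driven dialogue tree of pre-recorded lines might look like (all node names, clip files, and triggers are invented for illustration; this is not Disney's actual system):

    from dataclasses import dataclass, field

    @dataclass
    class Node:
        clip: str  # pre-recorded line to play
        branches: dict[str, "Node"] = field(default_factory=dict)  # trigger -> next node

    # Tiny invented tree: the operator (or a glove gesture) picks which canned
    # line plays next; no live voice ever goes out to the guest.
    goodbye = Node("goodbye_guest.wav")
    joke = Node("tell_joke.wav", {"bye": goodbye})
    greet = Node("hello_guest.wav", {"joke": joke, "bye": goodbye})

    def run(start: Node, triggers: list[str]) -> list[str]:
        """Play the starting clip, then follow each recognized operator trigger."""
        node, played = start, [start.clip]
        for t in triggers:
            if t in node.branches:
                node = node.branches[t]
                played.append(node.clip)
        return played

    print(run(greet, ["joke", "bye"]))  # ['hello_guest.wav', 'tell_joke.wav', 'goodbye_guest.wav']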

> When it comes to AI im more of a luddite at the moment, things change like every 6 months when it comes to prompting the models. [...] So taking away the award is kind of weak given people enjoyed the game.

To nitpick: the independent game awards are the Luddites here. The Luddites were a protest movement, not just a group of people unfamiliar with technology.

In the historical context that's apparently become appropriate again, Luddites violently protested the disruptive introduction of new automation in the textile industry that they argued led to reduced wages, precarious employment, and de-skilling.


What happened to the Luddites? Did they end up upskilling and living happily ever after?

> My question is "what happens if you scale up to attain the same levels of accuracy throughout? Will it still be as efficient?"

I've done some work in this area, and the answer is probably 'more efficient, but not quite as spectacularly efficient.'

In a crude, back-of-the-envelope sense, AI-NWP models run about three orders of magnitude faster than notionally equivalent physics-based NWP models. Those three orders of magnitude divide approximately evenly among three factors:

1. AI-NWP models produce much sparser outputs compared to physics-based models. That means fewer variables and levels, but also coarser timesteps. If a model needs to run 10x as often to produce an output every 30m rather than every 6h, that's an order of magnitude right there.

2. AI-NWP models are "GPU native," while physics-based models emphatically aren't. Hypothetically running physics-based models on GPUs would gain most of an order of magnitude back.

3. AI-NWP models have far higher arithmetic intensity than physics-based NWP models, since the former are "matrix-matrix multiplications all the way down." Traditional NWP models perform relatively little work per grid point in comparison, which puts them on the wrong (badly memory-bandwidth-limited) side of the roofline plot.

I'd expect a full-throated AI-NWP model to give up most of the gains from #1 (to have dense outputs), and dedicated work on physics-based NWP might close the gap on #2. However, that last point seems much more durable to me.
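Put as crude arithmetic (the numbers below are just the illustrative orders of magnitude from the breakdown above, not benchmarks of any real model pair):

    import math

    # Rough order-of-magnitude factors; none of these are measurements.
    factors = {
        "sparser outputs (fewer fields/levels, ~6 h vs ~30 min steps)": 10,
        "GPU-native vs CPU-oriented physics code": 10,
        "arithmetic intensity (matmuls vs memory-bound stencils)": 10,
    }
    print(f"today: ~{math.prod(factors.values()):,}x")  # ~1,000x

    # Hypothetical future: dense outputs give back most of #1, and a good GPU
    # port of the physics model closes most of #2, leaving mainly #3.
    factors["sparser outputs (fewer fields/levels, ~6 h vs ~30 min steps)"] = 1
    factors["GPU-native vs CPU-oriented physics code"] = 2
    print(f"then: ~{math.prod(factors.values()):,}x")  # ~20x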


> Which is surprising to me because I didn't think it would work for this; they're bad at estimating uncertainty for instance.

FGN (the model behind 'WeatherNext 2'), FourCastNet 3 (NVIDIA's offering), and AIFS-CRPS (the model from ECMWF) have all moved to training on whole ensembles, using a continuous ranked probability score (CRPS) loss function. Minimizing the CRPS minimizes the integrated squared difference between the predicted and true cumulative distribution functions, so it effectively teaches the model to have uncertainty proportional to its expected error.
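As a rough illustration of what that loss rewards, here is a minimal NumPy sketch of one standard ensemble CRPS estimator (the ensembles and numbers are toys, not anything from these models' actual training code):

    import numpy as np

    def ensemble_crps(members: np.ndarray, obs: float) -> float:
        """One standard estimator for a scalar observation:
        CRPS ~= mean|X - y| - 0.5 * mean|X - X'|.
        It rewards closeness to the truth while penalizing both overconfident
        (too narrow) and washed-out (too wide) ensembles."""
        members = np.asarray(members, dtype=float)
        error_term = np.mean(np.abs(members - obs))
        spread_term = np.mean(np.abs(members[:, None] - members[None, :]))
        return error_term - 0.5 * spread_term

    # Toy check: an ensemble whose spread matches its error beats an
    # overconfident one centred in the same place.
    rng = np.random.default_rng(0)
    calibrated = ensemble_crps(rng.normal(0.0, 1.0, 100), obs=0.8)
    overconfident = ensemble_crps(rng.normal(0.0, 0.05, 100), obs=0.8)
    print(calibrated, overconfident)  # the calibrated ensemble scores lower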

GenCast is a more classic diffusion-based model trained on a mean-squared-error-type loss function, much like any of the image diffusion models. Nonetheless it performed well.


> Are there any other benefits? Like is there a reason to believe it could be more accurate than a physics model with some error bars?

Surprisingly, the leading AI-NWP forecasts are more accurate than their traditional counterparts, even at large scales and long lead times (i.e. the 5-day forecast).

The reason for this is not at all obvious, to the point I'd call it an open question in the literature. Large-scale atmospheric dynamics are a well-studied domain, so physics-based models essentially have to be getting "the big stuff" right. It's reasonable to think that AI-NWP models are doing a better job at sub-grid parameterizations and local forcings because those are the 'gaps' in traditional NWP, but going from "improved modelling of turbulence over urban and forest areas" (as a hypothetical example) to "improvements in 10,000 km-scale atmospheric circulation 5 days later" isn't as certain.


Google Research and Google DeepMind also build their models for Google's own TPU hardware. It's only natural for them, but weather centres can't buy TPUs and can't / don't want to be locked to Google's cloud offerings.

For GenCast ('WeatherNext Gen', I believe), the repository provides instructions and caveats (https://github.com/google-deepmind/graphcast/blob/main/docs/...) for inference on GPU, and it's generally slower and more memory-intensive. I imagine that FGN/WeatherNext 2 would also have similar surprises.

Training is also harder. DeepMind has only open-sourced the inference code for its first two models, and getting a working, reasonably-performant training loop written is not trivial. NOAA hasn't retrained its weights from scratch, but the fine-tuning they did re: GFS inputs still requires the full training apparatus.


> "it's more efficient if you ignore the part where it's not"

Even when you include training, the payoff period is not that long. Operational NWP is enormously expensive because high-resolution models run under soft real-time deadlines; having today's forecast tomorrow won't do you any good.

The bigger problem is that traditional models have decades of legacy behind them, and getting them to work on GPUs is nontrivial. That means that in a real way, AI model training and inference comes at the expense of traditional-NWP systems, and weather centres globally are having to strike new balances without a lot of certainty.


> In addition to Mastermind, Wordle also falls into the same category.

> Optimal play to reduce the search space in both follow the same general pattern - the next check should satisfy all previous feedback, and included entries should be the most probable ones, both of those previously tested, and those not.

The "next check should satisfy all previous feedback" part is not exactly true. That's hard-mode wordle, but hard mode is provably slower to solve than non-hard-mode (https://www.poirrier.ca/notes/wordle-optimal/) where the next guess can be inconsistent with previous feedback.


> Modern tools have made the citation process more comfortable,

That also makes some of those errors easier to make. A bad auto-import of paper metadata can silently screw up some of the publication details, and replacing an early preprint with the peer-reviewed article of record takes annoying manual intervention.

