
Stockfish also tests improvements by playing huge numbers of games:

> Changes to game-playing code are accepted or rejected based on results of playing of tens of thousands of games on the framework against an older "reference" version of the program

https://en.wikipedia.org/wiki/Stockfish_(chess)#Fishtest

Think about it: a human makes a small tweak, then runs a lot of games with it and decides whether it's worth keeping.
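
A minimal sketch of that accept/reject loop (a toy model for illustration: play_game is a hypothetical stand-in for a full engine-vs-engine game, and the real Fishtest accepts patches via a sequential probability ratio test rather than a fixed 50% cutoff):

    import random

    def play_game(candidate_elo, reference_elo):
        """Hypothetical stand-in for one engine-vs-engine game.
        Returns 1 for a candidate win, 0.5 for a draw, 0 for a loss."""
        if random.random() < 0.8:  # assume most games at this level are drawn
            return 0.5
        # Decisive game: win probability follows the Elo logistic curve.
        expected = 1.0 / (1.0 + 10.0 ** ((reference_elo - candidate_elo) / 400.0))
        return 1.0 if random.random() < expected else 0.0

    def test_patch(candidate_elo, reference_elo=2800, games=20000):
        """Keep the tweak only if it scores above 50% over many games."""
        score = sum(play_game(candidate_elo, reference_elo) for _ in range(games))
        return score / games > 0.5  # Fishtest uses SPRT bounds instead

    print("accept" if test_patch(2805) else "reject")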

That's basically a crude, hand-driven version of the gradient descent used to train a neural network.

Now imagine instead of a human coming up with a small tweak, you randomly search for it and take a holistic view of the board instead of micro-heuristics. Oh, you just invented AlphaZero :)
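
For what that random search might look like, here is a toy perturb-and-test hill climber (every name and the match_score proxy are invented for illustration; AlphaZero itself actually learns by self-play reinforcement learning, not this kind of loop):

    import random

    # Hidden "ideal" weights the search is trying to discover: a toy
    # stand-in for "wins more games against the reference".
    IDEAL = {"pawn": 100, "knight": 325, "bishop": 335, "rook": 510, "queen": 920}

    def match_score(params):
        """Toy proxy for a long test match: closer to IDEAL scores higher."""
        return -sum((params[k] - IDEAL[k]) ** 2 for k in params)

    params = {"pawn": 100, "knight": 300, "bishop": 300, "rook": 500, "queen": 900}
    best = match_score(params)

    for _ in range(10000):
        candidate = dict(params)             # random tweak, no human required
        key = random.choice(list(candidate))
        candidate[key] += random.choice((-5, 5))
        score = match_score(candidate)
        if score > best:                     # keep it only if it tests stronger
            params, best = candidate, score

    print(params)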

BTW, since chess search is highly parallel, you can run Stockfish on a computer cluster. That way you can match the GPU's raw compute and set up a balanced match in computing power (of course, it will be totally unbalanced power-consumption-wise). I'm not convinced Stockfish would win with equal computing power.



> Now imagine instead of a human coming up with a small tweak, you randomly search for it and take a holistic view of the board instead of micro-heuristics. Oh, you just invented AlphaZero :)

I'm fairly certain that the human brain does not work off of FP16 floating-point numbers, with back-propagated errors calculated via partial derivatives to set the weights of our individual neurons. :-)

Artificial neural networks are fascinating self-learning machines. But remember: they're artificial. There's nothing "human" about LeelaZero, AlphaGo, or any other CNN, especially since AlphaGo / LeelaZero are augmented with an exceptionally powerful MCTS search. No human counts the number of positions they visit and "balances" each node; nobody does that. MCTS is an extremely powerful search algorithm, and an equally artificial construct.
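
(The "counting and balancing" above refers to the UCT selection rule applied at every MCTS node: exploit moves with a high average result, but explore under-visited ones. A minimal sketch of plain UCT; AlphaZero's variant, PUCT, additionally folds the network's policy prior into the exploration term:)

    import math

    def uct_select(children):
        """Pick the index of the child maximizing the UCT score.
        children: list of (wins, visits) pairs for each candidate move."""
        total_visits = sum(v for _, v in children)
        c = 1.414  # exploration constant; sqrt(2) is the classic choice

        def score(wins, visits):
            if visits == 0:
                return float("inf")  # always try unvisited moves first
            return wins / visits + c * math.sqrt(math.log(total_visits) / visits)

        return max(range(len(children)), key=lambda i: score(*children[i]))

    # Three candidate moves as (wins, visits):
    print(uct_select([(6, 10), (3, 4), (0, 0)]))  # -> 2, the unvisited move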

> I'm not convinced Stockfish will win with equal computing power.

Ehh? The results are 10-Leela / 8-Stockfish / 82 draws. It was an exceptionally close set of 100 games.

Note that Stockfish works with a global hash table of chess positions that it shares between threads. This methodology works with a "low" number of threads (i.e. 16, maybe even 64). But it absolutely will not work at GPU scale (~16,384+ SIMD threads on a Vega 64, or similar GPUs).

It is not going to be an easy job to "port" Stockfish properly to a GPU-based system, or even to a cluster of 100 racked-up computers. How do you efficiently share a global hash table across 100 clustered machines?

I mean, you simply cannot. Stockfish isn't designed to scale that high. It is innately a single-node design, constrained by RAM.
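
(For context on why the shared table works between threads at all: engines use tricks along the lines of Hyatt-style "lockless hashing", where the key is XORed with the data so that a torn write from another thread is detected on read. A rough Python sketch of the idea; real engines do this in C++ with two machine-word writes, and the crucial point is that it depends on shared memory, which is exactly what a cluster lacks:)

    TABLE_SIZE = 1 << 20
    table = [(0, 0)] * TABLE_SIZE  # each slot holds (key ^ data, data)

    def tt_store(key, data):
        table[key % TABLE_SIZE] = (key ^ data, data)

    def tt_probe(key):
        check, data = table[key % TABLE_SIZE]
        if check ^ data == key:  # checksum fails on empty, replaced, or torn entries
            return data
        return None  # treat anything suspicious as a cache miss

    tt_store(0xDEADBEEF, 12345)
    print(tt_probe(0xDEADBEEF))  # -> 12345
    print(tt_probe(0xCAFEBABE))  # -> None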

---------

The fact of the matter is: modern systems need a higher form of scaling. I think LeelaZero / AlphaZero are "cheating", in that they've found methodologies that allow the huge amount of GPU compute to be used easily.

I think this is a wakeup call: algorithms need to start targeting the GPU more carefully. Heavy-compute applications definitely need to start thinking about how to scale to 16,000+ threads and run on GPU-like systems.


You keep on missing the point.

Today's Stockfish running on a low-end computer would still beat Stockfish from many years ago running on a much more powerful one, because its evaluation function is significantly better.

Chess is not strictly about computing power, and neural-network evaluation functions are vastly better.
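
To make "evaluation function" concrete: a classical hand-tuned eval is roughly a weighted sum of hand-picked features, while a neural eval learns that weighting from data. A toy material-only example (real classical evals add piece-square tables, mobility, king safety, and much more):

    # Material count in centipawns from White's point of view.
    PIECE_VALUES = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900,
                    "p": -100, "n": -320, "b": -330, "r": -500, "q": -900}

    def evaluate(fen_board):
        """fen_board: the piece-placement field of a FEN string."""
        return sum(PIECE_VALUES.get(ch, 0) for ch in fen_board)

    # Starting position: material is balanced.
    print(evaluate("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"))  # -> 0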


Would you please edit the uncivil bits out of your comments here? Lines like "You keep on missing the point" and "You know very well" just add acid to the mix and are against the site guidelines.

https://news.ycombinator.com/newsguidelines.html


> You keep on missing the point.

I think you misunderstand. I absolutely understand your point. I'm explicitly rejecting your point. There's a crucial difference.

The evidence laid out in this test does not necessarily lead to the conclusion you have here. There are a number of confounding factors that I'd like to investigate if I had more time.

True, your point is ONE possible story for what is going on here. But alas, my instincts suggest something else is going on. I think CNNs have simply been extremely well optimized for the GPU platform, and that indeed, they are one of the few algorithms that run extremely well on a GPU.

I'm curious how a well-put together "classical" chess AI would work if it were ported to a GPU. I understand that no such chess AI has ever been written, but that doesn't change my curiosity.

-------

EDIT:

> Chess is not strictly about computing power, and neural-network evaluation functions are vastly better.

I just thought of a way to test this assertion. Instead of porting Stockfish to a GPU, port LeelaZero to a CPU. Run the neural net on the same hardware Stockfish uses, and see who wins.

That way, Stockfish keeps its centralized hash table / Lazy SMP algorithm (which cannot be scaled), while LeelaZero runs with the same compute power that Stockfish has.
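
A sketch of what an equal-compute harness could look like (DummyEngine and its search method are hypothetical stand-ins for either engine): both sides get an identical wall-clock budget per move on the same CPU.

    import time

    class DummyEngine:
        """Hypothetical stand-in for Stockfish or a CPU build of LeelaZero."""
        def search(self, position, depth):
            time.sleep(0.01 * depth)  # pretend deeper searches cost more
            return "bestmove at depth %d" % depth

    def timed_move(engine, position, budget_s):
        """Iterative deepening under a fixed wall-clock budget, so both
        engines consume the same compute on the same hardware."""
        deadline = time.monotonic() + budget_s
        best = None
        for depth in range(1, 100):
            best = engine.search(position, depth)
            if time.monotonic() >= deadline:
                break  # a real harness would abort the search mid-iteration
        return best

    print(timed_move(DummyEngine(), "startpos", budget_s=0.1))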


It was 14 wins for Leela and 7 wins for Stockfish, which is a pretty large margin for the level they are playing at. Anyway, people have thought about using GPUs for chess engines, but it was difficult to make work (https://chess.stackexchange.com/questions/9772/cpu-v-gpu-for...). GPUs and CPUs have fundamentally different architectures, and comparing them on raw ops per second, without taking their capabilities into account, misses the point.


You know very well that LeelaZero was designed to run on a GPU, just like Stockfish was designed to run on a CPU.

You can also do the reverse: it's easy to naively make Stockfish run on a GPU, it will just perform terribly, since its algorithms will not utilize the GPU properly.


I mean, that's what needs to be done for the test to work.

Either Stockfish's style of algorithm needs to be ported to a GPU (and I'm arguing it's possible to do so efficiently, though a number of unsolved problems would have to be solved first).

Or... Leela Zero needs to be ported to a CPU.

I'm not saying a naive port: I mean a port where the programmer spends a good bit of effort optimizing the implementation. That way it's fair. Those are the two hypotheticals that would make for a "fair" test.

I'm personally more interested in the case of Stockfish somehow ported to a GPU, mostly because it's never been done before. I mean, I don't want to do it myself, but if anyone ever did it, I'd be very interested in reading how they solved all of the issues. :-)



