Hacker News

What is so good about multi-threading?


For the kind of applications I work on (compilers and interpreters) we have large, mutable data structures. The best way we know as an industry to parallelise operations on these data structures is via shared memory, which means either threads or processes with shared memory mapped into them; the two are effectively equivalent, since you need the same synchronisation either way.
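A minimal sketch of that shared-memory model, in Ruby for illustration (the structure and names here are invented, not from any real compiler): several threads mutate one shared structure, with a Mutex providing the synchronisation that both threads and shared-memory processes need.

```ruby
# Four threads mutate one shared Hash; a Mutex serialises the updates.
# Without the lock, the read-modify-write in `+= 1` could interleave
# across threads and lose increments.
counter = Hash.new(0)
lock = Mutex.new

threads = 4.times.map do
  Thread.new do
    1_000.times { lock.synchronize { counter[:nodes] += 1 } }
  end
end
threads.each(&:join)

counter[:nodes] # => 4000
```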

For me personally, that's what's so good about multi-threading. It's the best tool for my job.


If you're considering Crystal over Ruby, chances are that speed is part of your reasoning.

And if speed is important to you, then you want to use all 4/8/16 cores of your machine, not just one.


Any web application is going to be almost entirely network and database-bound, not CPU-bound. CPU-bound tasks are stuff like scientific computing, video encoding, the Ruby interpreter and Crystal compiler themselves, etc.

The advantage of multithreading in a web application is not "using all cores" but rather that the application can spawn a thread to "asynchronously" wait for a response from the network or database while it continues doing other work in the main execution thread, instead of "synchronously" blocking all work while it waits.

In other words, web applications leverage concurrency with their use of multithreading, not parallelism.

I will caveat that having more cores can give concurrent applications a small performance improvement: even though each thread only uses a CPU core in short bursts, at around 0 to 15% of capacity, switching between two threads on the same core is a context switch, which is a fairly expensive operation. More cores means fewer context switches.

But, unlike with parallelism, making use of more cores is not the reason concurrency offers performance benefits. ~99% of the performance benefit of concurrency is realized even on a single-core machine, whereas parallelism by definition requires multiple cores to offer a speedup. Concurrent servers handle far more simultaneous connections than they have cores: Apache typically spawns hundreds of threads or processes per CPU core, and Nginx multiplexes thousands of connections onto a single event-loop thread per core.
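The single-core concurrency win described above is easy to demonstrate in Ruby (a sketch, with `sleep` standing in for a hypothetical blocking database or network call):

```ruby
# `sleep` stands in for a blocking network/database call.
def slow_io
  sleep 0.1
end

t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
threads = 4.times.map { Thread.new { slow_io } }
threads.each(&:join)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0

# elapsed is ~0.1s, not ~0.4s: the four waits overlap,
# and none of this requires more than one CPU core.
```

The four 0.1-second waits overlap in time, so total wall time stays near a single wait even under MRI's GIL on one core.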


While this argument is valid and fine, I think it has mostly played out. Previous language design choices that consciously ignored multithreading were made in the mid-late 90s when the future of consumer SMP was unclear. Now it is clear that multi-core is and will be an important component in everything -- we have quad-core phones, dual-core cameras, etc.

While an application may be fine without good/native support for parallelism in most cases, it's really nice to be using a language that allows you to bolt it on easily when it turns out that you _do_ need it. At this point, it's a language feature that's not worth leaving on the table without compelling rationale.

Arguably, the class of high-level dynamic languages that were on the rise in the mid-00s (Python, Ruby) lost a lot of steam to newcomers like Swift and Go due to the language designers' hesitation around embracing parallelism. Crystal should not repeat that mistake.


This is exactly the reason I'm concerned about investing in Crystal (and OCaml, where multicore is also in an awkward state). I need to be confident that when I need parallelism, the language supports it (good threading primitives are essential too, not just raw pthreads).


We really, really want to support parallelism, and we've had some basic successes so far. We're very confident we want this, and we're confident we can come up with a performant solution: despite being a different language, our parallelism solution will likely end up close to Go's, which has largely solved this problem already.

We're not there yet but I'm confident the main barrier is time and manpower to implement it, not willpower or technical reasons.


I'm not saying that Crystal or any language shouldn't support multithreading or parallelism, but that web applications (that is, almost all existing commercial Ruby applications) don't employ and wouldn't benefit much from parallelism. The bottleneck is the network itself: no number of extra CPU cores can make a remote database or client respond faster, so the time a web application spends waiting on the network cannot be meaningfully reduced by parallelism. (Those same characteristics, though, make concurrency a massive performance win.)

>the class of high-level dynamic languages that were on the rise in the mid-00s (Python, Ruby) lost a lot of steam to newcomers like Swift and Go due to the language designers' hesitation around embracing parallelism.

Quite the opposite: Python has a dominant and rapidly growing market share (as the developer-facing API wrapping Fortran/C/C++ libraries) in CPU-bound, massively parallel applications (NumPy, TensorFlow, etc.).

Swift and Go are odd examples to make your point: Swift's support even for concurrency[1] is a low-level, unergonomic Objective-C library, and for parallelism (or for implementing other concurrency primitives) there is only an even lower-level, paper-thin Objective-C wrapper around pthreads.[2] That's still better than Swift on Linux, where the only option for either is plain pthreads.

Go has excellent concurrency support: goroutines are lightweight green threads (fibers) that the runtime scheduler multiplexes onto a pool of OS threads, so they can also run in parallel when GOMAXPROCS > 1. But for structured multi-core parallelism, its standard library offers little beyond low-level synchronization primitives.[3]
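Ruby's own Fiber class gives a minimal picture of the green-thread idea mentioned above (a sketch of the cooperative model, not of how Go's scheduler works internally): a fiber runs only when resumed, yields control explicitly, and never runs in parallel with its caller.

```ruby
# A fiber and its caller take turns on one thread; control passes
# only at explicit resume/yield points.
log = []
f = Fiber.new do
  log << "fiber: step 1"
  Fiber.yield            # hand control back to the caller
  log << "fiber: step 2"
end

f.resume                 # runs the fiber until Fiber.yield
log << "caller: between"
f.resume                 # runs the fiber to completion

log # => ["fiber: step 1", "caller: between", "fiber: step 2"]
```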

Even C++11 offers a substantially higher-level and more ergonomic parallel-threading API than either Swift or Go.[4] Among recent, relatively mainstream, general-purpose languages, the examples that come to my mind as exemplifying strong parallelism support are Rust, Julia and recent versions of Java (Streams) and C# (TPL, PLINQ).

[1] https://developer.apple.com/documentation/dispatch

[2] https://developer.apple.com/documentation/foundation/thread

[3] https://golang.org/pkg/sync

[4] http://en.cppreference.com/w/cpp/thread


That the API support is awkward is not so important as that the functionality is at least accessible. Someone can always soup up a lackluster API (or, more likely, find a third-party library that has already done so for them), but elements of the core language implementation like the GIL are not so easily hand-waved away, as anyone who has followed Python's GIL saga knows.
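The GIL's effect is easy to observe in MRI Ruby: CPU-bound work gains nothing from threads, because only one thread executes Ruby bytecode at a time. A rough timing sketch (iteration count chosen arbitrarily):

```ruby
# Pure CPU work: no I/O, so MRI's GIL never has a reason to
# overlap the two threads.
def busy
  x = 0
  5_000_000.times { x += 1 }
  x
end

t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
2.times { busy }
serial = Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0

t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
[Thread.new { busy }, Thread.new { busy }].each(&:join)
threaded = Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0

# Under MRI, `threaded` comes out roughly equal to `serial`:
# two threads of pure CPU work offer no speedup.
```

On a runtime without a GIL (or with true parallel threads), `threaded` would instead approach half of `serial`, which is exactly the gap being discussed.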

Like I said, your argument is totally valid, and I don't contest it. It's not that most applications really need native threads, or that people are choosing the best alternatives for threaded applications. It's just that people don't want to use a language that will obstruct access to OS-native threading should they decide they need it, partially because it's hard to accurately anticipate all of an application's needs ahead of time.


FWIW: Crystal's `HTTP::Server` can already use `SO_REUSEPORT`, allowing multiple web server processes to run on a multi-core machine. This is obviously only useful for web servers and is different from parallel processing.
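For illustration, here is what `SO_REUSEPORT` looks like at the raw socket level in Ruby (a Linux-oriented sketch, not Crystal's `HTTP::Server` API): two listeners bind the same port, and the kernel load-balances incoming connections between them.

```ruby
require "socket"

# Build a TCP listener with SO_REUSEPORT set; the option must be
# enabled before bind for a second listener on the same port to succeed.
def reuseport_listener(port)
  s = Socket.new(:INET, :STREAM)
  s.setsockopt(:SOCKET, :REUSEPORT, true)
  s.bind(Addrinfo.tcp("127.0.0.1", port))
  s.listen(16)
  s
end

first = reuseport_listener(0)            # port 0: let the OS pick a free port
port = first.local_address.ip_port
second = reuseport_listener(port)        # same port, no EADDRINUSE
```

Each listener would typically live in its own process, which is how this gives a multi-process server use of every core without any shared-memory parallelism.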



