In 2021, Go is supposed to have GC pauses on the order of a few milliseconds at worst, not 40 ms. So it is indeed surprising; something seems to be broken there. I'm wondering whether this is a limitation caused by forcing single-core operation; the runtime might not be designed for that.
EDIT: Someone else noted (https://news.ycombinator.com/item?id=27085507) a discussion on Reddit where <5ms latency was achieved in 99.9% of cases, so perhaps this is indeed a subpar result.
When you allocate against a running GC, it will penalize you for it (literally putting your thread to sleep), hence the garbage-collection tail latency. The solution is to not allocate at all, which is what the gogo library strives for.
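For context, the penalty described above is the GC's "mark assist" mechanism: a goroutine that allocates while a collection is in progress gets drafted into marking work, so allocation-heavy hot paths absorb the latency. A minimal sketch of the allocate-less pattern, using the standard library's sync.Pool rather than the gogo library's actual API (which this sketch does not reproduce):

    package hotpath

    import (
        "bytes"
        "sync"
    )

    // bufPool recycles marshal buffers so steady-state encoding
    // produces no new garbage for the collector to chase.
    var bufPool = sync.Pool{
        // New runs only when the pool is empty.
        New: func() interface{} { return new(bytes.Buffer) },
    }

    // encode writes a payload into a pooled buffer instead of
    // allocating a fresh one on every call.
    func encode(payload []byte) *bytes.Buffer {
        buf := bufPool.Get().(*bytes.Buffer)
        buf.Reset()
        buf.Write(payload)
        return buf
    }

    // release hands the buffer back once the caller is done with it.
    func release(buf *bytes.Buffer) {
        bufPool.Put(buf)
    }

The trade-off is retained memory: pooled buffers keep their capacity between uses, which is exactly what lets the hot path stop triggering assists.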
Most GC phases can be done in parallel, so allocation should stop only for a very short time. For comparison, Java's low-latency collector ZGC (which optimizes for latency but may reduce throughput) can achieve worst-case <1ms pause times; at that point the OS scheduler costs you more than the GC does.
I'm talking about the Go garbage collector specifically, which does mark-and-sweep and deliberately slows down functions doing frequent allocations in order to catch up. This is different from a stop-the-world kind of scenario, e.g. Shenandoah.
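If you want to check which effect dominates, GODEBUG=gctrace=1 prints a line per cycle whose CPU breakdown includes mark-assist time, and the runtime also records recent stop-the-world pauses. A small sketch of reading the latter (the allocation loop is just there to generate garbage):

    package main

    import (
        "fmt"
        "runtime"
    )

    var sink []byte // package-level so the allocations below escape to the heap

    func main() {
        // Churn roughly 1 GB of garbage so several collections run.
        for i := 0; i < 1_000_000; i++ {
            sink = make([]byte, 1024)
        }
        var m runtime.MemStats
        runtime.ReadMemStats(&m)
        // PauseNs is a circular buffer of the last 256 stop-the-world
        // pause times; the most recent sits at (NumGC+255)%256.
        fmt.Printf("GC cycles: %d, last STW pause: %d ns\n",
            m.NumGC, m.PauseNs[(m.NumGC+255)%256])
    }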
IIRC, the official protobuf module for Go still uses reflection in the generated code, as opposed to fully generating the encoding and decoding code, so maybe that creates additional garbage, performance issues, or lock contention. I think I remember there being an alternative module that fully generates the code, and it would be interesting to see that in the table as well.
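If anyone wants to measure this, here is a minimal self-contained sketch against the official module; it uses a well-known type so no .proto codegen step is needed, and the message contents are made up. The allocs/op figure from the benchmark is the "additional garbage" in question:

    // Save as marshal_test.go and run: go test -bench . -benchmem
    package main

    import (
        "testing"

        "google.golang.org/protobuf/proto"
        "google.golang.org/protobuf/types/known/structpb"
    )

    func BenchmarkProtoMarshal(b *testing.B) {
        // structpb ships pre-generated with the module; the fields
        // here are illustrative, not from any real benchmark schema.
        msg, err := structpb.NewStruct(map[string]interface{}{
            "id":   42.0,
            "name": "event",
        })
        if err != nil {
            b.Fatal(err)
        }
        b.ReportAllocs()
        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            if _, err := proto.Marshal(msg); err != nil {
                b.Fatal(err)
            }
        }
    }

Swapping in a message generated by the fully-code-generating alternative module would make the side-by-side comparison.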
You would? I wouldn't. It definitely looks like a somewhat pathological case to me, at least in 2021. Maybe five years earlier the number would have been appropriate, but there seems to be something wrong with Go slowing down this much on that small a heap. I'm wondering if it was tuned at all.
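On the tuning question: in 2021 the Go GC has essentially one knob, GOGC, settable as an environment variable or at runtime. A minimal sketch, assuming the benchmark could simply trade memory for fewer collections:

    package main

    import (
        "fmt"
        "runtime/debug"
    )

    func main() {
        // Equivalent to running with GOGC=400: let the heap grow to 4x
        // the live set between collections, so the GC runs less often.
        old := debug.SetGCPercent(400)
        fmt.Println("previous GOGC value:", old)
    }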