Nican · 2025-11-18T08:28:25 1763454505

This is a nice article, and I appreciate the examples. The next problem to solve is how to persist state on disk across two different accounts after a transfer has been done.

Nican · 2025-07-31T18:38:03 1753987083

Microsoft's blog post on Hyperlight got my attention a while ago: https://opensource.microsoft.com/blog/2025/02/11/hyperlight-...

I am way out of my depth here, but can anyone make a comparison with the "micro virtual machines" concept?

eyberg · 2025-07-31T18:57:44 1753988264

microvms as espoused by things like firecracker offer full machines but have tradeoffs like no gpu (which makes it boot faster)

hyperlight shaves way more off - (eg: no access to various devices that you'd find via qemu or firecracker) it does make use of virtualization but it doesn't try to have a full blown machine so it's better for things like embedding simple functions - I actually think it's an interesting concept but it is very different than what firecracker is doing

Nican · 2025-07-30T01:10:18 1753837818

FoundationDB has been growing on me as my favorite database lately, even though it is only a key-value store.

Out of curiosity: what are the scale limits of FoundationDB? What kind of issues would it start to have? For example, being able to store all of Discord messages on it?

I see blog posts of Discord moving to Scylla and ElasticSearch, but I wonder if there would be any difficulties here.

hardwaresofton · 2025-07-30T03:40:30 1753846830

Note that FDB can support other paradigms on top of KV

https://foundationdb.github.io/fdb-record-layer/SQL_Referenc...

Also IIRC Apple uses FDB at tremendous scale:

https://read.engineerscodex.com/p/how-apple-built-icloud-to-...

eatonphil · 2025-07-30T12:29:14 1753878554

There are a lot of strict limits so AFAIK everyone uses FoundationDB for fast, consistent, highly-available metadata while doing replication/storage of actual data elsewhere (such as in S3).

https://apple.github.io/foundationdb/known-limitations.html

That is to say it's more like part of your solution and not the entire stack on its own.

Nican · 2025-07-30T17:05:00 1753895100

Yeah. I read over that page, and everything seems sensible.

The only part that is not well explained is that "FoundationDB has been tested with databases up to 100 TB".

eatonphil · 2025-07-30T17:08:15 1753895295

In their data modelling pages they mention you should break up rows into separate keys per column. Or separate keys per field in a document. This is indeed how many databases model rows on a distributed kv store. So this might be how they achieved 100TB.

However you still have the issue of any single key-value needing to be in their limit. (But it's not like people typically store enormous blobs in Postgres or MySQL either I think?)
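The per-column key layout described above can be sketched roughly as follows. This is only an illustration: the `store` dict stands in for an ordered key-value store, and `write_row` / `read_row` are hypothetical helper names, not part of any FoundationDB API.

```python
from typing import Any

# Stand-in for an ordered key-value store (assumption: a real client
# library would replace this plain dict).
store: dict[tuple, Any] = {}

def write_row(table: str, row_id: int, row: dict[str, Any]) -> None:
    """Store each column of a row under its own (table, row_id, column) key."""
    for column, value in row.items():
        store[(table, row_id, column)] = value

def read_row(table: str, row_id: int) -> dict[str, Any]:
    """Rebuild a row by collecting every key that shares the (table, row_id) prefix."""
    prefix = (table, row_id)
    return {
        key[2]: value
        for key, value in sorted(store.items())
        if key[:2] == prefix
    }

write_row("messages", 1, {"author": "alice", "body": "hello"})
write_row("messages", 2, {"author": "bob", "body": "hi"})
row = read_row("messages", 1)
```

Each stored value stays small because a row is spread over several keys, which is the point of the layout; a single oversized field would still hit the per-value limit.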

Dave_Rosenthal · 2025-07-31T00:52:16 1753923136

The documentation is woefully out of date, sadly. Despite the code being in active development no one is touching the public docs. Though I don’t know for sure, that limitation was probably written something like 10 years ago.

heipei · 2025-07-30T11:43:53 1753875833

ScyllaDB discontinued its free and open source version, so I personally wouldn't build anything new on it.

Nican · on Dec 1, 2024

Transactional consistency / ACID guarantees.

Before you execute the query, you should be able to query any of the data, and after you execute the query, none of the data should be available. The mechanisms to make a transactional database are tricky.

Some databases, like CockroachDB, provide some built-in TTL capabilities.

But also, if you are having to delete huge ranges of data and do not care about consistency, you are probably looking at an analytical workload, and there would be better databases suited for that, like ClickHouse.

tremon · on Dec 1, 2024

> none of the data should be available

As written, that's not required. The data should not be retrievable by query, but ACID only specifies what happens at the client-server boundary, not what happens at the server-storage boundary (Durability prescribes that the server must persist the data, but not how to persist it). A database that implements DELETEs by only tombstoning the row and doesn't discard the data until the next re-index or vacuum operation would still be ACID-compliant.
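A minimal sketch of the tombstone idea, assuming a toy in-memory store (the `TombstoneStore` class and its method names are made up for illustration and do not mirror any real database):

```python
class TombstoneStore:
    """Toy store where DELETE only marks a row; vacuum reclaims the space."""

    def __init__(self) -> None:
        self._rows: dict[str, object] = {}   # physical storage
        self._tombstones: set[str] = set()   # keys that are logically deleted

    def put(self, key: str, value: object) -> None:
        self._tombstones.discard(key)
        self._rows[key] = value

    def delete(self, key: str) -> None:
        # Logical delete: the value stays in physical storage for now.
        if key in self._rows:
            self._tombstones.add(key)

    def get(self, key: str):
        # Queries never see tombstoned rows.
        if key in self._tombstones:
            return None
        return self._rows.get(key)

    def physical_size(self) -> int:
        return len(self._rows)

    def vacuum(self) -> None:
        # Physically discard everything that was tombstoned.
        for key in self._tombstones:
            del self._rows[key]
        self._tombstones.clear()

db = TombstoneStore()
db.put("a", 1)
db.delete("a")
visible_after_delete = db.get("a")
size_before_vacuum = db.physical_size()
db.vacuum()
size_after_vacuum = db.physical_size()
```

From the client's side the row is gone as soon as `delete` returns, even though the bytes are only dropped by `vacuum`.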

Nican · on Sept 3, 2024

Yes. SQL is a form of API, and it carries all the same challenges.

-> Permission control, making sure that the user of API can not see data they are not supposed to.

-> Auditability. Verify that the API is being used correctly.

-> Performance. Do not overload the endpoint. (Read from a read replica? And maybe you are not running hour-long analytics queries on the database)

-> Consistency. Are you using transactions to read your data? Could you be causing contention?

-> API versioning. How do you upgrade the underlying tables without breaking the users?

Nican · on April 14, 2024

This looks like a good use case for ScyllaDB with Compression and TTL. It is pretty simple to set up a single-node instance.

If you would rather have something in-process that writes to disk, to avoid extra infrastructure, I would also recommend RocksDB with Compression and TTL.

Nican · on April 14, 2024

As usual, there is a spectrum of data safety vs. performance. Redis is at the "very fast, but unsafe" side of the scale.

ScyllaDB, for me, sits in the middle: a high-performance key-value store that does not really support transactions. FoundationDB is another one that I would consider.

tyre · on April 14, 2024

Depends on the kind of safety you’re looking for. Redis is entirely safe from concurrency issues because it’s single-threaded. It supports an append-only file for persistence to disk.

Nican · on Feb 27, 2024

I have been out of the loop with Java. Are virtual threads the answer to asynchronous I/O? (Much like Go/C# async/node.js?)

That looks like an interesting solution to support asynchronous I/O without breaking all the APIs, and having the async/await mess that C# created.

CharlieDigital · on Feb 27, 2024

    > ...the async/await mess that C# created

What do you find messy about it? Seems fairly straightforward, IME.

jakewins · on Feb 27, 2024

Not sure what the poster above was thinking of, but it seems kinda the same as every other language that’s adopted it - powerful, but footguns abound. I ran some async C# in a debugger - in Rider - the other month, and the debugger just goes off the deep end.

Does C# have the same issue Python does with accidentally calling blocking code? In async Python - which I mostly quite like actually - it’s terrifying to bring in a library because if you’re unlucky it does some blocking network call deep inside that stalls your whole app..

neonsunset · on Feb 27, 2024

.NET uses threadpool with hill-climbing algorithm. Each worker thread has its own queue and can also perform work stealing. However, if that is not enough and the work queues fill up faster than the items in them get processed, the threadpool will spawn additional threads until the processing lag is counteracted (within reason).

This historically allowed it to take a lot of punishment but also led to many legacy codebases being really shoddily written with regard to async/await, where it would take some time for the threadpool to find the optimal number of threads to run, or devs would just set a really high minimum number of threads, increasing overhead.

In .NET 6, threadpool was rewritten in C# and gained the ability to proactively detect blocked threads outside of hill-climbing algorithms and inject new threads immediately. This made it much more resilient against degenerate patterns like for loop + Task.Run(() => Thread.Sleep(n)). Naturally it is still not ideal - operating system can only manage so many threads, but it is pretty darn foolproof if not the most foolproof amongst all threadpool implementations.

As of today, it is in a really good spot and with tuned hill-climbing and thread blocking detection you would see the .NET processes have thread counts that more or less reflect the nature of the work they are doing (if you don't do any "parallel" work while using async - it will kill most threads, sometimes leaving just two).

CharlieDigital · on Feb 27, 2024

    > Does C# have the same issue Python does with accidentally calling blocking code?

This can't really happen in C# except maybe if you are working on a GUI thread and you make the mistake of running blocking code on a GUI thread.

For APIs, console apps, and such, it's not a concern for actual parallel-concurrent code. Of course, if you write a standard non-parallel `for` loop that has blocking code, it's going to block in a console app as well if you don't run it on a thread.

But I think that once you do enough JS/TS or C#, `async/await` doesn't feel very onerous if you have the basic concept.

Kwpolska · on Feb 27, 2024

Python's async works on a single thread, C# uses a thread pool. Calling a blocking method is not ideal, but doesn't ruin everything, and it's easy to hand that work off to a separate thread by using Task.Run.
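On the Python side, the closest equivalent of handing work off with Task.Run is probably `asyncio.to_thread` (Python 3.9+). A small sketch, where `blocking_call` and `heartbeat` are made-up names and `time.sleep` stands in for a library doing a blocking network call:

```python
import asyncio
import time

def blocking_call(seconds: float) -> str:
    # Stand-in for a library function that blocks the calling thread.
    time.sleep(seconds)
    return "done"

async def heartbeat(ticks: list[float], interval: float, count: int) -> None:
    # Records a timestamp on every tick; it can only run while the loop is free.
    for _ in range(count):
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)

async def main() -> tuple[str, int]:
    ticks: list[float] = []
    hb = asyncio.create_task(heartbeat(ticks, 0.01, 5))
    # Run the blocking function in a worker thread so the event loop stays responsive.
    result = await asyncio.to_thread(blocking_call, 0.1)
    ticks_during_call = len(ticks)
    await hb
    return result, ticks_during_call

result, ticks_during_call = asyncio.run(main())
```

Calling `blocking_call(0.1)` directly inside `main` would instead freeze the heartbeat for the whole 0.1 seconds, which is the stall described above.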

John23832 · on Feb 27, 2024

Agreed. The task system in C# is pretty clean imo. Same with Rust (sans the type system implementation of Futures) and Go's goroutines. Especially compared to CompletableFuture in Java.

hocuspocus · on Feb 27, 2024

Specifically, function coloring (C# and Rust in your examples) is not the same as coroutines in Go or virtual threads in Java.

John23832 · on Feb 27, 2024

Sure function coloring can be a problem, but the gp just spoke about async/await being a mess.

Function coloring can be handled by just blocking on an async function. Though the reverse takes some planning.
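As a rough illustration of "just blocking on an async function", here is how it looks in Python, used only as an example language (`fetch_value` and `sync_caller` are hypothetical names):

```python
import asyncio

async def fetch_value() -> int:
    # Placeholder for real asynchronous work.
    await asyncio.sleep(0)
    return 42

def sync_caller() -> int:
    # Blocks the calling thread until the coroutine has finished.
    return asyncio.run(fetch_value())

value = sync_caller()
```

Note that `asyncio.run` raises an error if it is called from a thread that already has a running event loop, so this only works from genuinely synchronous code.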

Nican · on Feb 18, 2024

> the SQL syntax for selecting a few records is much more verbose than head -n or tail -n

I use DBeaver to inspect SQLite files, and to also work with Postgres databases.
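For reference, a rough SQL counterpart of `head -n 3` and `tail -n 3`, sketched with Python's built-in `sqlite3` module against a made-up `logs` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (line TEXT)")
conn.executemany(
    "INSERT INTO logs (line) VALUES (?)",
    [(f"line {i}",) for i in range(1, 11)],
)

# Roughly `head -n 3`: the first three rows in insertion order.
head = [r[0] for r in conn.execute("SELECT line FROM logs ORDER BY rowid LIMIT 3")]

# Roughly `tail -n 3`: take the last three rows, then restore their original order.
tail = [
    r[0]
    for r in conn.execute(
        "SELECT line FROM ("
        "  SELECT rowid AS rid, line FROM logs ORDER BY rowid DESC LIMIT 3"
        ") ORDER BY rid"
    )
]
```

The `tail` query is where the verbosity shows: it needs a subquery just to get the rows back in their original order.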

I kind of miss MySQL Workbench, but MySQL is pretty dead to me. And SQL Server Management Studio is a relic that keeps being updated.

I also sometimes make dashboards from SQLite files using Grafana, but the time functions for SQLite are pretty bad.

callamdelaney · on Feb 19, 2024

Could you expand on why the time functions in sqlite are pretty bad?

Nican · on Jan 2, 2024

I am happy using CockroachDB. The performance is not as good, since all your database writes require a 2 out of 3 quorum. But managing CockroachDB is pretty simple, since it can perform a rolling upgrade with no downtime.
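The 2-out-of-3 rule is just a majority quorum. A tiny sketch of the arithmetic (the function names are made up, and this says nothing about how CockroachDB actually implements replication):

```python
def quorum_size(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1

def write_commits(acks: list[bool]) -> bool:
    """A write commits once a majority of replicas have acknowledged it."""
    return sum(acks) >= quorum_size(len(acks))

one_node_down = write_commits([True, True, False])
two_nodes_down = write_commits([True, False, False])
```

With three replicas, one node can be restarted during a rolling upgrade and writes still reach quorum, but every write has to wait for the second acknowledgement.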

Upgrades are handled with an operator, and happen by waiting for all queries to finish, draining all connections, and restarting the pod with the newer version. The application can connect to any pod without any difference.

I perform upgrades twice a year, never really worried about it, and never had any availability problems with the database, even when GCP decides to restart the nodes to update the underlying k8s version.