A Samsung 990 Pro reads at something like 50 Gbps and PCIe 4.0 x4 is quite a bit faster than that. You can get this speed with a queue depth that isn’t crazy, and you can have multiple NVMe operations in flight reading the same large Parquet file. Latency is in the tens of microseconds.
The consensus seems to be that S3 can read one object at somewhat under 1 Gbps. You can probably scale that to the full speed of your NIC by reading multiple objects at once, but you may not be able to scale by reading one object in multiple overlapping ranges. Latency is in the milliseconds.
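To make "reading multiple objects at once" concrete, here is a minimal sketch of the fan-out pattern. It is not tied to any particular S3 client: `fetch_object` is a hypothetical callable you would supply (something that takes a key and returns the object's bytes), and `max_workers=64` is an arbitrary starting point, not a tuned value.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def read_objects_in_parallel(
    keys: Iterable[str],
    fetch_object: Callable[[str], bytes],  # hypothetical: caller supplies the real client call
    max_workers: int = 64,                 # assumption: arbitrary, tune for your NIC and workload
) -> Dict[str, bytes]:
    """Fetch many objects concurrently, at most one in-flight request per worker thread."""
    key_list = list(keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs each key with its body.
        bodies = pool.map(fetch_object, key_list)
        return dict(zip(key_list, bodies))
```

Threads are enough here because the work is network-bound. Aggregate throughput then depends on how many requests you keep in flight, not on how fast any single one is.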
So, sure, a fast EC2 instance with massive multi-object parallelism can have 10x higher bandwidth than an NVMe device, but the amount of parallelism and latency tolerance needed is a couple orders of magnitude higher than with NVMe. Meanwhile that NVMe device does not charge for read operations and costs a couple hundred dollars, once.
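A back-of-the-envelope version of that comparison, as a rough sketch. The 50 Gbps and roughly 1 Gbps per object figures are the ones quoted above. The specific latencies (50 microseconds for NVMe, 20 milliseconds for S3) are my own illustrative picks within "tens of microseconds" and "milliseconds", not measurements.

```python
import math


def streams_needed(target_gbps: float, per_stream_gbps: float) -> int:
    """Number of concurrent streams needed to reach a target aggregate bandwidth."""
    return math.ceil(target_gbps / per_stream_gbps)


def bytes_in_flight(bandwidth_gbps: float, latency_s: float) -> float:
    """Bandwidth-delay product: data that must be outstanding to keep the pipe full."""
    return bandwidth_gbps * 1e9 / 8 * latency_s


NVME_GBPS = 50.0          # from the comment above
S3_PER_OBJECT_GBPS = 1.0  # "somewhat under 1 Gbps", rounded up for simplicity
NVME_LATENCY_S = 50e-6    # assumption: 50 microseconds
S3_LATENCY_S = 20e-3      # assumption: 20 milliseconds

if __name__ == "__main__":
    print("S3 objects in flight to match one NVMe drive:",
          streams_needed(NVME_GBPS, S3_PER_OBJECT_GBPS))
    nvme = bytes_in_flight(NVME_GBPS, NVME_LATENCY_S)
    s3 = bytes_in_flight(NVME_GBPS, S3_LATENCY_S)
    print(f"NVMe bytes in flight at 50 Gbps: {nvme / 1e3:.0f} KB")
    print(f"S3 bytes in flight at 50 Gbps:   {s3 / 1e6:.0f} MB")
    print(f"Ratio: about {s3 / nvme:.0f}x")
```

With these assumed latencies, matching a single drive takes about 50 concurrent object reads and a few hundred times more data in flight, which is where the "couple orders of magnitude" comes from. Different latency assumptions shift the ratio but not the conclusion.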
If you are so inclined, you can build an NVMe-oF setup (at much, much higher cost) that separates compute and storage and has excellent performance, but this is a nontrivial undertaking.