Hacker News
ayhanfuat
33 days ago
on:
650GB of Data (Delta Lake on S3). Polars vs. DuckD...
What is the point of simulating 650GB of data with ~40 columns if you are going to use a single column for testing? Is that even 16GB?
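A rough back-of-the-envelope check of that 16GB figure, assuming (this is not stated in the thread) that the ~40 columns are all about the same size on disk:

```python
# Rough estimate of how much data a single-column query touches.
# Assumption: every column takes roughly the same space on disk,
# which real datasets (mixed types, compression) will not match exactly.
total_gb = 650
num_columns = 40  # "~40 columns" per the comment above

per_column_gb = total_gb / num_columns
print(f"~{per_column_gb:.2f} GB per column")
```

Under that assumption a single column comes to roughly 16GB, which is where the figure in the question comes from.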
KeplerBoy
33 days ago
It's a strided array and slows down memory access.
ayhanfuat
33 days ago
It's a Parquet file. Column data is stored in contiguous pages (and that's how DuckDB and Polars read them).
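To illustrate the point, here is a minimal sketch of a toy columnar layout using only the Python standard library. It is not the real Parquet format (which adds row groups, pages, encodings, compression, and footer metadata), and `write_columnar` and `read_column` are hypothetical helpers made up for this example. It only shows that when each column is stored contiguously, reading one column means seeking to its byte range and skipping the rest:

```python
import io
import struct

def write_columnar(buf, columns):
    """Write each column as one contiguous block of int64 values.

    Returns an index {name: (byte_offset, value_count)}, playing the
    role that footer metadata plays in a real columnar file.
    """
    index = {}
    for name, values in columns.items():
        index[name] = (buf.tell(), len(values))
        buf.write(struct.pack(f"<{len(values)}q", *values))
    return index

def read_column(buf, index, name):
    """Seek to one column's block and read only that column's bytes."""
    offset, count = index[name]
    buf.seek(offset)
    data = buf.read(count * 8)  # 8 bytes per int64 value
    return list(struct.unpack(f"<{count}q", data)), len(data)

# 40 columns of 1000 values each, mirroring the "~40 columns" in the thread.
columns = {f"col{i}": list(range(i, i + 1000)) for i in range(40)}

buf = io.BytesIO()
index = write_columnar(buf, columns)
total_bytes = buf.tell()

values, bytes_read = read_column(buf, index, "col7")
print(f"read {bytes_read} of {total_bytes} bytes")
```

In this toy layout the single-column read touches 1/40 of the file, with no strided access across the other columns.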
KeplerBoy
33 days ago
Okay, wasn't aware of that.