My gaming PC sits next to the TV in my living room and I use it like a console. I have one of those cheap Bluetooth wireless keyboards with a trackpad for the really basic interactions, and then I just use a game controller for playing games.
Windows 11 has been fine for me, I don't interact with it much other than seeing it for a bit when launching games.
I honestly wouldn't mind giving Linux a go; the only downside is that I made the mistake of buying an Nvidia graphics card. I'm not sure how much of a pain it is these days, but last time I tried it was a bit of a nightmare - the general wisdom at the time was to go with an AMD card.
Nvidia's Linux software is first rate -- in fact, a large amount of the software that would merit buying an Nvidia graphics card is Linux-only anyway. I briefly had an AMD card but ended up giving it away since it didn't support ~any of the projects I needed to work on. But YMMV, my anecdata is from an ML engineering perspective.
I can confirm your anecdote, based on messing with ML on a linux system in my personal time over the last few years. I don't do any work in ML, but I have never heard of anyone doing anything with ML on Windows other than maybe running some models locally.
Though I will say that in the past I had a Linux gaming computer which ran into issues with the Nvidia drivers anytime I decided to upgrade the distro (I was using Kubuntu at the time).
Not only has Nvidia Linux support been first rate for decades now, but their FreeBSD support is also great. The secret has been that they run the same driver on all platforms with just a shim to interface with the different kernels.
There's something quite pleasing about writing a message and living, at least, with the thought of it causing some physical action (printing) in the real world. I mean, for all we know Andrew probably ran out of printer paper hours ago so the message has gone into the ether, but it's nice to think it happened at least!
> Workers download, decompress, and materialize their shards into DuckDB databases built from Parquet files.
I'm interested to know whether the 5s query time includes this materialization step of downloading the files etc., or whether this result is from workers that have been "pre-warmed". Also, is the data in DuckDB in memory or on disk?
Hi djhworld. The 5s does not include the download/materialization step. That part takes the worker about 1 to 2 minutes for this data set. I didn't know that this was going on HackerNews or would be this popular - I will try to get more solid stats on that part, and update the blog accordingly.
You can have GizmoEdge reference cloud (remote) data as well, but of course that would be slower than what I did for the challenge here...
The data is on disk - on locally mounted NVMe on each worker - in the form of a DuckDB database file (once the worker has converted it from parquet). I originally kept the data in parquet, but the duckdb format was about 10 to 15% faster - and since I was trying to squeeze every drop of performance - I went ahead and did that...
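For anyone curious, here's a minimal sketch of what that parquet-to-DuckDB materialization could look like with the duckdb Python API (the paths and table name are just placeholders, not GizmoEdge's actual code):

```python
import duckdb

# Build a local DuckDB database file from a worker's parquet shard.
# File paths and table name are illustrative only.
con = duckdb.connect("/nvme/shard_0001.duckdb")
con.execute("""
    CREATE TABLE lineitem AS
    SELECT * FROM read_parquet('/nvme/shard_0001/*.parquet')
""")
con.close()

# Later queries then hit the DuckDB file directly instead of re-scanning parquet.
con = duckdb.connect("/nvme/shard_0001.duckdb", read_only=True)
print(con.execute("SELECT count(*) FROM lineitem").fetchone())
```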
Thanks for the questions.
GizmoEdge is not production yet - this was just to demonstrate the art of the possible. I wanted to divide-and-conquer a huge dataset with a lot of power...
I've since learned (from a DuckDB blog) that DuckDB seems to do better on the XFS filesystem. I used ext4 for this, so I may be able to get another 10 to 15% (maybe!).
I use traefik in my home network as the main reverse proxy.
I don't use any of the dynamic features though, like labels on Docker containers etc.; all of it is configured using the static configuration. It's been working well, but I don't really think about it.
I watched the video and enjoyed it. I think the most interesting part to me was running the distributed Llama.cpp; Jeff mentioned it seems to work in a linear fashion where processing hops between nodes.
Which got me thinking about how these frontier AI models work when you (as a user) run a query. Does your query just go to one big box with lots of GPUs attached and run in a similar way, but much faster? Do these AI companies write about how their infra works?
ServeTheHome has a few videos covering AI servers and interconnects.
And yes, they basically have 1 Tbps+ interconnects and throw tens or hundreds of GPUs at queries. Nvidia was wise to invest so much in their networking side—they have massive bandwidth between machines and shared memory, so they can run massive models with tons of cards, with minimal latency.
It's still not as good as tons of GPUs attached to tons of memory on _one_ machine, but it's better than the 10, 25, or 40 Gbps networking that most small homelabs would run.
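To make the "hopping" intuition concrete, here's a toy sketch (not llama.cpp's actual implementation) of pipeline-style splitting, where each node holds a slice of the layers and the activation has to travel node-to-node for every token - which is exactly why interconnect bandwidth and latency matter so much:

```python
import numpy as np

# Toy "model": 8 dense layers. Node A holds layers 0-3, node B holds layers 4-7.
rng = np.random.default_rng(0)
layers = [rng.standard_normal((64, 64)) * 0.1 for _ in range(8)]
node_a, node_b = layers[:4], layers[4:]

def run_slice(weights, activation):
    # Each node runs only the layers it holds.
    for w in weights:
        activation = np.tanh(activation @ w)
    return activation

x = rng.standard_normal(64)
x = run_slice(node_a, x)   # compute on node A
# ... in a real cluster the activation gets shipped over the network here ...
x = run_slice(node_b, x)   # node B continues where A left off
print(x[:4])
```

Tensor parallelism instead splits each individual layer across GPUs and has to synchronize within every layer, which needs far more bandwidth per token - hence the 1 Tbps+ interconnects inside those big boxes.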
I really wanted to like the vim-beancount plugin but it's just too buggy for me, so I've always just come crawling back to beancount-mode in emacs. It's the only thing I use emacs for and I use evil mode for vim keybindings :)
Over the past year or two I've just been paying for the API access and using open source frontends like LibreChat to access these models.
This has been working great for the occasional use, I'd probably top up my account by $10 every few months. I figured the amount of tokens I use is vastly smaller than the packaged plans so it made sense to go with the cheaper, pay-as-you-go approach.
But since I've started dabbling in tooling like Claude Code, hoo-boy those tokens burn _fast_, like really fast. Yesterday I somehow burned through $5 of tokens in the space of about 15 minutes. I mean, sure, the Code tool is vastly different to asking an LLM about a certain topic, but I wasn't expecting such a huge leap. A lot of the token usage is masked from you, I guess, wrapped up in the ever-increasing context plus back-and-forth tool orchestration, but still.
$20.00 via Deepseek's API (yes, China can have my code, idc) has lasted me almost a year. It's slow, but better quality output than any of the independently hosted Deepseek models (ime). I don't really use agents or anything tho.
Agreed, I'm still trying to use up my first $5 on Deepseek. The best thing is that the off-peak rate falls during the US work day and is only 55 cents per million tokens. Great for use with agents cuz you never have to worry about cost or throttling.
Everyone complains about the prices of other models but there are much cheaper alternatives out there and DS is no slouch either.
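For a rough sense of scale, at that quoted off-peak rate the budget goes a long way (simple back-of-envelope, ignoring any input/output price split):

```python
# Back-of-envelope using the off-peak figure mentioned above.
rate_per_million_tokens = 0.55   # USD per million tokens
budget = 5.00                    # USD
print(f"${budget:.2f} buys roughly {budget / rate_per_million_tokens:.1f} million tokens")
# -> about 9 million tokens
```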
I've been tempted to use NixOS for my self hosted setup but I just can't bring myself to do it.
My setup is quite simple, it's just a few VMs with one docker compose file for each. I have an ansible playbook that copies the docker compose files across and that's it. There's really nothing more to it than that, and maintenance is just upgrading the OS (Fedora Server) once the version reaches EOL. I tend to stay 1 version behind the release cycle, so I upgrade whenever that gets bumped.
I do use nix-darwin on my Macs so I do _see_ the value of using a Nix configuration, but I find it difficult to tell whether the effort of porting my setup to Nix is worth it in the long run; configuration files don't get written in a short time. Maybe LLMs could speed this up, but I just don't have it in me right now to make that leap.
> I've been tempted to use NixOS for my self hosted setup but I just can't bring myself to do it.
I recently tried NixOS and after spending a week trying it out, I switched my home network and 2 production servers to NixOS. It has been running as expected for 3-4 months now and I LOVE it. Migrating the servers was way easier than the workstations. My home server was set up in a few hours.
I also recently bought a jetson orin nano to play and learn on and I set up nixos with jetpack-nixos there too. I know with gentoo this would have been a (much) more painful process.
I have used gentoo for over 20 years and have always felt very much at home. What annoyed me was that the compile times on older computers were simply unbearable. Compiling GHC on my 2019 dell xps just takes 6 hours or something like that.
The big difference for me was that NixOS provides really simple rollbacks if something goes wrong, whereas with Ansible and compose files that's possible, but you have to do it yourself.
But also, if your setup is working for you, I think that's great! It sounds like you have a good system in place.
I've been reading the FAQ on the Stop Killing Games website - they are not arguing for refunds, they're arguing for an EOL plan to be put in place for games:
> No, we are not asking that at all. We are in favor of publishers ending support for a game whenever they choose. What we are asking for is that they implement an end-of-life plan to modify or patch the game so that it can run on customer systems with no further support from the company being necessary. We agree that it is unrealistic to expect companies to support games indefinitely and do not advocate for that in any way.
Modifying the game so that it runs offline is supporting the game. Three years after releasing a game, the devs who made it are usually long gone, busy on other projects, and the libraries and frameworks used are out of date, out of support, etc. At that point, making any change (or even building the project!) is a significant effort.
It's a "significant effort" only because it was made this way. You can design it to be easy hostable. It's just that there is no reason to do so currently. Stop killing games tries to give some legal reason.
With system builds like this I always feel the VRAM is the limiting factor when it comes to what models you can run, and consumer-grade stuff tends to max out at 16GB or (sometimes) 24GB for more expensive models.
It does make me wonder whether we'll start to see more and more computers with a unified memory architecture (like the Mac) - I know Nvidia has the Digits thing, which has been renamed to something else.
That’s what I hope for, but everything with unified memory that isn’t bananas expensive has very low memory bandwidth. DGX (Digits), Framework Desktop, and non-Ultra Macs are all around 128 GB/s, and will produce single-digit tokens per second for larger models: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inferen...
So there’s a fundamental tradeoff between cost, inference speed, and hostable model size for the foreseeable future.
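The single-digit numbers follow from a simple bandwidth bound: for each generated token, the weights have to be streamed from memory roughly once. A quick back-of-envelope (the model size here is an assumption, just for illustration):

```python
# Rough upper bound on decode speed for a unified-memory machine.
bandwidth_gb_per_s = 128   # figure mentioned above for DGX/Framework/non-Ultra Macs
model_size_gb = 40         # assumption: roughly a 70B-parameter model at 4-bit quantization

max_tokens_per_s = bandwidth_gb_per_s / model_size_gb
print(f"~{max_tokens_per_s:.1f} tokens/s upper bound")   # about 3 tokens/s
```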