
Curious what hardware you use? And is any of this runnable on an M1 laptop?


Absolutely: a 7B model will run comfortably on 16GB of RAM and most consumer-level hardware. Some 40B models run on 32GB, but I've found it depends on the model (GGUF quantization helps; fingers crossed).
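
As a rough sanity check (my own back-of-envelope math, not from the thread), you can estimate the RAM a quantized GGUF model needs from its parameter count and bits per weight:

    // Rough RAM estimate for a quantized GGUF model: the weights dominate,
    // at (bits per weight / 8) bytes per parameter, plus some overhead for
    // the KV cache and runtime. The 4.5 bits/weight (roughly a Q4 quant)
    // and 1.5 GB overhead figures are assumptions, not measurements.
    func estimatedRAMGB(paramsInBillions: Double, bitsPerWeight: Double,
                        overheadGB: Double = 1.5) -> Double {
        paramsInBillions * bitsPerWeight / 8.0 + overheadGB
    }

    print(estimatedRAMGB(paramsInBillions: 7,  bitsPerWeight: 4.5))  // ~5.4 GB: fits easily in 16GB
    print(estimatedRAMGB(paramsInBillions: 40, bitsPerWeight: 4.5))  // ~24 GB: needs a 32GB machine

Those numbers line up with the experience above: 7B is comfortable on 16GB, while 40B is tight even on 32GB.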

I originally ran this on an M1 with 32GB; now I run it on an M2 Air with 16GB (and an M2 Mac mini with 32GB) with no problems.

I use llama.cpp with my own SwiftUI interface, all native, no Python/JS/web scripts.
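
For anyone curious what "native, no scripts" might look like, here is a minimal SwiftUI sketch of a chat front end. `LlamaSession` is a hypothetical wrapper around llama.cpp's C API, not the commenter's actual app:

    import SwiftUI

    // Minimal sketch of a native llama.cpp chat UI. `LlamaSession` is a
    // hypothetical wrapper around llama.cpp's C API (bridged into Swift);
    // a real one would load a GGUF model and stream tokens back via
    // llama_tokenize/llama_decode.
    final class LlamaSession: ObservableObject {
        @Published var reply = ""

        func complete(_ prompt: String) async {
            // Placeholder so the sketch runs: a real implementation would
            // feed `prompt` to the loaded model and append tokens as they
            // are generated.
            await MainActor.run { self.reply = "(model output for: \(prompt))" }
        }
    }

    struct ChatView: View {
        @StateObject private var session = LlamaSession()
        @State private var prompt = ""

        var body: some View {
            VStack(spacing: 12) {
                ScrollView {
                    Text(session.reply)
                        .frame(maxWidth: .infinity, alignment: .leading)
                }
                HStack {
                    TextField("Ask something", text: $prompt)
                        .textFieldStyle(.roundedBorder)
                    Button("Send") {
                        Task { await session.complete(prompt) }
                    }
                }
            }
            .padding()
        }
    }

Keeping everything in-process like this (rather than shelling out to a Python server) is what makes the instant response on Apple Silicon possible.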

7B is obviously less capable, but the instant response makes it worth exploring. For general questions it's a Google-search replacement that's instantly more valuable than dealing with the hellscape of blog spam ruling Google at the moment.

Note: for my complex code queries at $dayjob, where time is of the essence, I still use GPT-4 (Plus), which is still unmatched IMHO, at least without running special hardware.


I've been occasionally using a 7B Q4 quant on llama.cpp on an 8GB M1. It's usable, if not amazing.


Depends on your M1 specs, but you should definitely be able to run a 7B model (at least with some quantization).

