This is pretty cool and useful, but I only wish it were a website. I don't like the idea of running an executable for something that could perfectly well be done as a website. (Other than some minor features; tbh, you can even enable Corsair and still check the installed models from a web browser.)
How it works
Hardware detection -- Reads total/available RAM via sysinfo, counts CPU cores, and probes for GPUs:
NVIDIA -- Multi-GPU support via nvidia-smi. Aggregates VRAM across all detected GPUs. Falls back to VRAM estimation from GPU model name if reporting fails.
AMD -- Detected via rocm-smi.
Intel Arc -- Discrete VRAM via sysfs, integrated via lspci.
Apple Silicon -- Unified memory via system_profiler. VRAM = system RAM.
Ascend -- Detected via npu-smi.
Backend detection -- Automatically identifies the acceleration backend (CUDA, Metal, ROCm, SYCL, CPU ARM, CPU x86, Ascend) for speed estimation.
A website running JavaScript is restricted by the browser sandbox, so it can't see the same low-level details, such as total system RAM, the exact count of GPUs, etc.
To implement your idea as just a website while working around the JavaScript limitations, a different kind of workflow would be needed. E.g., run the macOS system report to generate an .spx file, or run inxi on Linux to generate a hardware report... and then upload those to the website for analysis to derive an "LLM best fit". But those OS report files may still be missing some details that the GitHub tool gathers.
Another way is to have the website with a bunch of hardware options where the user has to manually select the combination. Less convenient but then again, it has the advantage of doing "what-if" scenarios for hardware the user doesn't actually have and is thinking of buying.
(To be clear, I'm not endorsing this particular GitHub tool. Just pointing out that an LLMfit website has technical limitations.)
No, I'm asking why a website where someone could fill in a few fields and get the optimal LLM recommendation would need to run in a container. It's a web form.
I just discovered the other day that Hugging Face allows you to do exactly this.
With the caveat that you enter your hardware manually. But are we really at the point yet where people are running local models without knowing what they are running them on..?
> But are we really at the point yet where people are running local models without knowing what they are running them on..?
I can only speak for myself: it can be daunting for a beginner to figure out which model fits your GPU, as the model size in GB doesn't directly translate to your GPU's VRAM capacity.
There is value in learning what fits and runs on your system, but that's a different discussion.
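To make the "GB on disk doesn't equal VRAM needed" point concrete, here's a back-of-the-envelope estimate. Every constant here (bytes per weight per quant, KV cache size, overhead) is my own rough illustrative guess, not anything from the tool or from Hugging Face:

```python
# Rough average bytes per weight for a few common quant formats (assumed values)
BYTES_PER_WEIGHT = {"f16": 2.0, "q8_0": 1.0, "q4_k_m": 0.5625}

def vram_needed_gib(params_b: float, quant: str,
                    ctx_tokens: int = 4096, kv_gib_per_4k: float = 0.5,
                    overhead_gib: float = 1.0) -> float:
    """Very rough VRAM estimate: weights + KV cache + runtime overhead.
    params_b is the parameter count in billions; all constants are
    illustrative guesses, not measured values."""
    weights = params_b * BYTES_PER_WEIGHT[quant]   # ~GiB, since 1e9 params * bytes/weight
    kv = kv_gib_per_4k * (ctx_tokens / 4096)       # KV cache scales with context length
    return weights + kv + overhead_gib
```

Under these assumptions, a 7B model at Q4_K_M is ~3.9 GB of weights on disk but needs roughly 5.4 GiB of VRAM once the KV cache and runtime overhead are added, which is exactly the gap that trips up beginners.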
i wouldn't mind a set of well-known unix commands that produce a text output of your machine stats to paste into this hypothetical website of yours (think: neofetch?)
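Something like that copy-paste stat dump could even be a single stdlib-only script rather than a chain of commands. A hedged sketch (Linux-centric: it reads `/proc/meminfo` and only calls `nvidia-smi` if it's on the PATH; the output format is just made up for illustration):

```python
#!/usr/bin/env python3
"""Tiny neofetch-ish stat dump to paste into the hypothetical website."""
import os
import platform
import shutil
import subprocess

def machine_stats() -> dict:
    stats = {"os": platform.platform(), "arch": platform.machine(),
             "cpu_cores": os.cpu_count()}
    try:
        with open("/proc/meminfo") as f:       # Linux only; skipped elsewhere
            for line in f:
                if line.startswith("MemTotal:"):
                    stats["ram_gib"] = round(int(line.split()[1]) / 1024**2, 1)
    except OSError:
        pass
    if shutil.which("nvidia-smi"):             # probe the GPU tool only if present
        stats["gpu"] = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total",
             "--format=csv,noheader"],
            capture_output=True, text=True).stdout.strip()
    return stats

if __name__ == "__main__":
    for k, v in machine_stats().items():
        print(f"{k}: {v}")
```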
In your preferences there is a "Local Apps and Hardware" section. I guess it's a little different, because I just open the page of a model and it shows the hardware I've configured and which quants fit.
I haven't seen a page on HF that'll show me "what models will fit"; it's always model by model. The shared tool gives a list of a whole bunch of models, their respective scores, and an estimated tok/s, so you can compare and contrast.
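The "whole list with scores and tok/s" behavior could be mimicked with a toy ranking function. Everything here is a made-up illustration, not the tool's scoring: the bandwidth figure is an assumed constant, and the tok/s estimate uses the crude memory-bound rule of thumb that decoding each token streams roughly all the weights through memory once:

```python
def rank_models(models: list[tuple[str, float]], vram_gib: float) -> list[tuple[str, int]]:
    """Toy 'what fits' list: keep models whose estimated footprint (GiB)
    fits in VRAM, then sort by a naive tok/s estimate. All numbers are
    illustrative guesses."""
    BW_GIB_S = 300  # assumed GPU memory bandwidth, GiB/s
    fits = [(name, size) for name, size in models if size <= vram_gib]
    # memory-bound decode: tok/s ~ bandwidth / bytes read per token
    return sorted(((name, round(BW_GIB_S / size)) for name, size in fits),
                  key=lambda t: -t[1])

# e.g. rank_models([("7B-q4", 5.4), ("13B-q4", 9.5), ("70B-q4", 40.0)], 12)
# keeps the two that fit and puts the faster (smaller) one first
```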
I wish it didn't require running on the machine, though. Just let me define my specs on a web page and spit out the results.
I'm deleting my account as well. Is there a way to export all chats to Claude, or just download them to later load into a local LLM?
edit: Profile > Settings > Data Control > Export
Unfortunately Claude doesn't seem to have any way to export these chats: no SDK, no native way of doing it. I can't think of a way other than hacky browser automation, which might even trigger a ban.
You'll probably never be able to create actual Claude chats from OpenAI chats, but you could ask Claude to read and distill your old OpenAI chats into Claude chat context. It won't be the same, but it's better than nothing, depending on what you're hoping to get out of it.
The real story here is that your doctor actually listened to you. I appreciate what a lot of doctors do, but the majority of them are fucking irritating and don't even listen to your issues. I'm glad we have AI and are less reliant on them.
I mean - obviously if they're not listening their chance of the latter is pretty low.
Doctors hate to hear this, but if you're so poor at communication and social skills that the patient can't or won't follow the care you've given, your value is lost.
Exactly my experience. I know they vibe-code features, and that's fine, but it looks like they don't do proper testing, which is surprising to me, because all you need is a bunch of cheap interns to do some decent enough testing.
No, there is a wide gap between good and bad testers. Great testers are worth their weight in gold and delight in ruining programmers' days all day long.
IMO not a good place to skimp and a GREAT place to spend for talent.
> Great testers are worth their weight in gold and delight in ruining programmers' days all day long.
Side note: all the great testers I've known, from when my employers had separate QA departments, ended up becoming programmers, either by studying on the side or through in-house mentorship. By all second-hand accounts they've become great programmers too.
So true. My first job was in QA. Involuntarily, because I applied for a dev role, but they only had an opening in QA. I took the job because of the shiny company name on my resume. It totally changed my perspective on quality and finding issues. Even though I liked the job, it had some negative vibes, because you are always the guy bringing bad news / criticizing other people's work (more or less).
Also, some developers couldn't react professionally to me finding bugs in their code. One dev team lead called me "persona non grata" when I came over to their desk. I took it with pride.
Eventually I transitioned to development because I did not see any career path for me in QA (team lead positions were filled by people who had been doing the job for 20+ years).
They brought down production because the version string was changed incorrectly to add an extra date. That would have been caught by even the most basic testing, since the app couldn't even start.
I mean, if you are not connecting it to real things, why even bother? Just use ChatGPT or Claude online at that point.
We have enough assistants; the key idea with OpenClaw is that it can do stuff with what you have instead of just talking. It's terrible for security, but that's the only way it makes sense. Otherwise it's just a lot of hoops to combine cron jobs with an AI agent in the cloud that can do things and report back.
Not that I think anyone should do it; it's a recipe for disaster.
Yeah, it's like saying you can hire a con artist as your personal assistant as long as they work from a sealed box and just pass little reviewed paper slips back and forth through a slit. Why have one at that point? Very difficult to be 'assisted' without granting access.