Hacker News | HashedViking's comments

Good take on that. I still think a q8 32B model with a 200k context would fit into the 48 GB of VRAM on one of those modded RTX 4090s.
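As a rough sanity check, here is a back-of-envelope VRAM calculator. The architecture numbers are assumptions (a Qwen2.5-32B-like layout: 64 layers, 8 KV heads of dimension 128); whether the total actually fits in 48 GB depends heavily on the real model's GQA layout and on how aggressively the KV cache is quantized.

```python
def weights_gb(params_b: float, bits: float) -> float:
    """Approximate weight memory in GB for params_b billion parameters."""
    return params_b * 1e9 * bits / 8 / 1e9

def kv_cache_gb(tokens: int, layers: int = 64, kv_heads: int = 8,
                head_dim: int = 128, bytes_per_value: float = 1.0) -> float:
    """KV cache: 2 tensors (K and V) * layers * kv_heads * head_dim per token.
    Defaults assume a Qwen2.5-32B-like architecture with 8-bit KV values."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens / 1e9

# q8 weights plus 200k tokens of KV cache, at 8-bit vs 4-bit KV precision:
print(f"q8 KV: ~{weights_gb(32, 8) + kv_cache_gb(200_000):.1f} GB")        # ~58.2 GB
print(f"4-bit KV: ~{weights_gb(32, 8) + kv_cache_gb(200_000, bytes_per_value=0.5):.1f} GB")  # ~45.1 GB
```

Under these assumed numbers, an 8-bit KV cache overshoots 48 GB, while a 4-bit KV cache squeezes in; the takeaway is that KV-cache quantization, not just weight quantization, decides the fit.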


Summarization quality depends on the question. Mathematical or programming requests will of course benefit from reasoning models, but for casual research spanning a few web pages, even mistral:7b summarizes adequately; the Detailed Report, however, is better handled by a reasoning model.
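One way to act on that trade-off is to route requests by task: a small local model for quick summaries, a larger reasoning model for detailed reports. A minimal sketch against Ollama's default local endpoint (`http://localhost:11434/api/generate`); the routing rule and the model names are just examples, not what this project actually ships.

```python
import json
import urllib.request

def build_request(text: str, detailed: bool) -> dict:
    """Pick a model and prompt style based on how thorough the answer must be."""
    model = "qwen3:32b" if detailed else "mistral:7b"  # example model tags
    style = "a detailed, structured report" if detailed else "a short summary"
    return {
        "model": model,
        "prompt": f"Summarize the following web pages as {style}:\n\n{text}",
        "stream": False,  # ask Ollama for one complete JSON response
    }

def summarize(text: str, detailed: bool = False) -> str:
    """Send the request to a locally running Ollama instance."""
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(build_request(text, detailed)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

The point of keeping `build_request` separate is that the routing policy can be tested and tuned without a model running.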


Hello there,

I'm the coauthor of this project (the UI part). I joined a week ago, when it was below 100 stars, motivated by the 'local' sentiment. I think all of those 'open' alternatives are just wrappers around paid 'Open'AI APIs, which undermines the 'open' label. My vision for this repo is a system independent of LLM providers (and middlemen) and of overpriced web-search services ($5 per 1,000 search requests at Google is just insane). Initially I just wanted to experiment a bit and didn't expect the repo to explode, so feel free to critique the UI code I hacked together over a few evenings.

The ultimate goal:

Corporation-free LLM usage (local graph-database integration sounds good).

Corporation-free web search (this is a massive challenge: even SearXNG relies on Google/Bing under the hood).

So, if you feel the same, join the project, and let's build something great!

