Hacker News: kantord's comments

seems like a great idea


it is very early to tell what it is


It is, and I understand the hope, but Maduro wasn't a good replacement for Chavez, yet he persisted because of the support of the rest of the regime. I hope for the best, but this is not an auspicious beginning.


That could be added as a new feature, feel free to open an issue for it


btw PDF support could probably be added to SeaGOAT itself by adding a layer that translates the PDF files to text files, plus probably some additional changes to make sure that the page number is also included in the results
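
A minimal sketch of such a translation layer, assuming the text of each page has already been extracted by some PDF library (that step is not shown). All names here are hypothetical and not part of SeaGOAT. It flattens the pages into plain text lines and keeps a parallel page number for every line, so a match on a line can be reported together with its page:

```python
from typing import List, Tuple


def flatten_pages(pages: List[str]) -> Tuple[List[str], List[int]]:
    """Turn per-page text into text lines plus a parallel list of page numbers.

    `pages` is assumed to hold the already-extracted text of each PDF page,
    in order. Page numbers are 1-based.
    """
    lines: List[str] = []
    page_of_line: List[int] = []
    for page_number, page_text in enumerate(pages, start=1):
        for line in page_text.splitlines():
            lines.append(line)
            page_of_line.append(page_number)
    return lines, page_of_line


def page_for_line(page_of_line: List[int], line_number: int) -> int:
    """Map a 1-based line number in the flattened text back to its page."""
    return page_of_line[line_number - 1]
```

The flattened lines could then be written to a text file and indexed like any other file, with `page_for_line` used when formatting the results.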


actually I'm also working on a small web gui for it, it could be fairly easy to add speech recognition on the web version!

https://github.com/kantord/SeaGOAT-web


embeddings are done using ChromaDB

support for more complex queries could be useful, but probably not using a query language since that would make it more difficult to use free-form text input.

You can already use it through an API: https://kantord.github.io/SeaGOAT/0.27.x/server/#understandi... so probably the best way to add support for more complex queries would be to have additional query parameters, and also to expose those flags/options/features through the CLI
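
A rough sketch of what "additional query parameters" could look like from the client side. The path layout and the option names below are hypothetical and only for illustration, they are not taken from the SeaGOAT server documentation. The free-form text stays in the path and every extra option becomes a query parameter:

```python
from urllib.parse import quote, urlencode


def build_query_url(base_url: str, query: str, **options: str) -> str:
    """Build a search URL with the free-form query in the path and
    any extra options as query parameters.

    The "/query/" path and the option names are hypothetical.
    """
    url = f"{base_url.rstrip('/')}/query/{quote(query, safe='')}"
    if options:
        # Sort the options so the resulting URL is deterministic.
        url += "?" + urlencode(sorted(options.items()))
    return url
```

This keeps the text input free-form, while the CLI could map its flags onto the same parameters.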


For those curious about it, ChromaDB uses all-MiniLM-L6-v2[0] from Sentence Transformers[1] by default.

[0] https://docs.trychroma.com/embeddings#default-all-minilm-l6-...

[1] https://www.sbert.net/docs/pretrained_models.html


btw I am also working on a web version of it that will allow you to search in multiple repositories at the same time. You will be able to self-host it at work, or run it locally on your machine. https://github.com/kantord/SeaGOAT-web

so that could provide a nicer interactive experience for more complex queries


It’d be cool if it acted against Github repos, then you can save the embeddings and have a unified interface for querying repos.

I had this problem trying to learn a library and figuring out what all the functionalities are. I ended up making a non-ai solution (an emacs pkg), but this seems just a step or two away from your current project imho.


Currently it is hard limited to these file extensions: https://github.com/kantord/SeaGOAT/blob/ebfde263b970ddecdddf...

It is to avoid wasting time processing files that cannot lead to good results. If you want to try it for a different programming language, please fork the repo and try adding your file formats and test if it gives meaningful results, and if it does please submit a pull request.

Apart from that, one limitation is that it uses a model under the hood that is trained on a specific dataset which is filtered for a specific list of programming languages. So without changing the model as well, the support for other languages could be subpar. At the moment the model is all-MiniLM-L6-v2, here's a detailed summary of the dataset: https://huggingface.co/sentence-transformers/all-MiniLM-L6-v...


also I plan to add features that incorporate a "dumb" analysis of the codebase in order to avoid cluttering the results with mostly irrelevant matches such as import statements or decorators. Those features would be language dependent, so support would need to be added for each language
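
A minimal sketch of what such a "dumb" analysis could look like for Python source only. The patterns and function names are hypothetical, not existing SeaGOAT code, and every other language would need its own set of patterns:

```python
import re

# Hypothetical patterns for Python source only.
_LOW_VALUE_PATTERNS = [
    re.compile(r"^\s*import\s+\S"),            # import os
    re.compile(r"^\s*from\s+\S+\s+import\s"),  # from a import b
    re.compile(r"^\s*@\w"),                    # @decorator
]


def is_low_value_line(line: str) -> bool:
    """Return True for lines that are rarely a useful search result."""
    return any(pattern.match(line) for pattern in _LOW_VALUE_PATTERNS)


def drop_low_value_lines(lines):
    """Keep only the lines that are not imports or decorators."""
    return [line for line in lines if not is_low_value_line(line)]
```

In practice it might be better to lower the score of such lines instead of dropping them, since sometimes an import is exactly what you are looking for.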


are the extensions configurable or truly hardcoded?


it is hardcoded at the moment, but I am willing to merge code that adds the option to override.

A flag would probably solve it for some users, but the best way would be to add a configuration option. At the moment there is no config file/.rc file support in SeaGOAT though, but there is an issue to add it and I'm happy to merge pull requests: https://github.com/kantord/SeaGOAT/issues/180
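
A minimal sketch of an override flag, assuming a hypothetical `--extensions` option (this is not an existing SeaGOAT flag). The default set below is illustrative only, the real hardcoded list lives in the repo:

```python
import argparse
from pathlib import Path

# Illustrative default set; the real hardcoded list lives in the SeaGOAT repo.
DEFAULT_EXTENSIONS = frozenset({".py", ".js", ".ts", ".md"})


def parse_extensions(argv=None):
    """Return the set of extensions to index, honoring an optional override."""
    parser = argparse.ArgumentParser()
    # Hypothetical flag name, not an existing SeaGOAT option.
    parser.add_argument(
        "--extensions",
        help="comma-separated list of file extensions, e.g. .rs,.kt",
    )
    args = parser.parse_args(argv)
    if not args.extensions:
        return DEFAULT_EXTENSIONS
    # Normalize entries: trim spaces, lowercase, and add a missing leading dot.
    return frozenset(
        ext if ext.startswith(".") else "." + ext
        for ext in (part.strip().lower() for part in args.extensions.split(","))
        if ext
    )


def is_supported(path, extensions):
    """Check whether a file should be processed, based on its extension."""
    return Path(path).suffix.lower() in extensions
```

A config file option could later feed the same `extensions` set, so the flag and the config would share one code path.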


update: I changed the hardcoded set of languages to support the following:

- Text Files (.txt)

- Markdown (.md)

- Python (.py)

- C (.c, .h)

- C++ (.cpp, .hpp)

- TypeScript (.ts, .tsx)

- JavaScript (.js, .jsx)

- HTML (.html)

- Go (.go)

- Java (.java)

- PHP (.php)

- Ruby (.rb)

https://github.com/kantord/SeaGOAT#what-programming-langauge...


Based on the code, they're hardcoded. It seems like it'd be pretty straightforward to add an override flag though.


Great that they finally include USB-C. I am waiting for

- Any browser engine allowed

- 3rd party "app stores" allowed

- Allowed to simulate iPhone without owning or having to use a Mac

before switching from Android


1 is going to hand the web over to Google and kill the open web

2 is going to happen eventually thanks to the EU

3 is never going to happen


You underestimate the power of "defaults". Also, one should ask Apple why they don't ship Safari on Windows/Linux. They have all the resources available if they cared. The EU regulates and Apple is to blame for that: it is protecting its revenues (WebKit). It's only surprising that it took this long to regulate. On the other hand, the US government is laughable for not making any pro-consumer choices, taking bribes, making sudden turns etc. The EU is leading here (protecting customers instead of companies)


#1: not if the EU also takes some action against Google's monopoly, similar to how Microsoft was almost split into 2 companies in the US for the same reason


Safari used to have a Windows version but it was so underused Apple canned it.


You're asking for software updates, not hardware updates.

Today is a hardware day.

WWDC (June) is their software day.


true, I will consider switching if the same things are solved for this model in the future


Are you interested in helping with the LibreLingo project? https://github.com/kantord/LibreLingo

Although the concept is a little bit different and the book content cannot be adapted so directly, I think it could still be used as a nice starting point


Great project! At first glance it looks like the differences in approach are substantial, but I'll think about it some more and see if there are ways I might be able to contribute.


OT: WOW, thank you for creating this platform. It's great that free alternatives to corporate platforms exist


"recovered from communism"

Recover what? You say it as if what was before communism was better than communism.

Communism was probably the best system ever to exist in Hungary up until 1989, at least it was certainly the best to last.

It wasn't democratic by any means, but to me it's weird to say "recovered from communism", as if there was something before communism to recover.


To the person who downvoted I ask, which pre-1989 political system of Hungary was better than communism?

Hungary has existed as an independent nation since 31 July 1921. Since then, these are the political systems that lasted any significant amount of time:

- Kingdom of Hungary: created anti-Semitic laws before the Nazis, waged wars of conquest against its neighbors and collaborated with the Nazi regime, perpetrator of the Holocaust

- 1944 until the end of the world war: Nazi control of Hungary, perpetrator of the Holocaust

- until 1989: a few years of Soviet military occupation, then Communism

- since 1989: the current system

So which of the pre-communist political systems of Hungary is the one worth recovering in your opinion?


May I suggest you re-evaluate your history reference material?


Which pre-communist political system of Hungary do you wish to recover?

