I had Claude add it as an edit-prediction provider (running locally via llama.cpp on my MacBook Pro). It's been working well so far (including next-edit prediction!), though it could use more testing and tuning. If you want to try it out, you can build my branch: https://github.com/ihales/zed/tree/sweep-local-edit-predicti...
If you have llama.cpp installed, you can start the model with `llama-server -hf sweepai/sweep-next-edit-1.5B --port 11434`
Other settings you can add in `edit_predictions.sweep_local` (see the example after the list):
- `model` - defaults to "sweepai/sweep-next-edit-1.5B"
- `max_tokens` - defaults to 2048
- `max_editable_tokens` - defaults to 600
- `max_context_tokens` - defaults to 1200
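For reference, here's a `settings.json` sketch with those defaults written out explicitly. The `edit_predictions.sweep_local` keys are from my branch; the `features.edit_prediction_provider` entry is how Zed normally selects a provider, and the `"sweep_local"` value there is my assumption about the branch's naming, so treat this as illustrative rather than definitive:

```json
{
  // Assumption: the provider is selected the same way as the built-in ones;
  // the exact value ("sweep_local") may differ — check the branch if it doesn't take.
  "features": {
    "edit_prediction_provider": "sweep_local"
  },
  "edit_predictions": {
    "sweep_local": {
      // These are the defaults listed above, shown explicitly for illustration.
      "model": "sweepai/sweep-next-edit-1.5B",
      "max_tokens": 2048,
      "max_editable_tokens": 600,
      "max_context_tokens": 1200
    }
  }
}
```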
I haven't had time to dive into Zed's edit predictions or do a thorough review of Claude's code (it's not much, but my Rust is... rusty, and I'm short on free time right now), and there hasn't been much discussion of the feature, so I don't feel comfortable submitting a PR yet. If someone else wants to take it from here, feel free!
This is great and similar to what I was thinking of doing at some point. I just wasn't sure if it needed to be specific to Sweep Local or if it could be a generic llama.cpp provider.
I was thinking about this too. Zed officially supports self-hosting Zeta, so one option would be to create a proxy that speaks the Zeta wire format but is backed by llama.cpp (or any model backend). In the proxy you could configure prompts, context, templates, etc., while still using a production build of Zed. I'll give it a shot if I have time.
Then go to the AI button at the bottom (the Gemini-like logo) and select the Sweep model. You're also expected to run the `ollama run` command and have `ollama serve` running:
`ollama pull hf.co/sweepai/sweep-next-edit-1.5B`
`ollama run hf.co/sweepai/sweep-next-edit-1.5B`
I did ask ChatGPT about some parts of it, though, and had to add a setting to my other settings too, so YMMV, but it's working for me.
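The setting in question is Zed's Ollama model config under `language_models.ollama` in `settings.json`. Roughly something like this (illustrative only, not my exact settings; the `display_name` and `max_tokens` values here are guesses, and this exposes the model for chat/agent use rather than tab predictions):

```json
{
  "language_models": {
    "ollama": {
      // Default Ollama endpoint; change it if `ollama serve` runs elsewhere.
      "api_url": "http://localhost:11434",
      "available_models": [
        {
          // Must match the tag you pulled with `ollama pull`.
          "name": "hf.co/sweepai/sweep-next-edit-1.5B",
          "display_name": "Sweep Next Edit 1.5B",
          // Guess at a context window; adjust to what the model actually supports.
          "max_tokens": 8192
        }
      ]
    }
  }
}
```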
It's an interesting model for sure, but I'm unable to get tab autocompletion/inline predictions in Zed. I can ask it things in the summary and agentic modes of sorts, and there's a button at the top that can generate code in the file itself (which I found to be what I preferred out of all this).
But I asked it to generate a simple hello world server on localhost:8080 in Go, and while it managed it in the end, it took like 10 minutes. Some other things, like a plain hello world, were one-shot for the most part.
It's definitely an interesting model, that's for sure. We need more strong models like this; I can't imagine how strong it might be at 7B or 8B, as IIRC someone mentioned this already has one, or something similar.
A lot of new development is happening here to make things smaller, and I'm all for it, man!
I really don't know. I had asked ChatGPT to create it, and earlier it did give me a wrong one, and I had to try out a lot of things to see how it worked on my Mac.
I then pasted that whole convo into Gemini Flash in AI Studio to summarize it and give you the correct settings, since my settings also included some servers and their IPs from the Zed remote feature.
Sorry that it didn't work. I asked ChatGPT again about my working configuration, and here's what I get (this may also not work, so YMMV).