llamafile is designed with the hope of being a permanently working artifact where upgrades are optional. You can upgrade to new llamafile releases in two ways. The first, is you can redownload the full weights I re-upload to Hugging Face with each release. However you might have slow Internet. In that case, you don't have to re-download the whole thing to upgrade.
What you'd do instead, is first take a peek inside using:
Congratulations. You've just created your first llamafile! It's also worth mentioning that you don't have to combine it into one giant file. It's also fine to just say:
llamafile -m wizardcoder-python-13b-v1.0.Q4_K_M.gguf -p 'write some code'
You can do that with just about any GGUF weights you find on Hugging Face, in case you want to try out other models.
What you'd do instead, is first take a peek inside using:
Then you can extract the original GGUF weights and our special `.args` file as follows: You'd then grab the latest llamafile release binary off https://github.com/Mozilla-Ocho/llamafile/releases/ along with our zipalign program, and use it to insert the weights back into the new file: Congratulations. You've just created your first llamafile! It's also worth mentioning that you don't have to combine it into one giant file. It's also fine to just say: You can do that with just about any GGUF weights you find on Hugging Face, in case you want to try out other models.Enjoy!