not_IO@lemmy.blahaj.zone · 10 days ago

should've been the axe template

vivendi@programming.dev · 9 days ago

Gonna have to shill some FOSS LLMs here

You need an inference engine. Just use a llama.cpp derivative (fuck ollama, for a few reasons) and download an open model from Hugging Face. I heavily recommend the Mistral series, which is Apache 2.0 licensed I think, but I don’t really remember, so check the model card.

You need to find a “quantization” of the model. You can find those in the model tree on the right side of the model page on Hugging Face. You need one in GGUF format, to be exact.
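For picking a quantization, a back-of-the-envelope size estimate helps: file size is roughly parameter count times bits per weight, divided by 8. Here's a minimal sketch of that arithmetic. The 7B parameter count and the bits-per-weight values are illustrative assumptions, not numbers from any specific model, and real GGUF files come out a bit different because of metadata and mixed-precision layers.

```python
def gguf_size_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Rough model file size in GB: parameters * bits per weight / 8 bits per byte."""
    n_params = n_params_billion * 1e9
    size_bytes = n_params * bits_per_weight / 8
    return size_bytes / 1e9


# Illustrative only: a hypothetical 7B model at three assumed precisions.
for label, bpw in [("16-bit", 16.0), ("8-bit", 8.0), ("~4.5-bit", 4.5)]:
    print(f"{label}: ~{gguf_size_gb(7, bpw):.1f} GB")
```

Pick the biggest quantization that fits in your RAM or VRAM with some room left over for the context.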

Then all you need to do is tune some inference parameters and you’re golden.
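To make "inference parameters" a bit more concrete, here's a rough sketch that builds a command line for llama.cpp's `llama-cli` binary with the usual sampling knobs. Assumptions: your build ships a binary called `llama-cli` (derivatives may name it differently or use other flags), the model path is a made-up placeholder, and the default values are just starting points to tune from, not recommendations.

```python
import shlex


def build_llama_cli_args(
    model_path: str,
    prompt: str,
    temp: float = 0.7,      # sampling temperature; lower is more deterministic
    top_p: float = 0.9,     # nucleus sampling cutoff
    top_k: int = 40,        # sample only from the k most likely tokens
    ctx_size: int = 4096,   # context window in tokens
    n_predict: int = 256,   # maximum number of tokens to generate
) -> list[str]:
    """Build an argv list for llama-cli (assumed binary name)."""
    return [
        "llama-cli",
        "-m", model_path,
        "-p", prompt,
        "--temp", str(temp),
        "--top-p", str(top_p),
        "--top-k", str(top_k),
        "-c", str(ctx_size),
        "-n", str(n_predict),
    ]


# Hypothetical model path; point it at whatever GGUF file you downloaded.
args = build_llama_cli_args("models/model.gguf", "Hello")
print(shlex.join(args))
```

You can paste the printed line into a shell, or hand the list to `subprocess.run(args)` if you'd rather drive it from Python.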