Gonna be presenting a demo of teaching a local #llm to search Wikipedia with "Function Calling"
Source for my demo code: https://github.com/RangerMauve/mind-goblin
Video of my talk about making an #OpenSource #LLM perform function calling on my machine.
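For readers who haven't seen "function calling" before, here is a minimal sketch of the general idea, not the code from the demo repo. It assumes the model has been prompted to reply with a JSON object naming a tool and its arguments; `search_wikipedia`, `TOOLS`, and `dispatch` are hypothetical names, and the search itself is stubbed out so the example runs offline.

```python
import json


def search_wikipedia(query: str) -> str:
    """Hypothetical stand-in tool; a real one would query Wikipedia."""
    return f"(stub) top Wikipedia result for {query!r}"


# Registry of tools the model is allowed to call (assumed shape).
TOOLS = {"search_wikipedia": search_wikipedia}


def dispatch(model_output: str) -> str:
    """Run the tool the model asked for, or pass plain text through."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return model_output  # not JSON: treat it as a normal answer
    if not isinstance(call, dict) or call.get("name") not in TOOLS:
        return model_output  # JSON, but not a call to a known tool
    return TOOLS[call["name"]](**call.get("arguments", {}))


if __name__ == "__main__":
    print(dispatch('{"name": "search_wikipedia", "arguments": {"query": "Podman"}}'))
```

In a real loop the tool's return value would be fed back to the model as another message so it can write the final answer.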
@mauve Recommend using Podman. To access the GPU inside the container, run `sudo setsebool container_use_devices=true` so that SELinux allows it.
If it's a GPD device, you can allocate more VRAM in the BIOS.
@fredy_pferdi Interesting, I may be able to get it running without a container too. https://github.com/rocm-arch/rocm-arch
@mauve Personally I strongly recommend not installing the ROCm drivers on your device and using them in a container instead. They are not that stable on those chips and can lead to your device crashing. Also, officially only an LTS Ubuntu, CentOS, and a couple of GPUs are supported.
A container, on the other hand, is one command. Use the AMD ROCm version; further info here:
https://hub.docker.com/r/ollama/ollama
https://ollama.com/blog/amd-preview
There is no substantial performance loss from using a container.
@fredy_pferdi Sweet, just followed this guide to install it in my Ubuntu distrobox container and it's working great :o
https://www.reddit.com/r/steamdeck_linux/comments/102hzav/guide_how_to_install_rocm_for_gpu_julia/
@fredy_pferdi Spoke too soon, ollama dies when I try to load the model. Will need to mess with it another day :) TY again for the tip.
@mauve Distrobox is just an interface for Podman. I think just running the already-made images, or building them yourself, is way easier than recreating the install manually with Distrobox.
@mauve
First, allow Podman to use the GPU:
`sudo setsebool -P container_use_devices=true`
and then run this to create the container (`podman create` only creates it, so start it afterwards with `podman start ollama`):
`podman create --name=ollama --security-opt seccomp=unconfined --device /dev/dri --device /dev/kfd -e HSA_OVERRIDE_GFX_VERSION=10.3.0 -e HCC_AMDGPU_TARGETS=gfx1035 -e OLLAMA_DEBUG=1 -v .ollama:/root/.ollama:U,rw -p 11434:11434 -i --tty --restart unless-stopped docker.io/ollama/ollama:rocm`
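Once the container is up, one way to confirm that Ollama is reachable on the published port is to ask its HTTP API which models it has. This is only a sketch: `list_models` is a hypothetical helper, the base URL assumes the `-p 11434:11434` mapping above on the same machine, and it only shows that the server answers, not that the GPU is actually being used.

```python
import json
import urllib.request


def list_models(base_url: str = "http://localhost:11434", timeout: float = 2.0):
    """Return the names of locally available models, or None if unreachable."""
    try:
        # GET /api/tags lists the models the Ollama server has pulled.
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            data = json.load(resp)
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return None
    return [m.get("name") for m in data.get("models", [])]


if __name__ == "__main__":
    models = list_models()
    if models is None:
        print("Ollama is not reachable on port 11434")
    else:
        print("Ollama is up, models:", models)
```

An empty list just means the server is running but no model has been pulled yet.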
@fredy_pferdi Cool TY, I found the HSA_OVERRIDE online and it ended up working great in my ubuntu container. 😁 Wish I had this for last night's demo! Also I don't have nearly enough RAM on this thing with 16 GB. TT_TT
@fredy_pferdi Yeah the issue is my Matrix client ends up eating way too much RAM and then I start eating swap. Might also have a memory leak somewhere in my OS wasting RAM after going out of sleep mode
@mauve Yeah there are suspend issues with those GPD devices.
@fredy_pferdi Alas! It's still worth it for me to not have to use Windows or a regular laptop :P
@mauve For testing you could allocate 8 GB; that should be enough to run a small model while still using the OS.
Screenshot of the option to allocate more VRAM:
https://www.reddit.com/r/gpdwin/comments/yfivv5/anyone_know_what_do_these_options_mean_in_gpd_win/
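As a rough sanity check on the 8 GB suggestion, the size of a model's weights can be estimated as parameter count times bits per weight. The sketch below is back-of-the-envelope only: the 7B parameter count and the quantization levels are illustrative assumptions, not figures from the thread, and the estimate ignores the context cache and runtime overhead.

```python
def weights_gib(params_billions: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


TOTAL_RAM_GIB = 16  # total memory mentioned in the thread
VRAM_GIB = 8        # BIOS allocation suggested in the thread

if __name__ == "__main__":
    # Illustrative model sizes (assumptions, not from the thread).
    for label, params, bits in [("7B at 4-bit", 7, 4), ("7B at 16-bit", 7, 16)]:
        size = weights_gib(params, bits)
        verdict = "fits" if size < VRAM_GIB else "does not fit"
        print(f"{label}: ~{size:.1f} GiB of weights, {verdict} in {VRAM_GIB} GiB")
    print(f"Left for the OS: {TOTAL_RAM_GIB - VRAM_GIB} GiB")
```

Anything that fits only barely will likely still struggle once the context cache is added, so leaving a couple of GiB of headroom is a reasonable rule of thumb.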