
Gonna be presenting a demo of teaching a local AI to search Wikipedia with "Function Calling"
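For anyone curious what "function calling" means here, a minimal sketch: the model is prompted to emit a JSON tool call, which the host code parses and dispatches to a local function. The tool name, the stub, and the dispatcher below are illustrative assumptions, not an official Ollama API.

```python
import json

# Illustrative stub; a real version would hit the MediaWiki search endpoint.
def search_wikipedia(query: str) -> list:
    return [f"Stub result for {query}"]

# Registry of tools the model is allowed to call.
TOOLS = {"search_wikipedia": search_wikipedia}

def dispatch_tool_call(raw: str):
    """Parse a model-emitted JSON blob like
    {"name": "search_wikipedia", "arguments": {"query": "..."}}
    and run the matching local function."""
    call = json.loads(raw)
    func = TOOLS.get(call["name"])
    if func is None:
        raise ValueError(f"model asked for unknown tool: {call['name']}")
    return func(**call["arguments"])

print(dispatch_tool_call(
    '{"name": "search_wikipedia", "arguments": {"query": "Mastodon"}}'
))
```

The loop in a real demo would feed the tool's return value back into the model as a follow-up message.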

BTW for anyone interested in this stuff, come join my Matrix Channel about open source AI. matrix.to/#/#userless-agents:m

@mauve Hey #ollama is available with GPU acceleration on the AMD Ryzen 7640, just use the ROCm Docker container (no need for further drivers)

@mauve Recommend using podman; to access the GPU inside the container, use `sudo setsebool container_use_devices=true` to make SELinux comply with it.

If it's a GPD device you can allocate more VRAM in the BIOS

@fredy_pferdi Oh that's great to know, TY. I'll look into it. Is this going to use Vulkan for the GPU acceleration? I wasn't sure what my options would be since Ollama seems to only support CUDA and Metal

@mauve Personally I strongly recommend not installing the ROCm drivers on your device but using them in a container instead; they are not that stable on those chips and can lead to your device crashing. Also, officially only LTS Ubuntu, CentOS, and a couple of GPUs are supported.

The container, on the other hand, is one command (use the AMD ROCm version; further info here):
hub.docker.com/r/ollama/ollama
ollama.com/blog/amd-preview
There is no substantial performance loss using a container

@fredy_pferdi Spoke too soon, ollama dies when I try to load the model. Will need to mess with it another day :) TY again for the tip.

@mauve Distrobox is just an interface for Podman. I think just running the already-made images, or building them yourself, is way easier than recreating the install manually with Distrobox.

@mauve
first allow podman to use GPU
`sudo setsebool -P container_use_devices=true`

and then just run this to start the container
`podman create --name=ollama --security-opt seccomp=unconfined --device /dev/dri --device /dev/kfd -e HSA_OVERRIDE_GFX_VERSION=10.3.0 -e HCC_AMDGPU_TARGETS=gfx1035 -e OLLAMA_DEBUG=1 -v .ollama:/root/.ollama:U,rw -p 11434:11434 -i --tty --restart unless-stopped docker.io/ollama/ollama:rocm`

@mauve with the Ryzen 7600u you may have to use `-e HSA_OVERRIDE_GFX_VERSION=11.0.0` instead of `-e HSA_OVERRIDE_GFX_VERSION=10.3.0`, and make sure that you allocated enough RAM to the graphics card in the BIOS of your #GPD Win 4 (in the advanced options)
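A hedged sketch of what that override is doing: ROCm only ships kernels for a few gfx targets, so unsupported APUs spoof the nearest supported one. The mapping below reflects commonly reported community values (RDNA2 gfx103x APUs spoof 10.3.0, RDNA3 gfx110x APUs spoof 11.0.0); it is not an official AMD support table.

```python
def hsa_override_for(gfx_target: str) -> str:
    """Pick an HSA_OVERRIDE_GFX_VERSION that spoofs the nearest gfx target
    ROCm actually ships kernels for. Community-reported mapping, not an
    official AMD support table."""
    if gfx_target.startswith("gfx103"):  # RDNA2 APUs, e.g. gfx1035
        return "10.3.0"
    if gfx_target.startswith("gfx110"):  # RDNA3 APUs, e.g. gfx1103
        return "11.0.0"
    raise ValueError(f"no known override for {gfx_target}")

print(hsa_override_for("gfx1035"))  # 10.3.0, as used in the container command
print(hsa_override_for("gfx1103"))  # 11.0.0, for the newer RDNA3 APUs
```

You can check your actual gfx target with `rocminfo` inside the container.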

@fredy_pferdi Cool TY, I found the HSA_OVERRIDE online and it ended up working great in my Ubuntu container. 😁 Wish I had this for last night's demo! Also I don't have nearly enough RAM on this thing with 16 GB. TT_TT

@mauve for testing you could allocate 8 GB; that should be enough to run a small model while still using the OS.

Screenshot of the option to allocate more vram:
reddit.com/r/gpdwin/comments/y
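As a rough back-of-envelope for why 8 GB can be enough: quantized model weights need roughly params × bits / 8 of VRAM, plus some overhead. The 20% overhead factor below is an assumption, and the KV cache for long contexts adds more on top.

```python
def approx_vram_gb(params_billion: float, bits: int = 4, overhead: float = 1.2) -> float:
    """Very rough VRAM estimate for quantized model weights:
    params * bits/8, padded by 20% for runtime overhead.
    Ignores the KV cache, which grows with context length."""
    return params_billion * bits / 8 * overhead

print(approx_vram_gb(7))   # ~4.2 GB: a 4-bit 7B model fits in an 8 GB allocation
print(approx_vram_gb(13))  # ~7.8 GB: why 16 GB of VRAM handles 13B models
```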

@fredy_pferdi Yeah the issue is my Matrix client ends up eating way too much RAM and then I start eating swap. Might also have a memory leak somewhere in my OS wasting RAM after going out of sleep mode

@mauve Yeah there are suspend issues with those GPD devices.

@fredy_pferdi Alas! It's still worth it for me to not have to use Windows or a regular laptop :P

@fredy_pferdi Yeah exactly! I'm running it in desktop mode. Lately I've been thinking of just installing Manjaro on it instead since the Steam bits are a bit janky for me.

@mauve I'm using the GPD Win Max 6800u with 32 GB RAM, 16 GB of it allocated to VRAM, and I'm running 13B models at reasonable speeds and 7B quite quickly. Recommend #Fedora and #Ollama in a #Podman container

@fredy_pferdi It hooks into your entire OS. I use github.com/ideasman42/nerd-dic with a custom script to make it easier to code: https://github.com/RangerMauve/mauve-dictation

Since my OS is a steam OS derivative I needed to be fancy and install it in userspace: github.com/atcq/steam-dictatio

Then I have a global shortcut to toggle it on/off

@mauve uiii that sounds interesting. I'm using an immutable distro too and was not able to get it to work.

@mauve Well presented and thanks for sharing. I was looking exactly for something like this.
