Having tested a bunch of models, I gotta say that OpenHermes 2.5 is the most helpful out of the ones I can run locally.

I recently wasted a bunch of time getting Phi-2 to do some summarization work, and it just couldn't stay focused for more than a sentence or two.

Hmm, after testing raw Phi-2 in LM Studio instead of through the examples provided by HuggingFace's candle, I think it's actually pretty decent after all.

Specifically, I got TheBloke's Phi-2 Q4_K_S GGUF working.

Can't get that model running with candle, since it can't seem to load the model weights.
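For anyone who wants to poke at the same failure, here's a minimal sketch of just reading the GGUF headers with candle, to check whether the weights parse at all (assuming the candle-core and anyhow crates; the versions and file path are illustrative guesses, not confirmed from the thread):

    // Cargo.toml: candle-core = "0.3", anyhow = "1" (versions are a guess)
    use candle_core::quantized::gguf_file;

    fn main() -> anyhow::Result<()> {
        // Path is illustrative -- point it at the downloaded GGUF file.
        let mut file = std::fs::File::open("phi-2.Q4_K_S.gguf")?;
        let content = gguf_file::Content::read(&mut file)?;
        // If this succeeds, the quantized weights read fine and the
        // problem is further up in the model-loading code.
        for (name, info) in content.tensor_infos.iter() {
            println!("{name}: {:?} ({:?})", info.shape, info.ggml_dtype);
        }
        Ok(())
    }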

@mauve I was tinkering with ollama for a bit, but my local hardware just isn't fast enough to make it useful.

@skryking What have you been using to run the models? I find LM Studio really nice for tinkering. lmstudio.ai/

I find Q4 quantized models work pretty well on my Steam Deck.

@mauve ollama will download and host the models and set up an API port for interacting with them. I've done it in a VM and locally... alas, I don't have any hardware that will do much acceleration at the moment. I'm stuck with an old RX 580 card, and it's on a Windows box, so ROCm doesn't work very well, if at all.
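A minimal sketch of talking to that port from Rust (assuming reqwest with the blocking/json features plus serde_json — crate versions are a guess — and a model already pulled with ollama; 11434 is ollama's default port):

    // Cargo.toml: reqwest = { version = "0.11", features = ["blocking", "json"] },
    //             serde_json = "1" (versions are a guess)
    use serde_json::json;

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        let body = json!({
            "model": "llama2",   // whatever `ollama pull` fetched
            "prompt": "Why is the sky blue?",
            "stream": false      // one JSON reply instead of a token stream
        });
        let resp: serde_json::Value = reqwest::blocking::Client::new()
            .post("http://localhost:11434/api/generate") // ollama's default port
            .json(&body)
            .send()?
            .json()?;
        println!("{}", resp["response"]);
        Ok(())
    }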

@skryking Nice. I only do CPU workloads. Try running Phi-2 sometime! It's super low in resource usage, particularly the Q4 quantized models.

@mauve Thanks for the suggestion, I just fired it up... that one is definitely faster than Llama 2 in CPU-only mode.

@skryking it has less innate knowledge of facts but it is pretty good at "reasoning". I'm gonna teach it to make function calls and traverse datasets + summarize stuff. 😁

@mauve I really just need to get off my lazy butt and buy a new graphics card so I can do more acceleration.

@mauve do you have any documentation / links on how you taught it to use functions?

@skryking This post by @simon is what exposed me to the idea for the first time: til.simonwillison.net/llms/pyt

I also have a slightly improved prompt here: gist.github.com/RangerMauve/19

I'll likely be publishing any new work as open source on GitHub. :) Probably with Rust.
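To give a flavor of the approach, here's a rough sketch of the overall loop — the prompt wording, function names, and the stubbed-out run_llm are all illustrative assumptions, not what the linked gist actually contains:

    // Prompt-driven function calling: describe the tools in the system
    // prompt, then loop -- run the model, intercept "CALL ..." lines,
    // execute them, and feed the result back into the transcript.
    const SYSTEM_PROMPT: &str = r#"You can call these functions:
      search(query) -- look up documents matching the query
      summarize(text) -- condense a block of text
    To call one, reply with a single line like: CALL search("rust llm")
    Otherwise, reply with your final answer."#;

    // Stub standing in for a real local model (candle, ollama's API, ...).
    // It fakes one round of tool use so the loop below actually runs.
    fn run_llm(transcript: &str) -> String {
        if transcript.contains("Result:") {
            "Candle and quantized GGUF models are the recent highlights.".into()
        } else {
            r#"CALL search("rust llm tooling")"#.into()
        }
    }

    // Naive parser/dispatcher for lines shaped like: CALL name("arg")
    fn dispatch(reply: &str) -> Option<String> {
        let call = reply.trim().strip_prefix("CALL ")?;
        let (name, rest) = call.split_once('(')?;
        let arg = rest.trim_end_matches(')').trim_matches('"');
        match name.trim() {
            "search" => Some(format!("fake search results for '{arg}'")),
            "summarize" => Some(format!("fake summary of '{arg}'")),
            _ => None,
        }
    }

    fn main() {
        let mut transcript =
            format!("{SYSTEM_PROMPT}\n\nUser: what's new in Rust LLM tooling?\n");
        loop {
            let reply = run_llm(&transcript);
            match dispatch(&reply) {
                // Function call: append the result, let the model continue.
                Some(result) => {
                    transcript.push_str(&format!("{reply}\nResult: {result}\n"))
                }
                // No call: treat the reply as the final answer.
                None => {
                    println!("{reply}");
                    break;
                }
            }
        }
    }

The loop itself is trivial; the real work is a prompt that gets a small model to stick to the CALL format, which is what the gist above is for.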

@mauve @simon Interesting, I started learning Rust yesterday as something new to muck around with as a hobby.

@skryking Nice. I've been wanting to get into Rust for years but didn't have much of a use case. Now, with the candle library from HuggingFace and my latest adventures with LLMs, I've had an actual reason to write something in it. :) github.com/huggingface/candle/

@mauve yeah I'm still hunting for a use case at the moment. Something non-work-related and interesting enough to keep my old, hard-to-focus brain engaged.

@skryking For me it was more that I can finally make this stuff work related and potentially find clients to pay me to mess with it. :P Sadly my hand pain makes computer touching less appealing off the clock.

@mauve have you heard of a community somewhere (lemmy? matrix?) where we could share our experiences?

We (@codelutin) may want to start playing with these things, but "it's dangerous to go alone". There is @stablehorde, led by @Db0, which is a great start (but I won't join a Discord 😔)

@lutindiscret @mauve @codelutin @Db0 you can use our Lemmy instance as a community. I could also look into setting up a Matrix bridge to our Discord; you won't have access to all channels, but you'd be able to interact with questions etc.

@mauve there are also quantized versions of Mixtral (from the French company Mistral AI) that can run locally

@laskov Oh yeah, I read their release but haven't used it yet. Was there anything specific they excelled at?
