
@simon Any clue why your tool with gpt4all's ggml-replit-code-v1-3b would perform worse than this replicate demo?

Is there a need to tweak the parameters for the model somewhere maybe?

replicate.com/replit/replit-co


@mauve worse in terms of speed or quality?

What operating system are you running?

@mauve looks like Replicate are running it on an A100 40GB, which is a ~$7,000 GPU!

@simon Quality of results. The same query through the ggml model via LLM vs the raw model on Replicate gives vastly different results. It feels like the Replicate instance is more stable and has less jitter. I suspect the parameters for max tokens and temperature might be the culprit.

@simon I'm on a Steam Deck and the speed is actually great!

@mauve yeah it would be interesting to figure out if they are using different parameters

@simon I'll dig around. Are there already llm plugins that specify how to change parameters? Might have a pr up my sleeve if there's time this weekend.

@simon Looked into this, I think the top_p and top_k are the main differences. The default in gpt4all is way more "loose".

Would a PR that sets different defaults be welcome? Or would you prefer to just have the flags exposed like your llama-cpp example?

github.com/simonw/llm-gpt4all/

docs.gpt4all.io/gpt4all_python

I'll try hardcoding some values and running a generation again to see if it's "better" in the meantime.
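Something like this is what I mean by hardcoding (just a sketch — the parameter names match the gpt4all Python `generate()` signature, but the `temp` and `max_tokens` values here are my guesses, not the demo's actual settings; only `top_k`/`top_p` come from what I found above):

```python
# Sketch: trying Replicate-demo-style sampling settings against the
# GPT4All Python bindings. top_k/top_p are from my digging; temp and
# max_tokens are placeholder guesses.

REPLICATE_STYLE = {
    "temp": 0.2,        # guess: "turn the temperature down a bit"
    "top_k": 4,
    "top_p": 1.0,
    "max_tokens": 500,  # guess: a reasonable cap for code completion
}

def generate_kwargs(overrides=None):
    """Merge user overrides on top of the Replicate-style baseline."""
    kwargs = dict(REPLICATE_STYLE)
    if overrides:
        kwargs.update(overrides)
    return kwargs

if __name__ == "__main__":
    # Needs the model file downloaded locally, so left commented out:
    # from gpt4all import GPT4All
    # model = GPT4All("ggml-replit-code-v1-3b.bin")
    # print(model.generate("def fizzbuzz(n):", **generate_kwargs()))
    print(generate_kwargs({"top_k": 40}))
```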

@mauve exposing options like llama-cpp does would be fantastic

@simon PR: github.com/simonw/llm-gpt4all/

Gonna need to mess with the parameters more another day, though. But my gut feeling is we can improve the quality of the output significantly by turning the temperature down a bit and setting top_p to 1 and top_k to 4, like in the replicate.com demo

@mauve my plan at the moment is to make it much easier for people to experiment with and share alternative configurations for different models

@simon Nice, like a file format for the configs so folks could pass them around and track changes in git?
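Something like this, maybe (purely hypothetical format — none of these keys are real `llm` config today, and the values are just the ones from this thread):

```yaml
# hypothetical model-config.yaml — shareable and diffable in git
model: ggml-replit-code-v1-3b
options:
  temperature: 0.2   # guess from this thread, not a verified default
  top_k: 4
  top_p: 1.0
  max_tokens: 500    # guess
```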
