@simon Any clue why your tool with gpt4all's ggml-replit-code-v1-3b would perform worse than this Replicate demo?

Is there a need to tweak the parameters for the model somewhere maybe?


@mauve worse in terms of speed or quality?

What operating system are you running?

@mauve looks like Replicate are running it on an A100 40GB, which is a ~$7,000 GPU!

@simon Quality of results. The same query on the ggml model via LLM vs the raw model on Replicate gives vastly different results. It feels like the Replicate instance is more stable and has less jitter? I feel like the parameters for max tokens and temperature might be the culprit.
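
For context on why temperature could explain the "jitter": the sketch below is a generic illustration of temperature-scaled sampling, not code from LLM, gpt4all, or Replicate. The function name and the logit values are made up for the example. A lower temperature concentrates probability on the top-scoring token, so output is more repeatable; a higher one flattens the distribution, so output varies more between runs.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw token scores into probabilities, scaled by temperature.

    Hypothetical helper for illustration only.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the max before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate tokens.
logits = [2.0, 1.0, 0.5]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

If the two setups use different default temperatures or token limits, the same prompt could plausibly give quite different-looking results even with the same underlying model.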


@simon I'm on a Steam Deck and the speed is actually great!
