Oct 06, 2023, 17:42

Mauve 👁💜 @mauve@mastodon.mauve.moe

@simon Any clue why your #LLM tool with gpt4all's ggml-replit-code-v1-3b would perform worse than this replicate demo?

Is there a need to tweak the parameters for the model somewhere maybe?

https://replicate.com/replit/replit-code-v1-3b?prediction=zarihvjb2xfluvwsplgye4bude

Oct 06, 2023, 22:24

Simon Willison @simon@simonwillison.net

@mauve worse in terms of speed or quality?

What operating system are you running?

Oct 06, 2023, 22:25

Simon Willison @simon@simonwillison.net

@mauve looks like Replicate are running it on an A100 40GB, which is a ~$7,000 GPU!

Oct 06, 2023, 22:37

Mauve 👁💜 @mauve@mastodon.mauve.moe

@simon Quality of results. The same query on the ggml model via LLM vs the raw model on Replicate gives vastly different results. It feels like the Replicate instance is more stable and has less jitter? I suspect the max tokens and temperature parameters might be the culprit.
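
For readers following the thread: a minimal sketch of why the temperature setting could plausibly produce the "jitter" described above. This is not the sampling code used by gpt4all, LLM, or Replicate; the logits are made-up numbers and the function names are hypothetical. It only shows the general technique of temperature-scaled softmax sampling.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature.

    Lower temperature sharpens the distribution (more repeatable output);
    higher temperature flattens it (more varied output).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng):
    """Pick one token index at random according to the scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical scores for four candidate next tokens (illustration only).
logits = [2.0, 1.5, 0.5, 0.1]

cold = softmax_with_temperature(logits, 0.2)  # low temperature
hot = softmax_with_temperature(logits, 1.5)   # high temperature

rng = random.Random(0)
cold_picks = [sample_token(logits, 0.2, rng) for _ in range(20)]
hot_picks = [sample_token(logits, 1.5, rng) for _ in range(20)]
```

At the low temperature the top-scoring token takes most of the probability mass, so repeated runs tend to agree; at the higher temperature the mass is spread across the candidates, so the same prompt can wander. If two deployments of the same model use different defaults for this setting, that alone could make one feel more stable than the other.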

Mauve 👁💜 @mauve@mastodon.mauve.moe

@simon I'm on a steam deck and the speed is actually great!

Oct 06, 2023, 22:37 · Tusky