@fredy_pferdi I found their tool-calling capabilities to be lacking too. They run way faster than dense models, but I think the percentage of bullshit they add to the context makes the speed moot compared to tighter contexts with smarter models. 🤷

These 30B-A3B models really act like ten 3B models trying to yap over each other at the same time instead of a single, more competent model.
