Bouta skip the language model and autistically read all the training data into my own brain instead.
@mauve how's your retention %?
@dym I only have 2 braincells so NGL it's pretty low. :P I am becoming less aligned since it's an "uncensored" dataset tho so at least there's that.
@mauve this looks like parsing out the "Answer the following question:" prompt was the trigger for picking one of the "-" items at the end of the text
@mauve probably a result of getting into the long tail of the probabilities. When there aren't many occurrences of the preceding sequence of words, there's barely any evidence for what should come next...
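(Toy sketch to make that point concrete, not anything from the actual training pipeline: a two-braincell bigram model on a made-up corpus. A context seen a couple of times has a stable continuation; a context seen once gives a "distribution" that's pure noise.)

```python
# Minimal bigram "long tail" illustration -- toy corpus, not the dolphin data.
from collections import Counter, defaultdict
import random

corpus = (
    "answer the following question : what did he see ? "
    "- he was staring at the list "
    "answer the following question : what did he see ?"
).split()

# Count bigram continuations: context word -> Counter of next words.
continuations = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    continuations[prev][nxt] += 1

def sample_next(context: str) -> str:
    counts = continuations[context]
    words = list(counts)
    weights = [counts[w] for w in words]
    return random.choices(words, weights=weights)[0]

print(continuations["question"])  # seen twice, always followed by ":" -- stable
print(continuations["-"])         # seen once -- one data point, no real signal
print(sample_next("-"))           # so sampling from this context is arbitrary
```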
Seriously though, what the hell are some of these outputs they're teaching these things? "he was staring at the beautiful mexican girl" as an "answer" to a random rant.
https://huggingface.co/datasets/cognitivecomputations/dolphin?row=30
Maybe this is a side effect of using AI to generate datasets?
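(If anyone wants to spot-check rows like that one locally, here's a hedged sketch using the Hugging Face `datasets` library. The data_files name is an assumption on my part; check the repo's file listing before running.)

```python
# Assumes the dolphin repo ships raw .jsonl files; the file name is a guess.
from datasets import load_dataset

ds = load_dataset(
    "cognitivecomputations/dolphin",
    data_files="flan1m-alpaca-uncensored.jsonl",  # assumed file name
    split="train",
)

row = ds[30]  # row index taken from the dataset-viewer URL above
for key, value in row.items():
    print(f"{key}: {str(value)[:200]}")  # truncate long fields
```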