Update README.md
From there, each of the four threads was separately task-tuned on 2 datasets each.
Various methods of combining those via merge were tested, with this one scoring highest on EQ-Bench as an indicator.
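For reference, a merge of this shape could be expressed in a mergekit config roughly like the sketch below. This is a hypothetical illustration only — the model names are placeholders, not the actual ancestors or task-tuned threads used here.

```yaml
# Hypothetical mergekit config sketch for a Model Stock merge.
# All model names below are placeholders, not the real inputs.
merge_method: model_stock
base_model: example-org/base-model
models:
  - model: example-org/task-tuned-a
  - model: example-org/task-tuned-b
  - model: example-org/task-tuned-c
  - model: example-org/task-tuned-d
dtype: bfloat16
```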
My understanding of the Model Stock merge method is that it reduces task adaptation to a significant degree, but also significantly limits forgetting caused by training.
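That trade-off follows from how the method picks its interpolation ratio: the merged weights sit between the average of the fine-tuned models and the pretrained base, pulled toward the base when the fine-tuning deltas disagree. A toy numpy sketch of that geometry, under my reading of the Model Stock paper (not mergekit's actual implementation):

```python
import numpy as np

def model_stock_layer(w_base, w_finetuned):
    """Toy sketch of the Model Stock idea for a single weight tensor.

    w_base: pretrained weights; w_finetuned: list of k task-tuned variants.
    """
    k = len(w_finetuned)
    deltas = [w - w_base for w in w_finetuned]
    # Average pairwise cosine similarity between the fine-tuning deltas.
    cos = np.mean([
        np.dot(a.ravel(), b.ravel()) / (np.linalg.norm(a) * np.linalg.norm(b))
        for i, a in enumerate(deltas) for b in deltas[i + 1:]
    ])
    # Interpolation ratio as given in the paper: t = k*cos / (1 + (k-1)*cos).
    t = k * cos / (1 + (k - 1) * cos)
    w_avg = np.mean(w_finetuned, axis=0)
    # Dissimilar fine-tunes give small t, keeping the result near the base
    # (limiting forgetting); similar fine-tunes give large t, keeping more
    # of the shared task adaptation.
    return t * w_avg + (1 - t) * w_base
```

With orthogonal fine-tuning deltas the ratio collapses toward the base weights; with identical deltas it keeps the full adaptation — which is why the merge both dampens per-task tuning and protects base capability.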
I hope that the adaptation, especially over two stages, is still sufficient to aid the longer-context and multi-turn conversational abilities inherited from the ancestor models, and to add some individual style while retaining a fair amount of their capability.

This model's refusals are... not nonexistent, but certainly don't rely on them.