Lyrebird
A creative writing model based on Mistral Nemo 12B, built to support co-writing and related longform writing tasks.
Creator's Comments
This is pretty good, actually. Smarter than some other Nemo tunes I've tried, and with decent samplers it's not very sloppy.
Working samplers: temp 1.25-1.5, min-p 0.02-0.05, rep pen 1.01, temp first. Some prompts seem to need a higher or lower temp than others. Lower temps result in sloppy Mistral-isms; higher temps tap into the LoRA training a bit more.
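For reference, a minimal sketch of these settings with Hugging Face transformers (the repo id is a placeholder; "temp first" refers to sampler ordering in backends like SillyTavern/KoboldCpp and has no user-facing switch in transformers):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ToastyPigeon/Lyrebird"  # hypothetical repo id; substitute the actual one
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="bfloat16", device_map="auto"
)

inputs = tokenizer(
    "The lighthouse keeper had not spoken in years.", return_tensors="pt"
).to(model.device)
output = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.3,          # suggested range: 1.25-1.5
    min_p=0.03,               # suggested range: 0.02-0.05 (needs a recent transformers)
    repetition_penalty=1.01,
    max_new_tokens=300,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```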
The chat template is theoretically ChatML, inherited from the base models used in the merge. However, the ChatML-Names preset in SillyTavern often gives better results; YMMV.
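For anyone prompting outside SillyTavern, here's a minimal sketch of the plain ChatML layout this assumes (the system/user text is just an example; as I understand it, the ChatML-Names variant additionally includes each speaker's name in the message body):

```python
# Plain ChatML prompt layout; example content only.
prompt = (
    "<|im_start|>system\n"
    "You are a creative co-writer.<|im_end|>\n"
    "<|im_start|>user\n"
    "Continue the story from where we left off.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
```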
With ChatML-Names in particular, this is good at copying the style of what's already in the chat history. So if your chat history is sloppy, this likely will be too (use XTC for a bit to break it up). If your chat history isn't sloppy, this is less likely to introduce any extra slop. Start a conversation off with text from a good model (or, better yet, human-written text), and this should follow along easily.
Has the same pacing issues any Nemo model does when asked to compose a longform story from scratch via instruct, though it handles them better than some others. It seems good at dialogue (though it has a bias towards country and/or British-style English accents when none is specified) and is good at 'reading between the lines' for its size as well.
I did not include any erotica or other NSFW data in the LoRA training for this model; however, Mag-Mell contains Magnum (and Chronos, which is trained on top of a rejected Magnum), so the capability is there if you need it. It just might be a bit Claude-slop-y, as I haven't optimized that part for style.
Training
The two LoRAs in this merge were trained at 8k (nemo-kimi-lora) and 32k (nemo-books-lora) context. As the names suggest, nemo-kimi-lora is trained on outputs from kimi-k2 (the dataset is public on my profile), and nemo-books-lora is trained on a collection of books.
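If you want to experiment with either adapter on its own, a minimal sketch with peft, using the base/adapter pairing from the merge configuration below:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Base/adapter pairing taken from the merge config in the Configuration section.
base = AutoModelForCausalLM.from_pretrained(
    "inflatebot/MN-12B-Mag-Mell-R1", torch_dtype="bfloat16", device_map="auto"
)
model = PeftModel.from_pretrained(base, "ToastyPigeon/nemo-kimi-lora")
```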
This is a merge of pre-trained language models created using mergekit.
Merge Details
Merge Method
This model was merged using the Linear merge method, which takes a weighted average of the models' parameters; with both weights at 0.5 below, this is a straight 50/50 average.
Models Merged
The following models were included in the merge:
- inflatebot/MN-12B-Mag-Mell-R1 + ToastyPigeon/nemo-kimi-lora
- migtissera/Tess-3-Mistral-Nemo-12B + ToastyPigeon/nemo-books-lora
Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: inflatebot/MN-12B-Mag-Mell-R1+ToastyPigeon/nemo-kimi-lora
    parameters:
      weight: 0.5
  - model: migtissera/Tess-3-Mistral-Nemo-12B+ToastyPigeon/nemo-books-lora
    parameters:
      weight: 0.5
merge_method: linear
dtype: bfloat16
tokenizer_source: migtissera/Tess-3-Mistral-Nemo-12B
```
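To reproduce the merge, this config should work with mergekit's CLI, e.g. `mergekit-yaml config.yml ./output-dir` (check the mergekit README for current options).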