Lyrebird

A creative writing model based on Mistral Nemo 12B to support co-writing and other related longform writing tasks.

Creator's Comments

This is pretty good, actually. Smarter than some other Nemos I've tried, and with decent samplers it's not very sloppy.

Working samplers: temp 1.25-1.5, min-p 0.02-0.05, rep pen 1.01, temperature applied first. Some prompts seem to want higher or lower temp than others. Lower temps bring out sloppy Mistral-isms; higher temps tap into the LoRA training a bit more.
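For illustration, here's a minimal sketch of those settings on a Hugging Face transformers backend (this assumes a recent transformers version with min-p support; sampler order such as "temperature first" is handled by your frontend/backend and isn't shown here, and the prompt is just an example):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allura-org/MN-Lyrebird-12B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# ChatML-style prompt, since that's the nominal template for this merge.
prompt = "<|im_start|>user\nWrite the opening paragraph of a gothic short story.<|im_end|>\n<|im_start|>assistant\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

out = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.3,         # middle of the suggested 1.25-1.5 range
    min_p=0.03,              # within the suggested 0.02-0.05 range
    repetition_penalty=1.01,
    max_new_tokens=300,
)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))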

The chat template is nominally ChatML, because of the base models used in the merge. However, the ChatML-Names preset in SillyTavern often gives better results; YMMV.
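For reference, the ChatML-Names style roughly amounts to prefixing each message with the speaker's name inside a standard ChatML turn. A rough sketch follows (the helper, names, and example text are illustrative; SillyTavern's actual preset may differ in the details):

def chatml_names_prompt(turns, bot_name):
    """Build a ChatML prompt with speaker names prefixed to each message."""
    parts = []
    for speaker, role, text in turns:
        parts.append(f"<|im_start|>{role}\n{speaker}: {text}<|im_end|>\n")
    # Open the assistant turn with the bot's name so the model continues in character.
    parts.append(f"<|im_start|>assistant\n{bot_name}:")
    return "".join(parts)

prompt = chatml_names_prompt(
    [("Mira", "user", "The rain hasn't let up since we left the coast.")],
    bot_name="Lyrebird",
)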

With ChatML-Names in particular, this is good at copying the style of what's already in the chat history. So if your chat history is sloppy, the output will likely be sloppy too (use XTC for a bit to break it up); if it isn't, this is less likely to introduce any extra. Start a conversation off with text from a good model (or, better yet, human-written text) and this should follow along easily.

It has the same pacing issues any Nemo model does when asked to compose a longform story from scratch via instruct, though it's better than some others. It seems good at dialogue (though it has a bias toward country and/or British-style English accents if unspecified), and it's good at 'reading between the lines' for its size as well.

I did not include any erotica or other NSFW data in the LoRA training parts of this; however, Mag-Mell contains Magnum (and Chronos, which is trained on top of a rejected Magnum), so the capability is there if you need it. It just might be a bit Claude-slop-y, as I haven't optimized that part for style.

Training

The two LoRAs used here were trained at 8k context (nemo-kimi-lora) and 32k context (nemo-books-lora). As you might guess, nemo-kimi-lora is trained on outputs from kimi-k2 (the dataset is public on my profile), and nemo-books-lora is trained on a bunch of books.
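For context, a LoRA of this kind would typically be set up with peft along these lines; the rank, alpha, dropout, target modules, and base checkpoint below are hypothetical placeholders, not the actual settings used for nemo-kimi-lora or nemo-books-lora:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "mistralai/Mistral-Nemo-Base-2407"  # placeholder base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16, device_map="auto")

# Hypothetical hyperparameters -- the real ones for these LoRAs aren't published here.
lora_cfg = LoraConfig(
    r=64,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()

# Training examples would then be packed or truncated to the context budget
# (8k for nemo-kimi-lora, 32k for nemo-books-lora) before fine-tuning.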

merged

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the Linear merge method.
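Conceptually, a linear merge just takes a weighted average of each parameter across the input models; mergekit additionally handles applying the LoRAs and selecting the tokenizer. A toy sketch of the idea, not mergekit's actual implementation:

import torch

def linear_merge(state_dicts, weights):
    """Weighted average of matching tensors from several model state dicts."""
    merged = {}
    for name in state_dicts[0]:
        acc = sum(w * sd[name].to(torch.float32) for w, sd in zip(weights, state_dicts))
        merged[name] = acc.to(torch.bfloat16)  # dtype: bfloat16, as in the config below
    return merged

# With the configuration below, both inputs get weight 0.5.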

Models Merged

The following models were included in the merge:

* inflatebot/MN-12B-Mag-Mell-R1 + ToastyPigeon/nemo-kimi-lora
* migtissera/Tess-3-Mistral-Nemo-12B + ToastyPigeon/nemo-books-lora

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: inflatebot/MN-12B-Mag-Mell-R1+ToastyPigeon/nemo-kimi-lora
    parameters:
      weight: 0.5
  - model: migtissera/Tess-3-Mistral-Nemo-12B+ToastyPigeon/nemo-books-lora
    parameters:
      weight: 0.5
merge_method: linear 
dtype: bfloat16
tokenizer_source: migtissera/Tess-3-Mistral-Nemo-12B