Paper: Editing Models with Task Arithmetic (arXiv:2212.04089)
This is a merge of pre-trained language models created using mergekit.
Again, Nemo 12B Base was used as the base for a task arithmetic merge, rather than 12B Instruct or a variant. A model fine-tuned with mostly Instruct data and some narrative generation data was added, along with an abliterated Instruct model, which curiously benchmarked higher than Nemo 12B Instruct itself.
This model was merged using the Task Arithmetic merge method, with grimjim/mistralai-Mistral-Nemo-Base-2407 as the base.
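As a rough sketch of what task arithmetic computes (the notation below is introduced here for illustration and does not appear in the original card): each fine-tuned model contributes a task vector, which is its difference from the base weights, and the merged weights are the base plus a weighted sum of those vectors. The division by the sum of the weights reflects our understanding of mergekit's `normalize: true` option and should be read as an assumption.

$$
\theta_{\text{merged}} = \theta_{\text{base}} + \frac{\sum_i w_i \,(\theta_i - \theta_{\text{base}})}{\sum_i w_i}
$$

Here $\theta_{\text{base}}$ denotes the base model's parameters, $\theta_i$ the parameters of the $i$-th fine-tuned model, and $w_i$ its merge weight.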
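The following models were included in the merge:
- grimjim/Magnolia-v3-12B
- Nitral-AI/Captain_BMO-12B
- natong19/Mistral-Nemo-Instruct-2407-abliterated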
The following YAML configuration was used to produce this model:
base_model: grimjim/mistralai-Mistral-Nemo-Base-2407
dtype: bfloat16
merge_method: task_arithmetic
parameters:
  normalize: true
models:
  - model: grimjim/mistralai-Mistral-Nemo-Base-2407
  - model: grimjim/Magnolia-v3-12B
    parameters:
      weight: 0.85
  - model: Nitral-AI/Captain_BMO-12B
    parameters:
      weight: 0.1
  - model: natong19/Mistral-Nemo-Instruct-2407-abliterated
    parameters:
      weight: 0.05
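
To make the effect of the weights above more concrete, here is a minimal sketch of task arithmetic on toy parameter lists, in plain Python. It is not mergekit's implementation: the function name `task_arithmetic`, the toy values, and the treatment of `normalize` (dividing the summed task vectors by the sum of the weights) are assumptions made for illustration. The three weights are the ones from the configuration.

```python
def task_arithmetic(base, finetuned, weights, normalize=True):
    """Merge flat parameter lists: base + weighted sum of (model - base).

    Hypothetical helper for illustration; not part of mergekit.
    """
    if len(finetuned) != len(weights):
        raise ValueError("need exactly one weight per fine-tuned model")
    total = sum(weights)
    merged = []
    for i, b in enumerate(base):
        # Weighted sum of the task vectors (model - base) at position i.
        delta = sum(w * (ft[i] - b) for ft, w in zip(finetuned, weights))
        # Assumed meaning of normalize: rescale by the sum of the weights.
        if normalize and total != 0:
            delta /= total
        merged.append(b + delta)
    return merged


# Toy stand-ins for the base model and the three fine-tuned models.
base = [1.0, 2.0]
models = [
    [2.0, 2.0],  # differs from base only in the first parameter
    [1.0, 4.0],  # differs from base only in the second parameter
    [1.0, 2.0],  # identical to base, so its task vector is zero
]
weights = [0.85, 0.1, 0.05]  # weights taken from the configuration

merged = task_arithmetic(base, models, weights)
print(merged)
```

Because the three weights sum to 1.0, normalization under this assumption leaves the result essentially unchanged. A model identical to the base has a zero task vector and so adds nothing to the merged parameters.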