A model made with curated synthetic data and then KTO'd on a small curated set. Total time to train was 4 H100 hours. I quite like the results this gave despite the dataset sizes involved. Its also a lot cheaper to iterate. I plan to hand review the human data I've been using and slowly work that back into the datamix. Additionally, planning to make a focused instruction following KTO set to improve system prompt adherance and steerability.
Use chatML and minP.
- Downloads last month
- 17
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
๐
Ask for provider support