We have just released argilla/Capybara-Preferences in collaboration with KAIST AI ( @JW17 , @nlee-208 ) and Hugging Face ( @lewtun )
A new synthetic preference dataset built using distilabel on top of the awesome LDJnr/Capybara from @LDJnr.
The current dataset combines the alternative completions already generated for argilla/distilabel-capybara-dpo-7k-binarized, and adds the remaining ones using the same approach!
Here are the key steps in how we built it:
- Duplicate removal, keeping each conversation except for its last assistant response, plus some slight pre-processing
- Generation of alternative completions for the existing conversations (last turn only) with: mlabonne/NeuralBeagle14-7B, argilla/notus-7b-v1, and teknium/OpenHermes-2.5-Mistral-7B
- Running UltraFeedback via GPT-4 to generate the critique, i.e. ratings and rationales, for the last assistant responses
- Finally, we selected the chosen and rejected responses based on their UltraFeedback score, and applied some slight post-processing!
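To make the first and last steps more concrete, here is a minimal plain-Python sketch. It is not the actual distilabel pipeline: the function names, the `role`/`content`/`text`/`rating` field names, and the selection rule (highest rating as chosen, lowest as rejected, ties skipped) are all assumptions made for illustration.

```python
from typing import Optional


def dedup_key(messages: list[dict]) -> tuple:
    """Key a conversation by every turn except the last assistant response."""
    has_final_answer = bool(messages) and messages[-1]["role"] == "assistant"
    context = messages[:-1] if has_final_answer else messages
    return tuple((m["role"], m["content"].strip()) for m in context)


def deduplicate(conversations: list[list[dict]]) -> list[list[dict]]:
    """Keep the first conversation seen for each distinct context."""
    seen = set()
    unique = []
    for messages in conversations:
        key = dedup_key(messages)
        if key not in seen:
            seen.add(key)
            unique.append(messages)
    return unique


def binarize(completions: list[dict]) -> Optional[dict]:
    """Pick a chosen/rejected pair from rated completions.

    Assumption: highest rating is chosen, lowest is rejected,
    and the example is skipped when no preference can be derived.
    """
    rated = [c for c in completions if c.get("rating") is not None]
    if len(rated) < 2:
        return None
    best = max(rated, key=lambda c: c["rating"])
    worst = min(rated, key=lambda c: c["rating"])
    if best["rating"] == worst["rating"]:
        return None  # all ratings tie, so there is no preference signal
    return {"chosen": best["text"], "rejected": worst["text"]}
```

Skipping ties is a design choice in this sketch: a pair with equal ratings carries no preference signal, so dropping it keeps the resulting dataset cleaner at the cost of a few examples.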
Sounds simple, right? Start building your own synthetic datasets with https://github.com/argilla-io/distilabel!