Yet Another New Multimodal Fine-Tuning Recipe 🥧
🧑‍🍳 In this @HuggingFace Cookbook notebook, we demonstrate how to align a multimodal model (VLM) with Mixed Preference Optimization (MPO) using trl.
💡 This recipe is powered by the new MPO support in trl, enabled through a recent upgrade to the DPO trainer!
We align the multimodal model using multiple optimization objectives (losses), guided by a preference dataset (chosen vs. rejected multimodal pairs).
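To make "multiple optimization objectives" concrete, here is a rough, framework-free sketch of how several losses could be mixed into one training signal for a single chosen/rejected pair. This is not the trl implementation: the function names, the fixed `delta` shift, the `beta` value, and the example weights are all illustrative assumptions, so check the notebook for the actual configuration.

```python
import math


def neg_log_sigmoid(x):
    # -log(sigmoid(x)) written as log(1 + exp(-x))
    return math.log1p(math.exp(-x))


def preference_loss(chosen_logratio, rejected_logratio, beta=0.1):
    # DPO-style sigmoid loss: push the chosen answer above the rejected one.
    return neg_log_sigmoid(beta * (chosen_logratio - rejected_logratio))


def quality_loss(chosen_logratio, rejected_logratio, beta=0.1, delta=0.0):
    # Simplified BCO-style loss: score each answer on its own.
    # ASSUMPTION: `delta` is a fixed reward shift used only for illustration.
    chosen_term = neg_log_sigmoid(beta * chosen_logratio - delta)
    rejected_term = neg_log_sigmoid(-(beta * rejected_logratio - delta))
    return chosen_term + rejected_term


def generation_loss(chosen_token_logps):
    # SFT-style loss: mean negative log-likelihood of the chosen answer.
    return -sum(chosen_token_logps) / len(chosen_token_logps)


def mixed_loss(losses, weights):
    # Weighted sum of the individual objectives.
    return sum(w * l for w, l in zip(weights, losses))


# Hypothetical numbers for one preference pair.
losses = [
    preference_loss(chosen_logratio=1.5, rejected_logratio=-0.5),
    quality_loss(chosen_logratio=1.5, rejected_logratio=-0.5),
    generation_loss([-0.2, -0.1, -0.4]),
]
weights = [0.8, 0.2, 1.0]  # ASSUMPTION: illustrative weights only
total = mixed_loss(losses, weights)
print(f"mixed loss: {total:.4f}")
```

The point of the weighted sum is that each objective covers a different signal: relative preference between the two answers, the quality of each answer on its own, and plain likelihood of the preferred answer.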
Check it out! ➡️ https://huggingface.co/learn/cookbook/fine_tuning_vlm_mpo