Qwen3-4B-Instruct-2507 aligned with Direct Preference Optimization (DPO) on the argilla/ultrafeedback-binarized-preferences dataset.
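
For readers unfamiliar with DPO, the per-pair objective can be sketched as below. This is a minimal illustration of the standard DPO loss, not the training code used for this model. The function name `dpo_loss` and the default `beta=0.1` are assumptions made for the example; the card does not state the hyperparameters that were used. Inputs are summed log-probabilities of the chosen and rejected responses under the policy being trained and under the frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a single preference pair (illustrative sketch).

    beta is a hypothetical default; the model card does not report it.
    """
    # How much more (or less) likely each response is under the policy
    # than under the reference model, in log space.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin between chosen and rejected responses.
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and the reference agree, the margin is zero and the loss equals log 2. The loss falls as the policy raises the chosen response's likelihood relative to the rejected one.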
Model tree for MInAlA/Qwen3-4B-Instruct-2507-DPO-merged
Base model
Qwen/Qwen3-4B-Instruct-2507