ll922
/

Qwen2.5-0.5B-Instruct-Align-Anything-DPO

Model card Files Files and versions

Qwen2.5-0.5B-Instruct-Align-Anything-DPO / README.md

ll922's picture

Update README.md

85d95d6 verified 5 months ago

|

history blame contribute delete

352 Bytes

	---
	license: apache-2.0
	datasets:
	- PKU-Alignment/align-anything
	base_model:
	- Qwen/Qwen2.5-0.5B-Instruct
	---

	DPO training is performed using the [Align-Anything](https://github.com/PKU-Alignment/align-anything) framework, with the PKU-Alignment/align-anything text-to-text dataset.

	DPO training report: https://api.wandb.ai/links/nlp-amct/uifw66p5