---
base_model:
- Qwen/Qwen2.5-32B-Instruct
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
datasets:
- IlyaGusev/saiga_preferences
- 40umov/dostoevsky
- Vikhrmodels/gutenpromax
---

# Model Card for radm/Qwen2.5-32B-simpo-FP8

## Model Details

Improved quality on hard tasks by 25 percent relative to the base model Qwen2.5-32B-Instruct, along with improved multilingual support.

Fine-tuned on an A100 in 4-bit with Unsloth, using SimPO and a custom dataset.
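
The exact training recipe is not published; below is a minimal sketch of a comparable run, assuming recent Unsloth and TRL releases (TRL exposes SimPO through `CPOTrainer` with `loss_type="simpo"`). The hyperparameters, dataset split, and preference-column names are illustrative assumptions, not the values used for this model.

```python
# Illustrative sketch only: hyperparameters and dataset handling are
# assumptions, not the settings used to train this model.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import CPOConfig, CPOTrainer

# Load the base model in 4-bit, as described above.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-32B-Instruct",
    max_seq_length=4096,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# SimPO trains on preference pairs: "prompt", "chosen", "rejected".
dataset = load_dataset("IlyaGusev/saiga_preferences", split="train")

args = CPOConfig(
    output_dir="qwen2.5-32b-simpo",
    loss_type="simpo",           # reference-free, length-normalized reward
    cpo_alpha=0.0,               # drop the CPO NLL term for pure SimPO
    beta=2.0,
    simpo_gamma=0.5,             # target reward margin
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    bf16=True,
)
trainer = CPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,  # `tokenizer=` on older TRL releases
)
trainer.train()
```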

LoRA adapter: [radm/Qwen2.5-32B-simpo-LoRA](https://huggingface.co/radm/Qwen2.5-32B-simpo-LoRA)
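
A sketch of applying the published adapter to the base model with transformers + peft (precision, device placement, and the prompt are illustrative):

```python
# Sketch: load the base model and attach the published LoRA adapter.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-32B-Instruct",
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "radm/Qwen2.5-32B-simpo-LoRA")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-32B-Instruct")

messages = [{"role": "user", "content": "Briefly explain SimPO."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Alternatively, the FP8 weights in this repository can be loaded directly by an FP8-capable runtime such as vLLM, without attaching the adapter.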

### Eval results

Results on [ZebraLogic](https://github.com/WildEval/ZeroEval):

![image/png](https://huggingface.co/radm/Qwen2.5-32B-simpo-FP8/resolve/main/zebra-logic-bench-grouped.png)