Update README.md

README.md (changed):

We train all models using LoRA with the PEFT library. The main parameters are:
| optim | paged_adamw_32bit |
| lr_scheduler_type | constant |

Please check Appendix B of the paper for more details.
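
For orientation, here is a minimal sketch of how these parameters slot into a PEFT + `transformers` training setup. Only `optim` and `lr_scheduler_type` come from the table above; the base model name, LoRA rank, and alpha are illustrative placeholders, not the paper's actual configuration (see Appendix B for that).

```python
# Minimal sketch: wiring the table's optimizer settings into a PEFT LoRA run.
# NOTE: the model name, r, and lora_alpha below are placeholders, not the paper's values.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder base model

peft_config = LoraConfig(
    r=16,                   # placeholder LoRA rank
    lora_alpha=32,          # placeholder scaling factor
    task_type="CAUSAL_LM",  # causal language modeling adapter
)
model = get_peft_model(model, peft_config)

training_args = TrainingArguments(
    output_dir="./lora-out",
    optim="paged_adamw_32bit",     # from the parameter table (requires bitsandbytes)
    lr_scheduler_type="constant",  # from the parameter table
)
# training_args would then be passed to a transformers.Trainer along with the dataset.
```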

# Cite

If you find our work useful, please consider citing it:

```
@misc{puerto2024dcot,
      title={Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models},
      author={Haritz Puerto and Tilek Chubakov and Xiaodan Zhu and Harish Tayyar Madabushi and Iryna Gurevych},
      year={2024},
      eprint={2407.03181},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2407.03181},
}
```