corrected grammar
README.md CHANGED
@@ -35,7 +35,7 @@ Repository: **[Github code for SFT Fine-tuning on MathDial](https://github.com/e
Training input and output:
The model was fine-tuned on the **[MathDial dataset](https://huggingface.co/datasets/eth-nlped/mathdial-chat/viewer/default/train?views%5B%5D=train&row=0)**.
- Each training example consisted of a **Instruction**, **Student's Name**, **Math Word Problem and Solution
+ Each training example consisted of an **Instruction**, the **Student's Name**, the **Math Word Problem and Solution**, and the **student's initial approach** as input, followed by the **tutor’s step-by-step solution** as the target output.
To incorporate the whole conversation, a sliding window approach was used. Every input has the same format:
For each step in a conversation, the model input included **all previous turns** in the dialogue (sliding window), followed by the student’s next message. The model’s output was then the **next tutor response** from the dataset.
This approach ensures the model learns to generate responses that are context-aware.
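
The sliding-window construction described in the hunk above can be made concrete with a short sketch. This is a minimal illustration only: it assumes each conversation is a list of `{"role", "content"}` turns, and the function name `build_training_pairs` and the exact header layout are hypothetical, not taken from the linked repository or the dataset loader.

```python
# Hypothetical sketch of the sliding-window pair construction described above.
# Assumes each conversation is a list of {"role": "student"|"tutor", "content": ...}
# dicts; names and formatting below are illustrative, not from the actual repo.

def build_training_pairs(instruction, student_name, problem_and_solution,
                         initial_approach, turns):
    """Expand one MathDial-style conversation into (input, target) pairs."""
    header = (
        f"{instruction}\n"
        f"Student: {student_name}\n"
        f"Problem and solution: {problem_and_solution}\n"
        f"Student's initial approach: {initial_approach}\n"
    )
    pairs = []
    history = []
    for turn in turns:
        if turn["role"] == "tutor" and history:
            # Input = fixed header + all previous turns (the sliding window),
            # ending with the student's latest message; target = next tutor reply.
            context = "\n".join(f'{t["role"]}: {t["content"]}' for t in history)
            pairs.append((header + context, turn["content"]))
        history.append(turn)
    return pairs


if __name__ == "__main__":
    demo_turns = [
        {"role": "student", "content": "I got 12 apples, is that right?"},
        {"role": "tutor", "content": "Close! Check how many apples were given away."},
        {"role": "student", "content": "Oh, 3 were given away, so 9?"},
        {"role": "tutor", "content": "Exactly, 9 apples remain."},
    ]
    for inp, out in build_training_pairs(
        "You are a math tutor.", "Alex", "12 - 3 = 9", "Student answered 12.", demo_turns
    ):
        print("INPUT:\n", inp, "\nTARGET:\n", out, "\n---")
```

Under these assumptions, a conversation with n tutor turns yields up to n training pairs, each one ending with the student's most recent message and targeting the next tutor response, which matches the context-aware behaviour described above.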