
MathDial-SFT Model

Overview

This model is a supervised fine-tuned (SFT) language model trained on the MathDial dataset. MathDial is a dataset of conversational math tutoring dialogues, in which a tutor guides a student step by step through solving math word problems.

The model is optimized for:

  • Conversational math problem solving
  • Step-by-step reasoning in dialogue form
  • Scaffolding student reasoning with guiding questions rather than giving away the answer

Repository: GitHub code for SFT fine-tuning on MathDial


Training Details

  • Base model: Qwen/Qwen2.5-1.5B-Instruct
  • Fine-tuning method: Supervised fine-tuning (SFT)
  • Training framework: Hugging Face transformers + trl
  • Epochs: 3
  • Batch size: 8
  • Learning rate: 6.25e-5
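With these hyperparameters, a trl-based run might be configured roughly as follows. This is a sketch under assumptions, not the exact training script: dataset loading and preprocessing are elided, and `train_dataset` is a hypothetical variable holding the preprocessed MathDial examples.

```python
# Sketch of an SFT run with the hyperparameters listed above.
# Assumes `train_dataset` already holds chat-formatted MathDial examples;
# this is not the authors' exact training script.
from trl import SFTConfig, SFTTrainer

config = SFTConfig(
    output_dir="mathdial-sft",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=6.25e-5,
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-1.5B-Instruct",  # base model from the list above
    args=config,
    train_dataset=train_dataset,
)
trainer.train()
```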

Training input and output: The model was fine-tuned on the MathDial dataset. Each training example's input consisted of an instruction, the student's name, the math word problem with its correct solution, and the student's initial (incorrect) approach; the target output was the tutor's step-by-step response.
To incorporate the whole conversation, a sliding-window approach was used, so every input has the same format: for each step in a conversation, the model input included all previous turns in the dialogue, followed by the student's next message. The model's target output was then the next tutor response from the dataset. This ensures the model learns to generate context-aware responses.
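The sliding-window expansion described above can be sketched as follows. This is a minimal illustration of the idea, not the authors' exact preprocessing code: each tutor (assistant) turn becomes one training example whose input is every preceding turn.

```python
# Minimal sketch of the sliding-window expansion: one training example per
# tutor turn, with all preceding turns as input context.
def expand_sliding_window(conversation):
    """conversation: list of {"role": ..., "content": ...} dicts starting with
    the system prompt. Returns (input_turns, target_response) pairs, one per
    tutor ("assistant") turn."""
    examples = []
    for i, turn in enumerate(conversation):
        if turn["role"] == "assistant":
            examples.append((conversation[:i], turn["content"]))
    return examples

# Abbreviated dialogue in the MathDial format (see Example Usage below).
dialogue = [
    {"role": "system", "content": "You are a friendly and supportive teacher. ..."},
    {"role": "user", "content": "x + 5 = 9, so x = 4."},
    {"role": "assistant", "content": "Please talk me through your solution."},
    {"role": "user", "content": "I subtracted 3 first, then set up the equation."},
    {"role": "assistant", "content": "Good start. How many spoons did she start with?"},
]

pairs = expand_sliding_window(dialogue)
# Two examples: the first tutor turn sees 2 prior turns, the second sees 4.
```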


Intended Use

This model is intended for use in:

  • Interactive math tutoring
  • Research in dialogue-based problem solving
  • Educational tools

Example Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "eth-nlped/MathDial-SFT-Qwen2.5-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# The model was trained on conversations with the following structure:
# - a system prompt containing the student's name (here "Mariana"), a math
#   word problem with its correct solution, and the student's incorrect solution;
# - a tutor (assistant) turn asking the student (user) to explain their solution;
# - the student's explanation.
# The conversation can be extended by appending further tutor and student turns.
# For more conversations, see the MathDial dataset linked above.
messages = [
    {"content": "You are a friendly and supportive teacher.\nThe student, with the name Mariana, is trying to solve the following problem: Julia was preparing for a dinner party at her house, where she intended to serve stew.  She noticed that she was out of plastic spoons, so she bought a new package of spoons.  Later, her husband also bought a package of 5 new spoons and gave them to Julia.  While Julia was making the stew, she used three of the spoons to sample her stew.  Later, when she went to set the table, she had a total of 12 spoons.  How many spoons were in the package that Julia bought?.\n\nThe correct solution is as follows:\nThe total number of spoons from Julia and her husband was 12+3=15 spoons.\nSince the husband bought a package of five spoons, then Julia's package contained 15-5=10 spoons.\n 10\n","role": "system",},
    {"content": "Let's call the number of spoons Julia bought \"x\". \nHer husband bought 5 more spoons, so the total number of spoons is now x + 5. \nJulia used 3 spoons to sample her stew, so she had 12 - 3 = 9 spoons left. \nWe know that the total number of spoons is x + 5, so we can set up an equation: \n\nx + 5 = 9 \n\nSubtracting 5 from both sides: \n\nx = 4 \n\nSo Julia bought a package of 4 spoons. \n 4","role": "user",},
    {"content": "Hi Mariana, please talk me through your solution","role": "assistant",},
    {"content": "Sure. I started by letting x be the number of spoons Julia bought. Then I added 5 to x to get the total number of spoons. Next, I subtracted 3 from the total number of spoons to get the number of spoons left. Finally, I set up an equation and solved for x, which was 4. So Julia bought a package of 4 spoons.","role": "user",},
]
# Apply the chat template and generate the tutor's next response
chat_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(chat_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
#Example output: excellent start. lets work from the top. if we know she has 12 spoons left, and already used 3. how many did she start with?

