Instructions to use TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751")
model = AutoModelForCausalLM.from_pretrained("TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751")

Notebooks
Google Colab
Kaggle
Local Apps

vLLM

How to use TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751

SGLang

How to use TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751 with Docker Model Runner:
```
docker model run hf.co/TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751
```

Model Card

SFT for mathematical reasoning in our MathIF project.

Github Repository: https://github.com/TingchenFu/MathIF

Training Details

We base our experiments on the DeepScaler dataset, which contains approximately 40k math reasoning samples. We first distill long CoT reasoning traces from QwQ-32B, filtering out samples where QwQ-32B fails to generate a correct answer or the CoT exceeds 8192 tokens. This results in 18k high-quality examples.

The training is conducted using 16 NVIDIA H100 GPUs. For reinforcement learning, we adopt the GRPO framework and use verifiable outcome-based rewards. The model is trained with VeRL framework with most hyper-parameters following the default setting.

Evaluation

We use nucleus sampling (T=1.0, p=0.95) with a maximum generation length of 16,384 tokens for decoding and vLLM engine for efficient inference.

Citation

BibTeX:

@article{fu2025scaling,
  title={Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models},
  author={Fu, Tingchen and Gu, Jiawei and Li, Yafu and Qu, Xiaoye and Cheng, Yu},
  journal={arXiv preprint arXiv:2505.14810},
  year={2025}
}

Downloads last month: 3

Safetensors

Model size

2B params

Tensor type

F32

Model tree for TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751

Base model

Qwen/Qwen2.5-1.5B

Finetuned

Qwen/Qwen2.5-Math-1.5B

Finetuned

(202)

this model

Dataset used to train TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751

Paper for TingchenFu/sft_8k_qwen-2.5-math-1.5b_05021751

Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models

Paper • 2505.14810 • Published May 20, 2025 • 62