---
license: mit
datasets:
- Nickyang/ConciseR-Data
language:
- en
metrics:
- accuracy
base_model:
- Qwen/Qwen2.5-Math-7B
pipeline_tag: text-generation
---
<div align='center'>
<h2>Walk Before You Run! <br/>Concise LLM Reasoning via Reinforcement Learning</h2>

<!-- TODO: Paper, Models-->
[![Paper](https://img.shields.io/badge/paper-5f16a8?style=for-the-badge&logo=arxiv&logoColor=white)](https://arxiv.org/abs/2505.21178)
<a href="https://huggingface.co/collections/Nickyang/conciser-6827718942b90a6390db50c1" target="_blank"><img alt="Hugging Face"
src="https://img.shields.io/badge/HuggingFace-fcd022?style=for-the-badge&logo=huggingface&logoColor=000&labelColor"/></a>
</div>


## 🎉 News

- [2025/05/27] 🎉 We release [ConciseR-Zero-7B](https://huggingface.co/Nickyang/ConciseR-Zero-7B) and [ConciseR-Zero-7B-Preview](https://huggingface.co/Nickyang/ConciseR-Zero-7B-Preview).

## Usage

```python
import vllm


def apply_template(question: str) -> str:
    """Wrap a question in the prompt template used by the model."""
    return ("""<|startoftext|>A conversation between User and Assistant. The User asks a question, and the Assistant solves it. \
The Assistant first thinks about the reasoning process in the mind and then provides the User with the answer. \
The reasoning process is enclosed within <think> </think> and answer is enclosed within <answer> </answer> tags, respectively, \
i.e., <think> reasoning process here </think> <answer> answer here </answer>. \
Please reason step by step, and put your final answer within \\boxed{}.

User:
{query}

A:
""".replace("{query}", question))


model_name = "Nickyang/ConciseR-Zero-7B-Preview"

# Sample 32 completions per prompt, each capped at 3072 generated tokens.
sampling_params = vllm.SamplingParams(
    n=32,
    temperature=0.6,
    top_p=1.0,
    max_tokens=3072,
)

model = vllm.LLM(
    model_name,
    max_model_len=4096,
    dtype="bfloat16",
    enable_prefix_caching=True,
)

prompts = [
    "How many positive whole-number divisors does 196 have?"
]
prompts = list(map(apply_template, prompts))
outputs = model.generate(prompts, sampling_params)

# One RequestOutput per prompt; its .outputs list holds the n sampled completions.
for request_output in outputs:
    for completion in request_output.outputs:
        print(completion.text)
```
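
Because the prompt asks for the final answer inside `\boxed{}` and the sampler draws several completions per question, it can help to reduce the samples to a single answer. The following is a minimal sketch of one way to do that, not the authors' evaluation code: the helper names `extract_boxed` and `majority_answer` are hypothetical, and majority voting is an assumption about how the samples might be aggregated.

```python
from collections import Counter
from typing import List, Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1  # we are inside the opening brace of \boxed{
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found: everything collected is the answer.
                return "".join(chars).strip()
        chars.append(ch)
    return None  # braces never balanced, e.g. the completion was truncated


def majority_answer(completions: List[str]) -> Optional[str]:
    """Return the most frequent boxed answer across completions (hypothetical helper)."""
    answers = [a for a in map(extract_boxed, completions) if a is not None]
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


# Hand-written strings standing in for sampled completions (not real model output).
samples = ["... so there are \\boxed{9} divisors.", "\\boxed{9}", "\\boxed{8}", "no final answer"]
print(majority_answer(samples))
```

With the `outputs` returned by `model.generate`, the texts for the first prompt could be passed in as `majority_answer([c.text for c in outputs[0].outputs])`. Answers are compared as raw strings here, so equivalent forms such as `1/2` and `\frac{1}{2}` would be counted separately.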

## Citation

```bibtex
@misc{song2025conciser,
  title={Walk Before You Run! Concise LLM Reasoning via Reinforcement Learning},
  author={Mingyang Song and Mao Zheng},
  year={2025},
  eprint={2505.21178},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.21178},
}
```