We propose Self-Denoising Monte Carlo Annotation (SCAN), an efficient Process Reward Model (PRM) data synthesis and noise-tolerant learning framework.
Ding
dyyyyyyyy
Recent Activity
updated a model about 1 month ago: dyyyyyyyy/Qwen2.5-1.5B-GenRM-WithTemplate
published a model about 1 month ago: dyyyyyyyy/Qwen2.5-1.5B-GenRM-WithTemplate
new activity about 1 month ago: dyyyyyyyy/Qwen2.5-1.5B-GenRM-QueryOnly: Possible issue with the new tokenizer config chat template