sarosavo/Master-RM
Text Classification
Transformers
Safetensors
virtuoussy/Multi-subject-RLVR
sarosavo/Master-RM
13 languages
qwen2
text-generation
text-generation-inference
arxiv: 2507.08794
License: apache-2.0
main
Master-RM/reward_server
12.2 kB
2 contributors
History: 2 commits
sarosavo  Upload RLVR_train.sh  a4e1f57 (verified)  3 months ago
File              Scan  Size       Last commit                                      Updated
RLVR_train.sh     Safe  1.58 kB    Upload RLVR_train.sh                             3 months ago
launch_reward.sh  Safe  796 Bytes  upload training script and reward server script  3 months ago
model_server.py   Safe  9.86 kB    upload training script and reward server script  3 months ago