RLVER/PPO-non-thinking
Safetensors
qwen2
arxiv: 2507.03112
License: license
Branch: main
1 contributor
History: 5 commits
Latest commit: "Update README.md" by RLVER (8c1aa2f, verified), 23 days ago
| File | Size | Scan | Storage | Last commit | Updated |
|---|---|---|---|---|---|
| .gitattributes | 1.52 kB | Safe | | initial commit | 28 days ago |
| LICENSE | 1.34 kB | Safe | | Update LICENSE | 28 days ago |
| README.md | 142 Bytes | | | Update README.md | 23 days ago |
| config.json | 776 Bytes | Safe | | Upload folder using huggingface_hub | 28 days ago |
| generation_config.json | 121 Bytes | Safe | | Upload folder using huggingface_hub | 28 days ago |
| model-00001-of-00004.safetensors | 4.88 GB | | xet | Upload folder using huggingface_hub | 28 days ago |
| model-00002-of-00004.safetensors | 4.03 GB | | xet | Upload folder using huggingface_hub | 28 days ago |
| model-00003-of-00004.safetensors | 4.97 GB | | xet | Upload folder using huggingface_hub | 28 days ago |
| model-00004-of-00004.safetensors | 1.35 GB | | xet | Upload folder using huggingface_hub | 28 days ago |
| model.safetensors.index.json | 27.8 kB | | | Upload folder using huggingface_hub | 28 days ago |
| tokenizer.json | 7.03 MB | Safe | | Upload folder using huggingface_hub | 28 days ago |
| tokenizer_config.json | 7.31 kB | Safe | | Upload folder using huggingface_hub | 28 days ago |
| vocab.json | 2.78 MB | Safe | | Upload folder using huggingface_hub | 28 days ago |