Models: 85

Active filters: reward-model

Huanghz/align2llava-7b-lora-question

Updated May 21 • 2

Huanghz/align2llava-7b-lora-answer

nvidia/Qwen-2.5-Nemotron-32B-Reward

Text Classification • 32B • Updated Jun 26 • 34 • 2

nvidia/Qwen-3-Nemotron-32B-Reward

Text Classification • 32B • Updated Jun 26 • 43 • 16

zhuohaoyu/RewardAnything-8B-v1

Text Generation • 8B • Updated Jun 5 • 1 • 3

mradermacher/RewardAnything-8B-v1-GGUF

8B • Updated Jul 11 • 86

WisdomShell/RewardAnything-8B-v1

Text Generation • 8B • Updated Jun 5 • 775 • 22

Skywork/Skywork-Reward-V2-Qwen3-8B

Text Classification • 8B • Updated Jul 6 • 7.52k • 17

Bifrost-AI/Qwen-3-Nemotron-32B-Reward-F16

Text Classification • 32B • Updated Jul 11 • 3

tensorblock/WisdomShell_RewardAnything-8B-v1-GGUF

Text Generation • 8B • Updated Jul 18 • 266

ulab-ai/sotopia-rl-qwen2.5-7B-rm

Feature Extraction • Updated Aug 7 • 1

ilgee/Binary-Think-RM-3B

3B • Updated 10 days ago • 7

gandhiraketla277/demo-lora-reward-model

Text Generation • Updated Aug 10 • 34

Schrieffer/Llama-SARM-4B

Reinforcement Learning • 5B • Updated 16 days ago • 49 • 1

ykorkmaz/rfm_no_failure

4B • Updated Aug 30 • 3

abraranwar/spur_metaworld

4B • Updated Aug 31

ykorkmaz/rfm_progress_only

4B • Updated Sep 1 • 4

kewu93/skywork-medarena-lora-v1

kewu93/skywork-medarena-lora-v2

Text Classification • Updated Sep 18

nabeelshan/rlhf-gpt2-pipeline

Text Generation • Updated Sep 24

Schrieffer/Llama-SARM-4B-PostSAEPretrain

5B • Updated 16 days ago • 52 • 1

dongboklee/gPRM-14B

Text Generation • Updated 27 days ago • 16 • 1

dongboklee/gPRM-14B-merged

Text Generation • 15B • Updated 27 days ago • 213 • 2

dongboklee/gORM-14B

Text Generation • Updated 27 days ago

dongboklee/gORM-14B-merged

Text Generation • 15B • Updated 27 days ago • 32 • 1

mradermacher/gPRM-14B-merged-GGUF

15B • Updated 29 days ago • 193 • 1

mradermacher/gORM-14B-merged-GGUF

15B • Updated 29 days ago • 168 • 1

dongboklee/dORM-14B

Text Classification • Updated 27 days ago • 88

dongboklee/dPRM-14B

Text Classification • Updated 27 days ago • 19

dongboklee/gORM-8B

Text Generation • Updated 27 days ago • 7