Models from the paper "LaSeR: Reinforcement Learning with Last-Token Self-Rewarding"
Wenkai Yang
Keven16
AI & ML interests
None yet
Recent Activity
upvoted a paper 1 day ago
NVIDIA Nemotron 3: Efficient and Open Intelligence
upvoted a paper 1 day ago
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times
Organizations
None yet