shen

lyndons1

AI & ML interests

None yet

Recent Activity

upvoted a paper 10 days ago

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

upvoted a paper 13 days ago

Multiplayer Nash Preference Optimization

upvoted a paper 3 months ago

The Invisible Leash: Why RLVR May Not Escape Its Origin

Organizations

None yet

lyndons1's datasets (1)

lyndons1/SCI-CQA

Viewer • Updated Apr 28 • 8.25k • 16 • 1