Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning Paper • 2507.17512 • Published 8 days ago • 33
Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning Paper • 2507.17512 • Published 8 days ago • 33
InverTune: Removing Backdoors from Multimodal Contrastive Learning Models via Trigger Inversion and Activation Tuning Paper • 2506.12411 • Published Jun 14
Open-RS Collection Model weights & datasets in the paper "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn’t" • 8 items • Updated Mar 21 • 12
CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenges Paper • 2504.19093 • Published Apr 27 • 17