The Invisible Leash: Why RLVR May Not Escape Its Origin • Paper • 2507.14843 • Published 10 days ago • 81
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models • Paper • 2505.24864 • Published May 30 • 133
StyleRemix: Interpretable Authorship Obfuscation via Distillation and Perturbation of Style Elements • Paper • 2408.15666 • Published Aug 28, 2024 • 11
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models • Paper • 2406.18510 • Published Jun 26, 2024 • 9
Localized Symbolic Knowledge Distillation for Visual Commonsense Models • Paper • 2312.04837 • Published Dec 8, 2023 • 3
The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning • Paper • 2312.01552 • Published Dec 4, 2023 • 33
Tailoring Self-Rationalizers with Multi-Reward Distillation • Paper • 2311.02805 • Published Nov 6, 2023 • 7
The Generative AI Paradox: "What It Can Create, It May Not Understand" • Paper • 2311.00059 • Published Oct 31, 2023 • 20