LLM in the Loop: Creating the PARADEHATE Dataset for Hate Speech Detoxification • Paper • 2506.01484 • Published Jun 2 • 5
WebChoreArena: Evaluating Web Browsing Agents on Realistic Tedious Web Tasks • Paper • 2506.01952 • Published Jun 2 • 10
REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards • Paper • 2505.24760 • Published May 30 • 66