Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy • Paper • 2507.01352 • Published 27 days ago
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs • Paper • 2506.19290 • Published Jun 24
GUARD: Generation-time LLM Unlearning via Adaptive Restriction and Detection • Paper • 2505.13312 • Published May 19
Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage Policy Optimization • Paper • 2412.18279 • Published Dec 24, 2024
Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs • Paper • 2410.18451 • Published Oct 24, 2024
Large Language Model Unlearning via Embedding-Corrupted Prompts • Paper • 2406.07933 • Published Jun 12, 2024