Collections
Collections including paper arxiv:2404.01954
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 62
- HyperCLOVA X Technical Report
  Paper • 2404.01954 • Published • 26
- Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization
  Paper • 2404.09956 • Published • 12
- Learn Your Reference Model for Real Good Alignment
  Paper • 2404.09656 • Published • 88

- HyperCLOVA X Technical Report
  Paper • 2404.01954 • Published • 26
- UltraFeedback: Boosting Language Models with High-quality Feedback
  Paper • 2310.01377 • Published • 5
- AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback
  Paper • 2305.14387 • Published • 1
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
  Paper • 2402.03300 • Published • 125

- Yi: Open Foundation Models by 01.AI
  Paper • 2403.04652 • Published • 66
- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
  Paper • 2401.02954 • Published • 49
- Qwen Technical Report
  Paper • 2309.16609 • Published • 36
- Gemma: Open Models Based on Gemini Research and Technology
  Paper • 2403.08295 • Published • 50