Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling Paper • 2501.11651 • Published Jan 20 • 1
SWE-Dev: Building Software Engineering Agents with Training and Inference Scaling Paper • 2506.07636 • Published Jun 9 • 1
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published 28 days ago • 202
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline Paper • 2404.02893 • Published Apr 3, 2024 • 23
ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools Paper • 2406.12793 • Published Jun 18, 2024 • 34
LongReward: Improving Long-context Large Language Models with AI Feedback Paper • 2410.21252 • Published Oct 28, 2024 • 18
Does RLHF Scale? Exploring the Impacts From Data, Model, and Method Paper • 2412.06000 • Published Dec 8, 2024
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated 8 days ago • 629