Collections

Collections including paper arxiv:2506.20512
reasoning llm
Collection by
Oct 9, 2025
🐙 OctoThinker
Mid-training Incentivizes Reinforcement Learning Scaling
Psychology
Collection by
Jan 16
OctoThinker-Llama-8B Family
What makes a base language model suitable for RL? Through controlled experiments, we identify key factors then leverage them to scale up mid-training.
HF Daily
Collection by
Aug 5, 2025
VisionLM
Collection by
Jan 12