Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries • aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, nouamanetazi, lvwerra, sergiopaniego • Mar 10 • 147
SmolLM3: smol, multilingual, long-context reasoner • eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 773
Open R1: How to use OlympicCoder locally for coding • burtenshaw, reach-vb, lewtun, edbeeching, yagilb • Mar 20, 2025 • 63
How NuminaMath Won the 1st AIMO Progress Prize • yfleureau, liyongsea, edbeeching, lewtun, benlipkin, romansoletskyi, vwxyzjn, kashif • Jul 11, 2024 • 128
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent • qgallouedec, edbeeching, ClementRomac, thomwolf • Apr 22, 2024 • 81
Constitutional AI with Open LLMs • vwxyzjn, lewtun, edbeeching, lvwerra, osanseviero, kashif, thomwolf • Feb 1, 2024 • 17
Preference Tuning LLMs with Direct Preference Optimization Methods • kashif, edbeeching, lewtun, lvwerra, osanseviero • Jan 18, 2024 • 83
Can foundation models label data like humans? • nazneen, natolambert, sheonhan, wangjean, OsvaldN97, edbeeching, lewtun, slippylolo, thomwolf • Jun 12, 2023 • 1
Creating a Coding Assistant with StarCoder • lewtun, natolambert, nazneen, edbeeching, teven, sheonhan, philschmid, lvwerra, srush • May 9, 2023 • 2
StackLLaMA: A hands-on guide to train LLaMA with RLHF • edbeeching, kashif, ybelkada, lewtun, lvwerra, nazneen, natolambert • Apr 5, 2023 • 48
Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU • edbeeching, ybelkada, lvwerra, smangrul, lewtun, kashif • Mar 9, 2023 • 72
Train your first Decision Transformer • edbeeching, ThomasSimonini • Sep 8, 2022 • 15
Introducing Decision Transformers on Hugging Face 🤗 • edbeeching, ThomasSimonini • Mar 28, 2022 • 10