Lost in Backpropagation: The LM Head is a Gradient Bottleneck • Paper • 2603.10145 • Published 12 days ago • 11
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens • 200 • Explore synthetic data experiments as an interactive bookshelf
AI Paper of the Day • Collection • A collection of papers that I think are interesting, one added each day • 623 items • Updated 14 minutes ago • 92