Memory-Efficient LLM Training with Online Subspace Descent • Paper 2408.12857 • Published Aug 23, 2024
Cautious Optimizers: Improving Training with One Line of Code • Paper 2411.16085 • Published Nov 25, 2024