MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation Paper • 2406.07529 • Published Jun 11, 2024
Scaling Latent Reasoning via Looped Language Models Paper • 2510.25741 • Published Oct 29, 2025