Improving Recursive Transformers with Mixture of LoRAs Paper • 2512.12880 • Published 10 days ago • 4
DCAgent2/claude-4-5-sonnet-thinking-stackexchange-overflow-32ep-32k-traces Viewer • Updated 17 days ago • 3.77k • 62 • 1
SPICE: Self-Play In Corpus Environments Improves Reasoning Paper • 2510.24684 • Published Oct 28 • 17
SonicMoE: Accelerating MoE with IO and Tile-aware Optimizations Paper • 2512.14080 • Published 9 days ago • 5
Trainable Log-linear Sparse Attention for Efficient Diffusion Transformers Paper • 2512.16615 • Published 6 days ago • 4