Residual Energy-Based Models for Text Generation Paper • 2004.11714 • Published Apr 22, 2020 • 2
Energy-Based Diffusion Language Models for Text Generation Paper • 2410.21357 • Published Oct 28, 2024 • 3
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B Paper • 2511.06221 • Published 11 days ago • 108
Too Good to be Bad: On the Failure of LLMs to Role-Play Villains Paper • 2511.04962 • Published 13 days ago • 50
Hitchhiker's guide on Energy-Based Models: a comprehensive review on the relation with other generative models, sampling and statistical physics Paper • 2406.13661 • Published Jun 19, 2024 • 1
Quartet: Native FP4 Training Can Be Optimal for Large Language Models Paper • 2505.14669 • Published May 20 • 78
The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms Paper • 2511.04217 • Published 14 days ago • 15
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published 14 days ago • 194
Diffusion Language Models are Super Data Learners Paper • 2511.03276 • Published 15 days ago • 116