SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models Paper • 2503.07605 • Published Mar 10, 2025 • 69
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published Feb 13, 2025 • 195
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning Paper • 2410.14208 • Published Oct 18, 2024 • 3
Teaching Models to Balance Resisting and Accepting Persuasion Paper • 2410.14596 • Published Oct 18, 2024 • 3
How Do Training Methods Influence the Utilization of Vision Models? Paper • 2410.14470 • Published Oct 18, 2024 • 5
Context is Key(NMF): Modelling Topical Information Dynamics in Chinese Diaspora Media Paper • 2410.12791 • Published Oct 16, 2024 • 5
A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement Paper • 2410.13828 • Published Oct 17, 2024 • 4
SHAKTI: A 2.5 Billion Parameter Small Language Model Optimized for Edge AI and Low-Resource Environments Paper • 2410.11331 • Published Oct 15, 2024 • 8
BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities Paper • 2410.14672 • Published Oct 18, 2024 • 8
Looking Inward: Language Models Can Learn About Themselves by Introspection Paper • 2410.13787 • Published Oct 17, 2024 • 8
Are AI Detectors Good Enough? A Survey on Quality of Datasets With Machine-Generated Texts Paper • 2410.14677 • Published Oct 18, 2024 • 12
DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation Paper • 2410.13726 • Published Oct 17, 2024 • 12
Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion Paper • 2410.13674 • Published Oct 17, 2024 • 17
Mini-Omni2: Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities Paper • 2410.11190 • Published Oct 15, 2024 • 22
DPLM-2: A Multimodal Diffusion Protein Language Model Paper • 2410.13782 • Published Oct 17, 2024 • 22
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer Paper • 2410.10812 • Published Oct 14, 2024 • 18
MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models Paper • 2410.13370 • Published Oct 17, 2024 • 38
SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs Paper • 2410.13276 • Published Oct 17, 2024 • 30