The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs Paper • 2507.11097 • Published 16 days ago • 62
SiMilarity-Enhanced Homophily for Multi-View Heterophilous Graph Clustering Paper • 2410.03596 • Published Oct 4, 2024
TACTIC: Translation Agents with Cognitive-Theoretic Interactive Collaboration Paper • 2506.08403 • Published Jun 10
EfficientVLA: Training-Free Acceleration and Compression for Vision-Language-Action Models Paper • 2506.10100 • Published Jun 11 • 10
Shifting AI Efficiency From Model-Centric to Data-Centric Compression Paper • 2505.19147 • Published May 25 • 146
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation Paper • 2503.14905 • Published Mar 19 • 20
Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More Paper • 2502.11494 • Published Feb 17
LEGION: Learning to Ground and Explain for Synthetic Image Detection Paper • 2503.15264 • Published Mar 19 • 21
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation Paper • 2412.02592 • Published Dec 3, 2024 • 24
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation? Paper • 2407.04842 • Published Jul 5, 2024 • 57