T2AV-Compass: Towards Unified Evaluation for Text-to-Audio-Video Generation Paper • 2512.21094 • Published 9 days ago • 24
Inverse IFEval: Can LLMs Unlearn Stubborn Training Conventions to Follow Real Instructions? Paper • 2509.04292 • Published Sep 4, 2025 • 57
Flow-GRPO: Training Flow Matching Models via Online RL Paper • 2505.05470 • Published May 8, 2025 • 86
IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs Paper • 2504.15415 • Published Apr 21, 2025 • 23
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model Paper • 2504.10068 • Published Apr 14, 2025 • 30
COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values Paper • 2504.05535 • Published Apr 7, 2025 • 44
YuE: Scaling Open Foundation Models for Long-Form Music Generation Paper • 2503.08638 • Published Mar 11, 2025 • 71
CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models Paper • 2502.16614 • Published Feb 23, 2025 • 27
Distillation Quantification for Large Language Models Paper • 2501.12619 • Published Jan 22, 2025 • 1
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs Paper • 2502.12982 • Published Feb 18, 2025 • 19
SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models Paper • 2502.13059 • Published Feb 18, 2025
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published Feb 20, 2025 • 106
xCoT: Cross-lingual Instruction Tuning for Cross-lingual Chain-of-Thought Reasoning Paper • 2401.07037 • Published Jan 13, 2024 • 2
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! Paper • 2402.12343 • Published Feb 19, 2024
m3P: Towards Multimodal Multilingual Translation with Multimodal Prompt Paper • 2403.17556 • Published Mar 26, 2024 • 1
The Fine Line: Navigating Large Language Model Pretraining with Down-streaming Capability Analysis Paper • 2404.01204 • Published Apr 1, 2024
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model Paper • 2404.04167 • Published Apr 5, 2024 • 13
MuPT: A Generative Symbolic Music Pretrained Transformer Paper • 2404.06393 • Published Apr 9, 2024 • 16
R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models Paper • 2406.01359 • Published Jun 3, 2024