LightBagel: A Light-weighted, Double Fusion Framework for Unified Multimodal Understanding and Generation Paper • 2510.22946 • Published 8 days ago • 16
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published 21 days ago • 169
LongLive: Real-time Interactive Long Video Generation Paper • 2509.22622 • Published Sep 26 • 179
Emerging Properties in Unified Multimodal Pretraining Paper • 2505.14683 • Published May 20 • 134
TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video Paper • 2411.18671 • Published Nov 27, 2024 • 20
lmms-lab/llava-next-interleave-qwen-0.5b Text Generation • 0.9B • Updated Jul 12, 2024 • 350 • 12
lmms-lab/llava-next-interleave-qwen-7b-dpo Text Generation • 8B • Updated Jul 12, 2024 • 127 • 12
lmms-lab/llava-next-interleave-qwen-7b Text Generation • 8B • Updated Jul 24, 2024 • 123 • 27
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models Paper • 2407.07895 • Published Jul 10, 2024 • 42