VideoRoPE: What Makes for Good Video Rotary Position Embedding? Paper • 2502.05173 • Published Feb 7 • 66
Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models Paper • 2502.04404 • Published Feb 6 • 24
QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation Paper • 2502.05178 • Published Feb 7 • 10