view article Article You could have designed state of the art positional encoding By FL33TW00D-HF • Nov 25, 2024 • 327
SWE-Flow: Synthesizing Software Engineering Data in a Test-Driven Manner Paper • 2506.09003 • Published Jun 10 • 19
OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis Paper • 2501.04561 • Published Jan 8 • 16
CLaSp: In-Context Layer Skip for Self-Speculative Decoding Paper • 2505.24196 • Published May 30 • 13
CLaSp: In-Context Layer Skip for Self-Speculative Decoding Paper • 2505.24196 • Published May 30 • 13
CLaSp: In-Context Layer Skip for Self-Speculative Decoding Paper • 2505.24196 • Published May 30 • 13 • 6
Think on your Feet: Adaptive Thinking via Reinforcement Learning for Social Agents Paper • 2505.02156 • Published May 4 • 18
OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis Paper • 2501.04561 • Published Jan 8 • 16
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey Paper • 2412.18619 • Published Dec 16, 2024 • 59
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks Paper • 2412.15204 • Published Dec 19, 2024 • 38
Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models Paper • 2409.18943 • Published Sep 27, 2024 • 30
view article Article A failed experiment: Infini-Attention, and why we should keep trying? By neuralink and 2 others • Aug 14, 2024 • 68
MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct Paper • 2409.05840 • Published Sep 9, 2024 • 49
MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct Paper • 2409.05840 • Published Sep 9, 2024 • 49
Long Context is Not Long at All: A Prospector of Long-Dependency Data for Large Language Models Paper • 2405.17915 • Published May 28, 2024 • 2
DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception Paper • 2405.15232 • Published May 24, 2024 • 3
E^2-LLM: Efficient and Extreme Length Extension of Large Language Models Paper • 2401.06951 • Published Jan 13, 2024 • 27