Article Transformers backend integration in SGLang By marcsun13 and 4 others • Jun 23 • 49
Article Finally, a Replacement for BERT: Introducing ModernBERT By bclavie and 14 others • Dec 19, 2024 • 671
Article No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL By toslali-ibm and 5 others • Jun 3 • 79
Article NVIDIA Cosmos Now Available On Hugging Face For Physical AI Reasoning By PranjaliJoshi and 1 other • May 19 • 25
Article Page-to-Video: Generate videos from webpages By burtenshaw • May 6 • 27
Article π0 and π0-FAST: Vision-Language-Action Models for General Robot Control By danaaubakirova and 3 others • Feb 4 • 167
Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • Jan 23 • 69
LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding Paper • 2410.17434 • Published Oct 22, 2024 • 30
Article Saving Memory Using Padding-Free Transformer Layers during Finetuning By mayank-mishra • Jun 11, 2024 • 18
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated Apr 30 • 305
Article Key Insights into the Law of Vision Representations in MLLMs By Borise • Sep 2, 2024 • 18
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance Paper • 2409.04593 • Published Sep 6, 2024 • 27
Vision Language Models Papers Collection Papers about vision-language models; the most important ones are at the top of the list. • 27 items • Updated Apr 30, 2024 • 38