CLAP: Contrastive Language-Audio Pretraining Collection CLAP is to audio what CLIP is to image. • 5 items • Updated Oct 31, 2023 • 12
System-1.5 Reasoning: Traversal in Language and Latent Spaces with Dynamic Shortcuts Paper • 2505.18962 • Published May 25 • 12
GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks Paper • 2504.12764 • Published Apr 17 • 41
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers Paper • 2505.21497 • Published May 27 • 106
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published Mar 31 • 300
Running 2.85k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published Dec 6, 2024 • 161
AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding Paper • 2502.01341 • Published Feb 3 • 39
Wizard Models Collection Replica of the official repository for research purposes • 6 items • Updated Jun 20, 2024 • 1
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference Paper • 2407.14057 • Published Jul 19, 2024 • 46
Running on CPU Upgrade 13.4k Open LLM Leaderboard 🏆 Track, rank and evaluate open LLMs and chatbots
VCR: Visual Caption Restoration Collection All configurations for VCR: Visual Caption Restoration (arXiv:2406.06462). • 8 items • Updated Jul 31, 2024 • 2
VCR: Visual Caption Restoration (Smaller Test Subsets) Collection This space contains smaller test subsets (first 100 / first 500) of all VCR-Wiki configurations. • 8 items • Updated Jun 11, 2024 • 2
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 820