nvidia/Llama-3.1-Nemotron-70B-Instruct-HF Text Generation • 71B • Updated Apr 13, 2025 • 4.38k • • 2.06k
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU Paper • 2312.12456 • Published Dec 16, 2023 • 44
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 148 • 12
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 148