Moshi: a speech-text foundation model for real-time dialogue Paper • 2410.00037 • Published Sep 17, 2024 • 6
Reinforced Self-Training (ReST) for Language Modeling Paper • 2308.08998 • Published Aug 17, 2023 • 3
Efficiently Modeling Long Sequences with Structured State Spaces Paper • 2111.00396 • Published Oct 31, 2021 • 3
Hyena Hierarchy: Towards Larger Convolutional Language Models Paper • 2302.10866 • Published Feb 21, 2023 • 7 • 4
Retentive Network: A Successor to Transformer for Large Language Models Paper • 2307.08621 • Published Jul 17, 2023 • 172 • 34
Improving language models by retrieving from trillions of tokens Paper • 2112.04426 • Published Dec 8, 2021 • 1
Unlimiformer: Long-Range Transformers with Unlimited Length Input Paper • 2305.01625 • Published May 2, 2023 • 6 • 4