Gated Associative Memory: A Parallel O(N) Architecture for Efficient Sequence Modeling • Paper 2509.00605 • Published 12 days ago
On the Expressiveness of Softmax Attention: A Recurrent Neural Network Perspective • Paper 2507.23632 • Published Jul 31
Cottention: Linear Transformers With Cosine Attention • Paper 2409.18747 • Published Sep 27, 2024