Long-Context Attention Benchmark: From Kernel Efficiency to Distributed Context Parallelism • Paper • 2510.17896 • Published Oct 19 • 4
Adamas: Hadamard Sparse Attention for Efficient Long-Context Inference • Paper • 2510.18413 • Published Oct 21 • 4
MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models • Paper • 2405.13053 • Published May 19, 2024 • 1
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey • Paper • 2311.12351 • Published Nov 21, 2023 • 5
Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines • Paper • 2410.07896 • Published Oct 10, 2024 • 2