Cost-Optimal GQA Models
chen-yingfa/cogqa-3m • Updated Sep 14, 2025
chen-yingfa/cogqa-19m • Updated Sep 14, 2025
Long Context
Cost-Optimal Grouped-Query Attention for Long-Context LLMs • Paper 2503.09579 • Published Mar 12, 2025 • 5
MLP
BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity • Paper 2507.08771 • Published Jul 11, 2025 • 9
Sparsing Law: Towards Large Language Models with Greater Activation Sparsity • Paper 2411.02335 • Published Nov 4, 2024 • 11
RNN
Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling • Paper 2410.07145 • Published Oct 9, 2024 • 2