TokenButler: predicts token importance for all heads across the transformer from the first layer alone, enabling fine-grained token sparsity.
YASH AKHAURI
akhauriyash
AI & ML interests: None yet
Recent Activity
Updated a model 12 days ago: akhauriyash/DDR1_Q1.5B-GRPOFixReward
Published a model 12 days ago: akhauriyash/DDR1_Q1.5B-GRPOFixReward
Updated a model 12 days ago: akhauriyash/DeepSeek-R1-Distill-Qwen-1.5B-E2EGRPO-OpenR1_Math_SpecR_GRPO_Mini-MiniSet
Organizations: None yet