VersatileFFN: Achieving Parameter Efficiency in LLMs via Adaptive Wide-and-Deep Reuse • 2512.14531 • Published Dec 16, 2025
TWEO: Transformers Without Extreme Outliers Enables FP8 Training And Quantization For Dummies • 2511.23225 • Published Nov 28, 2025
FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale • 2601.22146 • Published Jan 29
Momentum Attention: The Physics of In-Context Learning and Spectral Forensics for Mechanistic Interpretability • 2602.04902 • Published Feb 3
Superpositional Gradient Descent: Harnessing Quantum Principles for Model Training • 2511.01918 • Published Nov 1, 2025
INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats • 2510.25602 • Published Oct 29, 2025