# Quasar-V4-Tiny (Base)
- **Model ID:** silx-ai/Quasar-V4-Tiny
- **Architecture:** Linear Attention with Kernel Feature Maps
- **Developed by:** SILX AI
- **Powered by:** gputrader.io
## Description
**Quasar-V4-Tiny** is a minimal, experimental language model built to test a new Linear Attention mechanism based on Kernel Feature Maps.
It discards traditional softmax-based self-attention in favor of an alternative whose cost scales linearly, rather than quadratically, with sequence length.
It represents the first fully working prototype of the Quasar architecture and is trained on a small-scale dataset for initial validation of functionality and tokenization.
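
Below is a minimal, illustrative sketch of causal linear attention with a kernel feature map. The actual feature map, normalization, and layer layout used by Quasar-V4-Tiny are not documented in this card; the `elu(x) + 1` map (Katharopoulos et al., 2020) is assumed here purely for illustration.

```python
import torch
import torch.nn.functional as F

def feature_map(x: torch.Tensor) -> torch.Tensor:
    # Positive kernel feature map (assumed); keeps attention weights non-negative.
    return F.elu(x) + 1

def causal_linear_attention(q, k, v, eps=1e-6):
    """q, k: (batch, seq, dim); v: (batch, seq, dim_v).
    Avoids the O(seq^2) attention matrix of softmax attention."""
    q, k = feature_map(q), feature_map(k)
    # Running sums over past positions replace the explicit seq x seq matrix.
    kv = torch.cumsum(torch.einsum("bsd,bse->bsde", k, v), dim=1)  # (b, s, d, d_v)
    z = torch.cumsum(k, dim=1)                                     # (b, s, d)
    num = torch.einsum("bsd,bsde->bse", q, kv)
    den = torch.einsum("bsd,bsd->bs", q, z).unsqueeze(-1) + eps
    return num / den

# Example: batch of 2 sequences, 16 tokens, head dimension 32.
out = causal_linear_attention(torch.randn(2, 16, 32),
                              torch.randn(2, 16, 32),
                              torch.randn(2, 16, 32))
print(out.shape)  # torch.Size([2, 16, 32])
```

The key point is that the seq x seq attention matrix is never materialized: cumulative sums over the kernelized keys and values stand in for it, which is what makes the mechanism linear in sequence length. (A production kernel would compute the cumulative sums with a recurrent scan rather than storing the full `(b, s, d, d_v)` tensor as this sketch does.)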
## Training Details
- Training objective: Causal Language Modeling (next-token prediction; see the sketch after this list)
- Training tokens: ~1–2 billion
- Architecture: Linear Attention with Kernel Feature Maps
- Batch size: Small, due to limited compute
- Training duration: Short, meant to verify architecture behavior and convergence
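
For reference, here is a minimal sketch of the causal language modeling objective named above: cross-entropy between each position's logits and the token that follows it. The function and tensor names are illustrative and not taken from the Quasar training code.

```python
import torch
import torch.nn.functional as F

def causal_lm_loss(logits: torch.Tensor, token_ids: torch.Tensor) -> torch.Tensor:
    """logits: (batch, seq, vocab); token_ids: (batch, seq)."""
    # Shift so position t's logits are scored against the token at t + 1.
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = token_ids[:, 1:].contiguous()
    return F.cross_entropy(shift_logits.view(-1, shift_logits.size(-1)),
                           shift_labels.view(-1))

# Example with random data: batch of 2, sequence length 8, vocabulary of 100.
loss = causal_lm_loss(torch.randn(2, 8, 100), torch.randint(0, 100, (2, 8)))
print(loss.item())
```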
## Limitations
- Not trained for quality or coherence; purely experimental
- Likely to hallucinate, generate irrelevant text, or be inconsistent
- Do not use in production: this is a base model meant for architecture-level debugging and early development
## Acknowledgements
This project was made possible thanks to compute provided by gputrader.io.
Their support enabled fast iteration during early-stage experimentation.
## Research Goals
This model is part of an ongoing effort to:
- Replace traditional transformer attention with linear, scalable attention
- Build more efficient foundation models under constrained resources
- Explore custom architectures that can be trained with minimal GPU power
More versions (medium, scaled, improved) are expected after full validation of the Quasar pipeline.
## License
This model is released for research and testing purposes only.