FlameF0X/i3-200m
Note Work in progress.
Note SOTA model. Pre-trained in around 2 to 4 hours, compared with over 14 hours for the previous version. Changes: trained on over 3T tokens; other details are available in the model card.
Note Smol stable text generator that took over 14 hours to pre-train :) Changes: trained on over 1T tokens; LoRPt layers.
Note Our first usable i3 model (meaning we added Transformers support and supporting code for it).
Note The first i3-architecture LM.