Post 4057: I am very sad to say that the budget for creating the SnowflakeCore-G1 1B and 7B MoE models has run out, and I can't pre-train them anymore.
Post 504: The training for SnowflakeCore-G1-1B and 7B will be restarted, because I have now implemented DeepSpeed and managed to use two GPUs.
i3-Series
Note: The models are listed in the default order set by Hugging Face, so the latest model appears at the bottom.
FlameF0X/i3-tiny • Text Generation • 711k • Updated Oct 17 • 37 • 1
FlameF0X/i3-12m • Text Generation • 12.7M • Updated Oct 23 • 68 • 3
FlameF0X/i3-22m • Text Generation • 22.6M • Updated Oct 31 • 21 • 2
FlameF0X/i3-80m • Text Generation • 82.8M • Updated 26 days ago • 33 • 7
i3-Nano
FlameF0X/i3-BERT • Updated 4 days ago
FlameF0X/i3-BERT-v2 • Fill-Mask • Updated 41 minutes ago • 27
FlameF0X/i3-CLIP • Zero-Shot Classification • Updated about 1 hour ago