Post 3940: I am very sad to say that the budget for pre-training the SnowflakeCore-G1 1B and 7B MoE models ran out, and I can't pre-train them anymore.
Post 385: Training for SnowflakeCore-G1-1B and 7B will resume: I have now implemented DeepSpeed and managed to use two GPUs.
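The DeepSpeed setup mentioned in the post might look something like the following minimal config. This is an illustrative sketch only; the batch size, ZeRO stage, and optimizer values are assumptions, not the author's actual settings.

```json
{
  "train_batch_size": 32,
  "gradient_accumulation_steps": 4,
  "fp16": { "enabled": true },
  "zero_optimization": { "stage": 2 },
  "optimizer": {
    "type": "AdamW",
    "params": { "lr": 3e-4 }
  }
}
```

A two-GPU run is then typically launched with the DeepSpeed runner, e.g. `deepspeed --num_gpus 2 train.py --deepspeed_config ds_config.json` (assuming the training script accepts a `--deepspeed_config` argument).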
i3-architecture
- FlameF0X/i3-80m — Text Generation • 82.8M • Updated 1 day ago
- FlameF0X/i3-22m — Text Generation • 22.6M • Updated 9 days ago
- FlameF0X/i3-12m — Text Generation • 12.7M • Updated 16 days ago
- FlameF0X/i3-tiny — Text Generation • 711k • Updated 23 days ago
Reinforcement Learning — All the RL agents I made
- FlameF0X/o2 — Reinforcement Learning • Updated Jul 10
- FlameF0X/CanoPy — Reinforcement Learning • Updated Sep 5