PKU-Alignment/ProgressGym-HistLlama3-8B-C014-pretrain-v0.2 • Text Generation • 8B • Updated Aug 10, 2024 • 4
PKU-Alignment/ProgressGym-HistLlama3-8B-C013-instruct-v0.2 • Text Generation • 8B • Updated Aug 10, 2024 • 20
PKU-Alignment/ProgressGym-HistLlama3-8B-C013-pretrain-v0.2 • Text Generation • 8B • Updated Aug 10, 2024 • 9
PKU-Alignment/beaver-7b-unified-reward • Reinforcement Learning • 7B • Updated Apr 20, 2024 • 294
PKU-Alignment/beaver-7b-unified-cost • Reinforcement Learning • 7B • Updated Apr 20, 2024 • 386 • 2
PKU-Alignment/beaver-7b-v1.0-reward • Reinforcement Learning • 7B • Updated Apr 20, 2024 • 1.7k • 17
PKU-Alignment/beaver-7b-v1.0-cost • Reinforcement Learning • 7B • Updated Apr 20, 2024 • 2.06k • 10