ProX Refining Models (Collection): Adapted small language models used to generate data refining programs • 5 items • Updated Oct 10, 2024
Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset (Paper) • 2412.02595 • Published Dec 3, 2024