Article Improving Parquet Dedupe on Hugging Face Hub By yuchenglow and 1 other • Oct 5, 2024 • 38
Article TimeScope: How Long Can Your Video Large Multimodal Model Go? By orrzohar and 3 others • 9 days ago • 30
Article Fast LoRA inference for Flux with Diffusers and PEFT By sayakpaul and 1 other • 9 days ago • 36
A little guide to building Large Language Models in 2024 Collection Resources mentioned by @thomwolf in https://x.com/Thom_Wolf/status/1773340316835131757 • 19 items • Updated Apr 1, 2024 • 16
Dataset comparison models Collection 1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated Jun 12, 2024 • 40
FineWeb2 Edu Japanese Collection FineWeb2 Edu Japanese: A high-quality, filtered Japanese dataset (120M texts, 89.3B tokens) for educational AI training. • 7 items • Updated Jun 19 • 1
Article FineWeb2-C: Help Build Better Language Models in Your Language By davanstrien and 5 others • Dec 23, 2024 • 21
Seed-X Collection A powerful open-source multilingual translation language model series, including instruction and reasoning models. • 6 items • Updated 3 days ago • 60
Article Seq vs Seq: the Ettin Suite of Paired Encoders and Decoders By orionweller and 5 others • 16 days ago • 50
Small-Doges Collection Doge family of small language models! • 18 items • Updated Apr 21 • 8
LLM pretraining datasets Collection A collection of datasets for LLM pretraining • 9 items • Updated May 5 • 10