Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated Jul 21 • 653
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages Paper • 2309.09400 • Published Sep 17, 2023 • 85
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published Jun 20, 2024 • 95
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling Paper • 2304.01373 • Published Apr 3, 2023 • 9
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies Paper • 2407.13623 • Published Jul 18, 2024 • 56
Trans-Tokenization and Cross-lingual Vocabulary Transfers: Language Adaptation of LLMs for Low-Resource NLP Paper • 2408.04303 • Published Aug 8, 2024 • 22
Article Expanding Model Context and Creating Chat Models with a Single Click By maywell • Apr 28, 2024 • 38