Article SyGra: The One-Stop Framework for Building Data for LLMs and SLMs By ServiceNow-AI and 3 others • 7 days ago • 9
BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks Paper • 2412.04626 • Published Dec 5, 2024 • 14
RepoFusion: Training Code Models to Understand Your Repository Paper • 2306.10998 • Published Jun 19, 2023 • 13