Article Introducing ColQwen-Omni: Retrieve in every modality By manu and 4 others • 10 days ago • 58
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders By thomwolf and 1 other • 19 days ago • 611
Article We're open-sourcing "The Amazing Hand", a fully 3D printed robotic hand for less than $200 ✌️✌️✌️ By pollen-robotics and 2 others • 19 days ago • 33
Article FineWeb-C: A Community-Driven Dataset for Educational Quality Annotations in 122 Languages By davanstrien and 5 others • 19 days ago • 27
Article LLM Hallucinations: bug or feature? The US Supreme Court 2025 cases experiment By dvilasuero • 19 days ago • 18
Training data for Swedish Lion Libre Collection This collection groups together the publicly available training data used to create our set of HTR models, Swedish Lion Libre. • 11 items • Updated Jan 14 • 1
Article Training and Finetuning Sparse Embedding Models with Sentence Transformers v5 By tomaarsen and 1 other • 27 days ago • 105
Article Teaching Data Literacy with Hugging Face's AI Sheets By ParulPandey • 28 days ago • 23
Article 🤔👀🎬🖥️📖 Kimi-VL-A3B-Thinking-2506: A Quick Navigation By moonshotai and 1 other • Jun 21 • 66
Article No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL By toslali-ibm and 5 others • Jun 3 • 75
Institutional Books 1.0: A 242B token dataset from Harvard Library's collections, refined for accuracy and usability Paper • 2506.08300 • Published Jun 10 • 8
MiniMax-M1 Collection MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. • 6 items • Updated 24 days ago • 110
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text Paper • 2506.05209 • Published Jun 5 • 44