Arabic Speech Datasets Collection Best Datasets for Arabic Speech Tasks • 16 items • Updated about 24 hours ago • 13
Article Introducing the Synthetic Data Generator - Build Datasets with Natural Language Dec 16, 2024 • 152
PaddleOCR-VL Collection Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model • 3 items • Updated 17 days ago • 23
DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21, 2025 • 435
LLMDet Collection LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models • 5 items • Updated Jul 26, 2025 • 3
Article Welcome the NVIDIA Llama Nemotron Nano VLM to Hugging Face Hub Jun 27, 2025 • 30
Article ScreenSuite - The most comprehensive evaluation suite for GUI Agents! Jun 6, 2025 • 55
Article Holo1: New family of GUI automation VLMs powering GUI agent Surfer-H Jun 3, 2025 • 71
MedGemma Release Collection Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 7 items • Updated Jul 11, 2025 • 369
RADIO Collection A collection of Foundation Vision Models that combine multiple models (CLIP, DINOv2, SAM, etc.). • 16 items • Updated 9 days ago • 26
💫StarVector Models Collection StarVector is a multimodal LLM for Scalable Vector Graphics (SVG) generation, producing structured SVG code directly from images and text. • 2 items • Updated Mar 20, 2025 • 96