GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement Paper • 2406.11546 • Published Jun 17, 2024 • 1
MultiMed: Multilingual Medical Speech Recognition via Attention Encoder Decoder Paper • 2409.14074 • Published Sep 21, 2024 • 2
jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval Paper • 2506.18902 • Published Jun 23, 2025 • 9
OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning Paper • 2506.00338 • Published May 31, 2025 • 10