Typhoon-isan-asr-realtime
Typhoon Isan ASR Realtime is a specialized, fine-tuned version of the Typhoon ASR Realtime model, optimized specifically for the Isan dialect of the Thai language. Built for real-world streaming applications, it delivers fast and accurate transcriptions of Isan speech while running efficiently on standard CPUs. This enables users to host their own ASR service for Isan dialect recognition, reducing costs and avoiding the need to send sensitive data to third-party cloud services.
The model is based on NVIDIA's FastConformer Transducer model, which is optimized for low-latency, real-time performance.
Try our demo available on Demo
Code / Examples available on Github
Release Blog available on OpenTyphoon Blog
Performance
Note on Baseline: The scb10x/whisper-medium-slscu-nectec model included in the comparison is one we fine-tuned specifically for this benchmark using existing dialect data from NECTEC and SLSCU. It serves as a representative baseline for what public data alone can achieve, and it is distinct from SLSCU_korat_model (the prominent previous work on Isan dialect ASR). Including it shows the gap between models built on previously available resources and the new Typhoon Isan ASR models.
Key Findings
- Comparable to State-of-the-Art Proprietary Models: The Typhoon Isan ASR family (both the Whisper-based and Realtime variants) demonstrates performance highly competitive with Gemini-2.5-pro. This validates that specialized open models can match or exceed the capabilities of large-scale proprietary multimodal systems for dialectal speech recognition.
- Leading Performance: The typhoon-whisper-medium-isan-asr model achieved the lowest error rate in the benchmark (0.0885), outperforming Gemini-2.5-pro (0.1020) by a clear margin.
- Consistency Across Architectures: The typhoon-isan-asr-realtime model follows closely with a CER of 0.1065. The difference between this model and Gemini-2.5-pro is small (0.0045 absolute CER, under 0.5 percentage points), indicating that users can rely on the Typhoon suite for both high-accuracy offline transcription and latency-sensitive realtime applications without compromising on quality compared to commercial APIs.
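For readers who want to check the figures above, the sketch below shows one common way to compute a character error rate (CER), as the Levenshtein edit distance between reference and hypothesis divided by the reference length, and then compares the reported scores. It is an illustration only. The benchmark's actual text normalization and scoring script are not described here, so the helper names `edit_distance` and `cer` are hypothetical, and counting Unicode code points (which treats Thai vowel and tone marks as separate characters) is an assumption.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted per code point."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


# CER values as reported in the Key Findings above.
reported_cer = {
    "typhoon-whisper-medium-isan-asr": 0.0885,
    "gemini-2.5-pro": 0.1020,
    "typhoon-isan-asr-realtime": 0.1065,
}

# Absolute CER gaps relative to Gemini-2.5-pro.
realtime_gap = reported_cer["typhoon-isan-asr-realtime"] - reported_cer["gemini-2.5-pro"]
whisper_gap = reported_cer["gemini-2.5-pro"] - reported_cer["typhoon-whisper-medium-isan-asr"]

if __name__ == "__main__":
    print(f"realtime vs Gemini: +{realtime_gap:.4f} CER")
    print(f"whisper vs Gemini:  -{whisper_gap:.4f} CER")
```

By this arithmetic, the realtime model trails Gemini-2.5-pro by 0.0045 absolute CER, while the Whisper-based model leads it by 0.0135.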
Follow us
https://twitter.com/opentyphoon
Support
Model tree for scb10x/typhoon-isan-asr-realtime
Base model
nvidia/stt_en_fastconformer_transducer_large