WindyWord.ai STT - Malayalam Lingua (GPU (safetensors))
Transcribes Malayalam speech (Dravidian > Southern Dravidian).
Note: Quality ceiling: audited at 73.3% WER (the community Malayalam Whisper space is thin; the best alternative on Hugging Face audited at 76.5%, ~1.5% worse). Source: vrclc/Whisper-small-Malayalam. For high-stakes Malayalam transcription, consider the multilingual openai/whisper-large-v3.
Quality
- FLEURS WER: 73.2% (50-sample audit)
- CER: 0.4872
- Tier: UNUSABLE-GAP
- Source: WindyWord Grand Rounds v2 audit (50-sample FLEURS)
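For readers unfamiliar with the two metrics above, here is a minimal sketch of how WER and CER are conventionally computed: the Levenshtein edit distance between reference and hypothesis, divided by the reference length, over words for WER and over characters for CER. This is not the audit's own scoring code. The text normalization used in the audit is not stated here, so this sketch assumes plain whitespace tokenization and counts characters as Unicode code points; the function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of r
                curr[j - 1] + 1,     # insertion of h
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One substitution and one deletion against a six-word reference.
print(wer("the cat sat on the mat", "the cat sit on mat"))
# One substituted character out of four.
print(cer("abcd", "abxd"))
```

Both functions return a fraction, so a CER of 0.4872 corresponds to 48.72%. Because insertions count as errors, either metric can exceed 1.0 when the hypothesis is much longer than the reference.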
About this variant
This is the safetensors deployment format of our Malayalam Lingua STT model. Load it via the safetensors/ subfolder.
Part of the WindyWord.ai STT fleet, covering 35+ languages that commercial speech-to-text APIs underserve, with proper dialect / script disclosures where they matter.
Usage
from transformers import WhisperForConditionalGeneration, WhisperProcessor
# Load the processor and model weights from the safetensors/ subfolder of the repo
processor = WhisperProcessor.from_pretrained("WindyWord/listen-windy-lingua-ml", subfolder="safetensors")
model = WhisperForConditionalGeneration.from_pretrained("WindyWord/listen-windy-lingua-ml", subfolder="safetensors")
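Once the processor and model are loaded, transcription might look like the sketch below. It is an illustration under assumptions, not a documented recipe for this checkpoint: the helper name transcribe is hypothetical, the audio is assumed to be a mono float waveform already resampled to 16 kHz (the rate Whisper feature extractors expect), and the language and task arguments to generate assume a recent transformers release and a generation config that includes Whisper's language tokens. If the checkpoint's config lacks them, dropping those two arguments may be necessary.

```python
WHISPER_SAMPLING_RATE = 16_000  # Whisper feature extractors expect 16 kHz audio


def transcribe(waveform, processor, model, language="ml"):
    """Transcribe one mono 16 kHz waveform (hypothetical helper)."""
    # Convert raw samples to log-mel input features.
    inputs = processor(
        waveform,
        sampling_rate=WHISPER_SAMPLING_RATE,
        return_tensors="pt",
    )
    # Assumption: the checkpoint's generation config knows the "ml" language token.
    predicted_ids = model.generate(
        inputs.input_features,
        language=language,
        task="transcribe",
    )
    # Decode token ids back to text, dropping special tokens.
    text = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
    return text.strip()


# Example usage (requires the processor and model loaded as shown above;
# "sample.wav" is a placeholder path, and librosa is one possible loader):
#   import librosa
#   waveform, _ = librosa.load("sample.wav", sr=WHISPER_SAMPLING_RATE, mono=True)
#   print(transcribe(waveform, processor, model))
```

Given the audited error rates reported above, output from this model should be reviewed before use in anything beyond experimentation.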
Commercial Use
Visit windyword.ai for apps and API access.
Provenance & License
Weights derived from upstream community Whisper fine-tunes (see individual model card for exact lineage). Redistributed under Apache-2.0 (inherited).
Certified by Opus 4.6 Opus-Claw (Dr. C) on Veron-1 (RTX 5090, Mt Pleasant SC).