The Arabic transcription model does not provide diacritics

#4
by YoussefHosni - opened

The model no longer transcribes diacritics. I tried the following:

import nemo.collections.asr as nemo_asr
asr_model = nemo_asr.models.EncDecHybridRNNTCTCBPEModel.from_pretrained(model_name="nvidia/stt_ar_fastconformer_hybrid_large_pcd_v1.0")
output = asr_model.transcribe(['/content/001 Al-Fatihah alfath.wav'])
print(output[0].text)

أعوذ بالله من الشيطان الرجيم بسم الله الرحمن الرحيم الحمد لله رب العالمين الرحمن الرحيم مالك يوم الدين إياك نعبد وإياك نستعين اهدنا الصراط المستقيم صراط الذين أنعمت عليهم غير المغضوب عليهم ولا الضالين
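
For anyone who wants to confirm this programmatically rather than by eye, here is a minimal sketch that counts Arabic diacritic marks (tashkeel) in a transcript. It uses only the Python standard library. The helper name `count_diacritics` and the sample strings are made up for illustration and are not part of NeMo; the set of code points (U+064B to U+0652 plus U+0670) is an assumption about which marks count as diacritics here.

```python
# Arabic tashkeel: fathatan through sukun (U+064B-U+0652),
# plus the superscript alef (U+0670). Assumed set, adjust as needed.
ARABIC_DIACRITICS = {chr(cp) for cp in range(0x064B, 0x0653)} | {"\u0670"}


def count_diacritics(text: str) -> int:
    """Return the number of Arabic diacritic marks found in text."""
    return sum(1 for ch in text if ch in ARABIC_DIACRITICS)


# "bsm" written without diacritics: beh, seen, meem.
plain = "\u0628\u0633\u0645"
# The same word fully vocalized: beh+kasra, seen+sukun, meem+kasra.
vocalized = "\u0628\u0650\u0633\u0652\u0645\u0650"

print(count_diacritics(plain))
print(count_diacritics(vocalized))
```

Running the same function on `output[0].text` should return 0 if the transcript really contains no diacritics.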

What do you mean it doesn't anymore? Did it give diacritics before? And what's the update now?
