EgypTalk-ASR-v2
NAMAA-Space/EgypTalk-ASR-v2 is a high-performance automatic speech recognition (ASR) model for Egyptian Arabic, trained with NVIDIA NeMo and optimized for real-world speech from native Egyptian speakers.
The model was trained on over 200 hours of high-quality, manually curated audio collected and prepared by the NAMAA team. It builds on NVIDIA's FastConformer Hybrid Large architecture and is fine-tuned for Egyptian Arabic, enabling highly accurate transcription in casual, formal, and mixed-dialect settings.
Demo: Try it here
🗣️ Model Description
- Architecture: FastConformer Hybrid Large from the NVIDIA NeMo ASR collection.
- Framework: PyTorch Lightning + NVIDIA NeMo.
- Languages: Egyptian Arabic (with the capability to handle some Modern Standard Arabic).
- Dataset: 200+ hours of proprietary, high-quality Egyptian Arabic audio, covering:
  - Spontaneous conversation
  - Broadcast media
  - Interviews
  - Read speech
- Tokenizer: SentencePiece (trained specifically for Egyptian Arabic phonetic coverage).
- Input Format: 16 kHz mono WAV files (see the conversion sketch after this list).
- Output: Raw transcribed text in Arabic.
 
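Since the model expects 16 kHz mono WAV input, recordings in other formats or at other sample rates should be converted first. A minimal sketch of such a conversion, assuming `librosa` and `soundfile` are installed (any resampler, e.g. ffmpeg or torchaudio, works just as well; the file names are placeholders):

```python
import librosa
import soundfile as sf

def to_16k_mono_wav(src_path: str, dst_path: str) -> str:
    """Resample any audio file to the 16 kHz mono WAV the model expects."""
    # librosa.load resamples to the requested rate and downmixes to mono
    audio, sr = librosa.load(src_path, sr=16000, mono=True)
    sf.write(dst_path, audio, sr)
    return dst_path

# "recording.mp3" is a hypothetical input file
wav_path = to_16k_mono_wav("recording.mp3", "sample.wav")
```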
🚀 Key Features
- Egyptian Arabic Dialect Optimized: handles local pronunciations, colloquialisms, and speech patterns.
- High Accuracy: achieves a strong word error rate (WER) on Egyptian Arabic test sets.
- FastConformer Efficiency: low-latency, streaming-capable inference.
- Robust Dataset: covers multiple domains (media, conversation, formal speech).
 
💻 Usage
```python
from nemo.collections.asr.models import ASRModel

# Load the model from the Hugging Face Hub
model = ASRModel.from_pretrained("NAMAA-Space/EgypTalk-ASR-v2")

# Transcribe a 16 kHz mono WAV file
transcription = model.transcribe(["sample.wav"])
print(transcription)
```
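`transcribe` accepts a list of paths, so several files can be processed in one call; recent NeMo releases also take a `batch_size` argument. A sketch, assuming the WAVs have already been converted to 16 kHz mono (the file names are placeholders):

```python
# Batch transcription of prepared 16 kHz mono WAV files
files = ["clip_01.wav", "clip_02.wav", "clip_03.wav"]
results = model.transcribe(files, batch_size=8)

# Depending on the NeMo version, entries may be plain strings or
# hypothesis objects carrying a .text attribute.
for path, result in zip(files, results):
    text = result.text if hasattr(result, "text") else result
    print(f"{path}: {text}")
```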
🛠️ Training Details
- Pretrained Base Model: nvidia/stt_ar_fastconformer_hybrid_large_pcd_v1.0
- Training Framework: PyTorch Lightning (DDP strategy); see the sketch after this list
- Training Duration: 100 epochs, mixed precision enabled
- Optimizer: Adam with learning rate 1e-3
- Batch Size: 32 (train) / 8 (validation, test)
- Augmentations: silence trimming, start/end token usage
 
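The exact training configuration is not published, so the following is only a minimal sketch of how a comparable fine-tuning run could be wired up in NeMo with PyTorch Lightning, mirroring the settings listed above; the manifest paths are hypothetical:

```python
import pytorch_lightning as pl
from nemo.collections.asr.models import ASRModel

# Start from the same pretrained base model named above
model = ASRModel.from_pretrained("nvidia/stt_ar_fastconformer_hybrid_large_pcd_v1.0")

# NeMo data configs can be plain dicts; the manifest paths are placeholders
model.setup_training_data({
    "manifest_filepath": "train_manifest.json",
    "sample_rate": 16000,
    "batch_size": 32,   # train batch size from the card
    "shuffle": True,
})
model.setup_validation_data({
    "manifest_filepath": "val_manifest.json",
    "sample_rate": 16000,
    "batch_size": 8,    # validation batch size from the card
})

# Adam with learning rate 1e-3, as listed above
model.setup_optimization({"name": "adam", "lr": 1e-3})

trainer = pl.Trainer(
    accelerator="gpu",
    devices=-1,
    strategy="ddp",         # DDP strategy from the card
    precision="16-mixed",   # mixed precision from the card
    max_epochs=100,         # training duration from the card
)
trainer.fit(model)
```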
Citation
```bibtex
@misc{egyptalk-asr-v2,
  title={NAMAA-Space/EgypTalk-ASR-v2},
  author={NAMAA},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/NAMAA-Space/EgypTalk-ASR-v2}},
  note={Accessed: 2025-03-02}
}
```