---
license: cc-by-2.0
language:
- it
---

# Model Card for Model ID

This model card was generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).

## Model Details

The model has 31,536,128 trainable parameters.

### Model Description

The model uses an early-exit architecture with 12 Conformer layers and 6 CTC decoders. The released weights were obtained by averaging the model checkpoints from epoch 16 to epoch 26. The model can handle only speech signals sampled at 16 kHz.

## Uses

The model is intended for automatic speech recognition (ASR). Code for using the model is available at https://github.com/SpeechTechLab/early-exit-transformer

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code at https://github.com/SpeechTechLab/early-exit-transformer
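
Because the model handles only 16 kHz input, it can help to verify a recording's sample rate before passing it to the code in the repository. The following is a minimal sketch that uses only Python's standard `wave` module. The helper names `check_wav_sample_rate` and `write_silent_wav` are hypothetical and are not part of the early-exit-transformer repository, and the sketch assumes uncompressed PCM WAV input.

```python
import os
import tempfile
import wave

# From the model card: the model handles only speech sampled at 16 kHz.
EXPECTED_SAMPLE_RATE = 16_000


def check_wav_sample_rate(path, expected=EXPECTED_SAMPLE_RATE):
    """Return (ok, actual_rate) for an uncompressed PCM WAV file.

    Hypothetical helper for illustration; not part of the model repository.
    """
    with wave.open(path, "rb") as wav_file:
        actual = wav_file.getframerate()
    return actual == expected, actual


def write_silent_wav(path, sample_rate, seconds=1):
    """Write a mono 16-bit silent WAV file, used here only as demo input."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * sample_rate * seconds)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        for rate in (16_000, 44_100):
            demo_path = os.path.join(tmp_dir, f"demo_{rate}.wav")
            write_silent_wav(demo_path, rate)
            ok, actual = check_wav_sample_rate(demo_path)
            status = "ok" if ok else "needs resampling to 16 kHz"
            print(f"{rate} Hz file: detected {actual} Hz, {status}")
```

Files that fail the check would need to be resampled to 16 kHz with an audio tool of your choice before recognition. The sketch only detects the mismatch and does not resample.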
## Training Details

Model and feature-extraction configuration:

- `decoder_mode='ctc', model_type='early_conformer', bpe=True`
- `distill=False, language_model=None, language_model_dict=None, avg_model_start=0, avg_model_end=5`
- `max_len=2000, d_model=256, n_enc_layers_per_exit=2, n_enc_exits=6, n_dec_layers=6, n_heads=8`
- `d_feed_forward=2048, depthwise_kernel_size=31, max_utterance_length=600, sample_rate=16000`
- `n_fft=512, win_length=320, hop_length=160, n_mels=80`
- `src_pad_idx=0, trg_pad_idx=126, trg_sos_idx=1, trg_eos_idx=2, enc_voc_size=256, dec_voc_size=256`
- `sp=`

Training data (approximate hours of speech):

- Common Voice (Italian): ~410 h
- Multilingual LibriSpeech (Italian): ~247 h
- VoxPopuli (Italian): ~87 h
- YouTube-Commons (Italian): ~1580 h

### Training Procedure

Training ran in three stages:

1. 47 epochs on CV
2. 80 epochs on CV+MLS+VoxPopuli
3. 5 epochs on YPT+CV+MLS+VoxPopuli

#### Training Hyperparameters

shuffle=True, batch_size=64, n_batch_split=8, drop_prob=0.1, init_lr=1e-05, adam_eps=1e-09, weight_decay=0.0001, warmup=[training dataset size], clip=1.0

#### Speeds, Sizes, Times [optional]

[More Information Needed]

## Evaluation (%WER)

Word error rate (WER, %) per dataset:

| MLS   | VoxPopuli | CV    |
|-------|-----------|-------|
| 17.66 | 19.69     | 19.42 |

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary

## Model Examination [optional]

[More Information Needed]

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

FBK-digis cluster

#### Hardware

- **Device:** `device(type='cuda', index=0)`
- **CUDA Version:** 12.5
- **GPU:** Quadro RTX 5000

#### Software

[More Information Needed]

## Citation [optional]

- G. A. Wright, U. Cappellazzo, S. Zaiem, D. Raj, L. O. Yang, D. Falavigna, M. N. Ali, and A. Brutti, "Training early-exit architectures for automatic speech recognition: Fine-tuning pre-trained models or training from scratch," in Proc. of ICASSP Workshops, 2024, pp. 685–689 (https://arxiv.org/abs/2309.09546)
- M. Lasbordes, D. Falavigna, and A. Brutti, "Splitformer: An improved early-exit architecture for automatic speech recognition on edge devices," in Proc. of EUSIPCO, 2025 (https://arxiv.org/abs/2506.18035)
- M. N. Ali, A. Brutti, and D. Falavigna, "Federating Dynamic Models using Early-Exit Architectures for Automatic Speech Recognition on Heterogeneous Clients," to appear in Progress in Artificial Intelligence (https://arxiv.org/abs/2405.17376)

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]