whisper-timestamped-cs

Summary

"whisper-timestamped-cs" is an acoustic model based on "openai/whisper-large-v3", suited to Automatic Speech Recognition (ASR) of speech that code-switches between Spanish and Catalan.

Model Description

"whisper-timestamped-cs" is the result of fine-tuning "openai/whisper-large-v3" on 2 hours of synthetic Spanish/Catalan code-switching data generated by Projecte AINA in Barcelona, Spain. It is an acoustic model for Automatic Speech Recognition of speech that switches between Spanish and Catalan.

Intended Uses and Limitations

This model can be used for Automatic Speech Recognition (ASR) in code-switching conditions between Spanish and Catalan. The model is intended to transcribe audio files to plain text.

Installation

To use this model, you may install whisper-timestamped:

Create a virtual environment:

python -m venv /path/to/venv

Activate the environment:

source /path/to/venv/bin/activate

Install the modules:

pip install git+https://github.com/linto-ai/whisper-timestamped
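As a quick sanity check after installation, the following minimal sketch tests whether the module can be found in the active environment. The helper name is_installed is hypothetical and used only for illustration; it relies solely on the Python standard library and does not load any model.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# "whisper_timestamped" is the import name used in the inference example.
print("whisper_timestamped installed:", is_installed("whisper_timestamped"))
```

If this prints False, the virtual environment is probably not activated or the pip step did not complete.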

For Inference

To transcribe code-switched audio with this model, you can follow this example:

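import json

import whisper_timestamped as whisper

# Load the fine-tuned model; pass device="cuda" instead if a GPU is available.
model = whisper.load_model("langtech-veu/whisper-timestamped-cs", device="cpu")

# Transcribe the audio file; the result is a dictionary with text and timestamps.
result = whisper.transcribe(model, "/path/to/the/audio.wav")

# Print the full result as JSON, keeping accented characters readable.
print(json.dumps(result, indent=2, ensure_ascii=False))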
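The printed result can be long, so it may help to pull out only the word-level timestamps. The sketch below assumes the result follows the layout described in the whisper-timestamped documentation: a "segments" list whose entries carry a "words" list with "text", "start" and "end" keys. The helper name iter_words and the sample dictionary are hypothetical, made up for illustration, and are not real model output.

```python
def iter_words(result):
    """Yield (text, start, end) for every word in a transcription result.

    Assumes result["segments"] is a list of segments, each with an
    optional "words" list of dictionaries (whisper-timestamped layout).
    """
    for segment in result.get("segments", []):
        for word in segment.get("words", []):
            yield word["text"], word["start"], word["end"]


# Hypothetical sample shaped like a transcription result (not real output).
sample = {
    "text": " Hola, bon dia.",
    "segments": [
        {
            "start": 0.0,
            "end": 1.2,
            "text": " Hola, bon dia.",
            "words": [
                {"text": "Hola,", "start": 0.0, "end": 0.4},
                {"text": "bon", "start": 0.5, "end": 0.8},
                {"text": "dia.", "start": 0.8, "end": 1.2},
            ],
        }
    ],
}

# Print one word per line with its start and end time in seconds.
for text, start, end in iter_words(sample):
    print(f"{start:6.2f} {end:6.2f}  {text}")
```

With a real transcription, iter_words(result) can be used in place of iter_words(sample); the plain-text transcript alone should be available as result["text"] under the same assumption.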

Training Details

Training data

The model was trained on a corpus called CAESAR-tiny, which has not yet been released.

Citation

If this model contributes to your research, please cite the work:

@misc{BSC2025whispertimestampedcs,
      title={ASR models for Catalan and Spanish CS: whisper-timestamped-cs.},
      author={Takanori, Lucas and Solito, Sarah and Messaoudi, Abir and España i Bonet, Cristina},
      organization={Barcelona Supercomputing Center},
      url={https://huggingface.co/langtech-veu/whisper-timestamped-cs},
      year={2025}
}

Additional Information

Author

The fine-tuning process was performed during 2025 in the Language Technologies Laboratory of the Barcelona Supercomputing Center.

Contact

For further information, please email [email protected].

Copyright

Copyright (c) 2025 by Language Technologies Laboratory, Barcelona Supercomputing Center.

License

Apache-2.0

Funding

This work has been promoted and financed by the Generalitat de Catalunya through the Aina project.

Training the model was made possible by the computing time provided by the Barcelona Supercomputing Center through MareNostrum 5.
