Gradio web UI available, with VAD and diarization
I've integrated the Voxtral-mini-3b model into a Whisper-WebUI project! Early tests are impressive: the French transcription quality is significantly better than with standard Whisper models.
I also added compatible VAD and diarization, and removed the audio length limitations.
Curious? Check out the branch here:
https://github.com/OlivierAlbertini/Voxtral-WebUI
You can use this branch: https://github.com/OlivierAlbertini/Whisper-WebUI/tree/feature/voxtral
I notice better quality for French transcription.
After I run it, I get a model download error.
You need to be logged in to Hugging Face (https://huggingface.co/docs/huggingface_hub/en/guides/cli).
Also see https://github.com/OlivierAlbertini/Whisper-WebUI/blob/feature/voxtral/VOXTRAL_SETUP.md
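Before launching the UI, it can help to check whether a Hugging Face token is visible at all. The sketch below assumes the usual `huggingface_hub` conventions: the `HF_TOKEN` environment variable, or a token file at `$HF_HOME/token` (assumed default `~/.cache/huggingface/token`). The helper name `find_hf_token` is hypothetical and not part of this project.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_hf_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a configured Hugging Face token, or None if none is found.

    Hypothetical helper: checks the environment variable first, then the
    token file that the CLI login is assumed to write.
    """
    env = os.environ if env is None else env

    # 1. Token passed through the environment.
    token = env.get("HF_TOKEN", "").strip()
    if token:
        return token

    # 2. Token file under HF_HOME (assumed default: ~/.cache/huggingface).
    hf_home = env.get("HF_HOME") or str(Path.home() / ".cache" / "huggingface")
    token_file = Path(hf_home) / "token"
    if token_file.is_file():
        token = token_file.read_text(encoding="utf-8").strip()
        return token or None
    return None


if __name__ == "__main__":
    if find_hf_token() is None:
        print("No Hugging Face token found; log in with the CLI first.")
    else:
        print("Hugging Face token found.")
```

If this reports a token and the download still fails, the problem is probably not authentication.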
Thanks for your reply. With your help, I successfully installed and ran it, but there is a problem. When I transcribe with the Voxtral-Mini-3B-2507 model, the subtitles in the result are split into one entry every 30 seconds. I have the VAD function turned on. Is something wrong?
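If the model returns only one block of text per 30-second chunk and no word-level timestamps (an assumption, not confirmed in this thread), one possible workaround is to split each chunk into sentences and spread the chunk's time span across them in proportion to their length. A minimal sketch, using a hypothetical `split_segment` helper that is not part of the project; the timings it produces are estimates, not real alignments:

```python
import re
from typing import List, Tuple

Segment = Tuple[float, float, str]  # (start seconds, end seconds, text)


def split_segment(start: float, end: float, text: str) -> List[Segment]:
    """Split one long segment into sentence-level segments.

    Each sentence receives a share of the time span proportional to its
    character count, so the boundaries are only approximate.
    """
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return []

    total_chars = sum(len(s) for s in sentences)
    duration = end - start
    result: List[Segment] = []
    cursor = start
    for i, sentence in enumerate(sentences):
        if i == len(sentences) - 1:
            seg_end = end  # pin the last sentence to the chunk end
        else:
            seg_end = cursor + duration * len(sentence) / total_chars
        result.append((cursor, seg_end, sentence))
        cursor = seg_end
    return result


if __name__ == "__main__":
    for seg in split_segment(0.0, 30.0, "Hello there. How are you? Fine!"):
        print(seg)
```

Proportional splitting is crude when a chunk contains long pauses; tighter VAD segments would reduce the error.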
Console log:
C:\Users\tk199\Voxtral-WebUI-feature-voxtral\venv\Lib\site-packages\ctranslate2\__init__.py:8: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
Use "voxtral-mini" implementation
Device "cuda" is detected
* Running on local URL: http://127.0.0.1:7860
* To create a public link, set share=True in launch().
Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00, 1.71s/it]
C:\Users\tk199\Voxtral-WebUI-feature-voxtral\modules\whisper\voxtral_whisper_inference.py:357: UserWarning: PySoundFile failed. Trying audioread instead.
audio_data, sr = librosa.load(audio_path, sr=None)
C:\Users\tk199\Voxtral-WebUI-feature-voxtral\venv\Lib\site-packages\librosa\core\audio.py:184: FutureWarning: librosa.core.audio.__audioread_load
Deprecated as of librosa version 0.10.0.
It will be removed in librosa version 1.0.
y, sr_native = __audioread_load(path, offset, duration, dtype)
C:\Users\tk199\Voxtral-WebUI-feature-voxtral\modules\whisper\voxtral_whisper_inference.py:71: UserWarning: PySoundFile failed. Trying audioread instead.
audio_data, sr = librosa.load(audio_path, sr=None)
C:\Users\tk199\Voxtral-WebUI-feature-voxtral\venv\Lib\site-packages\librosa\core\audio.py:184: FutureWarning: librosa.core.audio.__audioread_load
Deprecated as of librosa version 0.10.0.
It will be removed in librosa version 1.0.
y, sr_native = __audioread_load(path, offset, duration, dtype)
The following generation flags are not valid and may be ignored: ['temperature']. Set TRANSFORMERS_VERBOSITY=info for more details.
(The warning above is repeated 20 more times.)
Repository Not Found for url: https://huggingface.co/api/models/voxtral-mini-3b/revision/main.
Please make sure you specified the correct repo_id and repo_type.
If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
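The URL in this last error shows that the short name `voxtral-mini-3b` was sent to the Hub as the repo id. Hub repo ids normally take the form `namespace/name`, so logging in alone may not resolve this. One possible fix is to map UI aliases to full repo ids before downloading. A minimal sketch, assuming the model lives at `mistralai/Voxtral-Mini-3B-2507` (the namespace is not stated in this thread) and using the hypothetical names `MODEL_ALIASES` and `resolve_repo_id`, which are not taken from the project:

```python
from typing import Dict

# Hypothetical alias table: short UI names -> full Hub repo ids.
MODEL_ALIASES: Dict[str, str] = {
    "voxtral-mini-3b": "mistralai/Voxtral-Mini-3B-2507",  # assumed repo id
}


def resolve_repo_id(name: str) -> str:
    """Return a full 'namespace/name' repo id for a UI model name."""
    key = name.strip()
    if "/" in key:
        # Already looks like a full repo id; pass it through unchanged.
        return key
    try:
        return MODEL_ALIASES[key.lower()]
    except KeyError:
        raise ValueError(f"Unknown model alias: {name!r}") from None


if __name__ == "__main__":
    print(resolve_repo_id("voxtral-mini-3b"))
```

Failing early on an unknown alias gives a clearer message than the Hub's "Repository Not Found" response.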