Whisper-WebUI Premium - Ultra Fast and High Accuracy Speech to Text Transcription App for All Languages - Windows, RunPod, Massed Compute 1-Click Installers - Supporting RTX 1000 to 5000 series

#222
by MonsterMMORPG - opened

Whisper-WebUI Premium is made for SECourses Patreon followers only: https://www.patreon.com/posts/145395299

Download Installers and App

https://www.patreon.com/posts/145395299

Features

  • It has a better interface, more features, and default settings tuned for maximum accuracy

  • It shows the transcription in real time, both in the Gradio interface and in the CMD window

  • It shows clearer status and output in the CMD window, such as the start time and the file being processed

  • It saves every generated transcription under the same name as the input file, with proper name sanitization

  • After a deep scan of the entire pipeline, the default parameters are set for maximum accuracy and quality

  • Supports both audio and video uploads for ultra-fast transcription

  • 1-Click installers for Windows local PC, RunPod (Linux-Cloud) and Massed Compute (Linux-Cloud)

  • The app and the installers are made for RTX 1000 series to RTX 5000 series GPUs, with pre-compiled libraries

  • We install with Torch 2.8, CUDA 12.9, and the latest Flash Attention, Sage Attention, and xFormers, all precompiled

  • GPUs with as little as 6 GB of VRAM can run it

  • OpenAI Whisper Supported Models (auto downloaded into models sub folder):

    • tiny.en, tiny, base.en, base, small.en, small, medium.en, medium, large-v1, large-v2, large-v3, large, large-v3-turbo, turbo
  • Distil-Whisper Supported Models for Faster-Whisper & Insanely-Fast-Whisper (auto downloaded into models sub folder):

    • distil-large-v2, distil-large-v3, distil-medium.en, distil-small.en
  • Supported transcription output formats

    • SRT (SubRip) - .srt, VTT/WebVTT (Web Video Text Tracks) - .vtt, TXT (Plain Text) - .txt
    • LRC (Lyrics File) - .lrc, JSON - .json, TSV (Tab-Separated Values) - .tsv
  • Batch folder processing and multiple output formats at once

  • Supported languages for Audio to Text transcription are listed below

    • Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Bashkir, Basque, Belarusian, Bengali, Bosnian, Breton, Bulgarian, Cantonese, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Lao, Latin, Latvian, Lingala, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Myanmar, Nepali, Norwegian, Nynorsk, Occitan, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Sanskrit, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tagalog, Tajik, Tamil, Tatar, Telugu, Thai, Tibetan, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, Yiddish, Yoruba
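
To make two of the features above more concrete (saving each transcription under a sanitized version of the input file name, and writing SRT output), here is a minimal Python sketch. It is not the app's actual code. The helper names sanitize_stem, srt_timestamp and to_srt are hypothetical, and the (start, end, text) segment shape is an assumption made only for illustration.

```python
import re
from pathlib import Path


def sanitize_stem(filename: str) -> str:
    """Hypothetical helper: derive a safe output name from an input file name."""
    stem = Path(filename).stem
    # Replace characters that Windows does not allow in file names.
    stem = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", stem)
    # Collapse whitespace runs and trim trailing dots/spaces.
    stem = re.sub(r"\s+", " ", stem).strip(" .")
    return stem or "transcription"


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples (assumed shape)."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)
```

For example, an input named "Lecture 01: Intro?.mp4" would map to the output stem "Lecture 01_ Intro_", to which an extension such as .srt can then be appended.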

