Yehor/w2v-bert-uk-v2.1-fp16

Automatic Speech Recognition

Yehor committed on Mar 30

Commit 873e61f · verified · 1 parent: c4d3ae4

Update README.md

Files changed (1)

README.md +1 -70

@@ -46,74 +46,5 @@ See other Ukrainian models: https://github.com/egorsmkv/speech-recognition-uk
 ## Overview
-This is a next model of https://huggingface.co/Yehor/w2v-bert-uk
-## Metrics
-- AM (F16):
-  - WER: 0.1734 metric, 17.34%
-  - CER: 0.0333 metric, 3.33%
-  - Accuracy on words: 82.66%
-  - Accuracy on chars: 96.67%
-## Demo
-Use https://huggingface.co/spaces/Yehor/w2v-bert-uk-v2.1-demo space to see how the model works with your audios.
-## Usage
-```python
-# pip install -U torch soundfile transformers
-import torch
-import soundfile as sf
-from transformers import AutoModelForCTC, Wav2Vec2BertProcessor
-# Config
-model_name = 'Yehor/w2v-bert-uk-v2.1'
-device = 'cuda:0' # or cpu
-torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32
-sampling_rate = 16_000
-# Load the model
-asr_model = AutoModelForCTC.from_pretrained(model_name, torch_dtype=torch_dtype).to(device)
-processor = Wav2Vec2BertProcessor.from_pretrained(model_name)
-paths = [
-  'sample1.wav',
-]
-# Extract audio
-audio_inputs = []
-for path in paths:
-  audio_input, _ = sf.read(path)
-  audio_inputs.append(audio_input)
-# Transcribe the audio
-inputs = processor(audio_inputs, sampling_rate=sampling_rate).input_features
-features = torch.tensor(inputs).to(device)
-with torch.inference_mode():
-  logits = asr_model(features).logits
-predicted_ids = torch.argmax(logits, dim=-1)
-predictions = processor.batch_decode(predicted_ids)
-# Log results
-print('Predictions:')
-print(predictions)
-```
-## Cite this work
-```
-@misc {smoliakov_2025,
-	author       = { {Smoliakov} },
-	title        = { w2v-bert-uk-v2.1 (Revision 094c59d) },
-	year         = 2025,
-	url          = { https://huggingface.co/Yehor/w2v-bert-uk-v2.1 },
-	doi          = { 10.57967/hf/4554 },
-	publisher    = { Hugging Face }
-}
-```

+This is the model - https://huggingface.co/Yehor/w2v-bert-uk-v2.1 - where tensors are saved in fp16 format.
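
The added README line says the tensors are stored in fp16. As a rough illustration of what that implies for storage, the sketch below packs a few values as IEEE 754 half precision (2 bytes each) and as single precision (4 bytes each) using Python's standard `struct` module. The `weights` list and the helper names are hypothetical and chosen only for illustration; this is not how the checkpoint in this repository was produced.

```python
import struct


def to_fp32_bytes(values):
    """Pack floats as little-endian IEEE 754 single precision (4 bytes each)."""
    return struct.pack(f"<{len(values)}f", *values)


def to_fp16_bytes(values):
    """Pack floats as little-endian IEEE 754 half precision (2 bytes each)."""
    return struct.pack(f"<{len(values)}e", *values)


def from_fp16_bytes(data):
    """Unpack a buffer of half-precision values back into Python floats."""
    return list(struct.unpack(f"<{len(data) // 2}e", data))


# Hypothetical weight values, not taken from the model.
weights = [0.5, -1.25, 0.1, 3.0]

fp32 = to_fp32_bytes(weights)
fp16 = to_fp16_bytes(weights)
restored = from_fp16_bytes(fp16)

# Half precision needs half the bytes of single precision.
print(len(fp32), len(fp16))

# Values that fp16 cannot represent exactly (such as 0.1) come back rounded.
print(restored)
```

The same trade-off applies to a checkpoint as a whole: storing tensors in fp16 roughly halves the file size compared with fp32, at the cost of a small rounding error per value.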