

language: lo
tags:
- audio
- automatic-speech-recognition
- wav2vec2
- lao
license: apache-2.0
model-index:
- name: Wav2Vec2 Lao Fine-tuned
  results: []

Wav2Vec2 Lao Fine-tuned

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m for Lao automatic speech recognition (ASR). It was trained on Lao speech data (SiangLao or similar datasets).

Intended Use

  • Lao language ASR tasks
  • Research in low-resource language modeling

Training Details

  • Base model: facebook/wav2vec2-xls-r-300m
  • Framework: Hugging Face Transformers
  • Fine-tuned on: Lao speech dataset
  • Tokenizer and processor: see wav2vec2-lao-processor
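
The card does not describe how the tokenizer vocabulary was built. As a rough sketch only, CTC fine-tuning of Wav2Vec2 models commonly uses a character-level vocabulary derived from the training transcripts. The function name `build_ctc_vocab`, the sample transcripts, and the `[UNK]`/`[PAD]` token names below are illustrative assumptions, not details taken from this model.

```python
import json

def build_ctc_vocab(transcripts):
    """Build a character-level vocabulary for a CTC tokenizer (illustrative)."""
    # Collect every distinct character that appears in the transcripts.
    chars = sorted(set("".join(transcripts)))
    vocab = {ch: idx for idx, ch in enumerate(chars)}
    # Wav2Vec2 CTC tokenizers conventionally use "|" as the word delimiter,
    # so a literal space (if present) is remapped to it.
    if " " in vocab:
        vocab["|"] = vocab.pop(" ")
    # Append the unknown and padding tokens; the padding token typically
    # doubles as the CTC blank.
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab

# Hypothetical sample transcripts, used only to exercise the function.
transcripts = ["ສະບາຍດີ", "ຂອບໃຈ"]
vocab = build_ctc_vocab(transcripts)
print(json.dumps(vocab, ensure_ascii=False))
```

A dictionary like this would normally be saved as JSON and passed to a CTC tokenizer. The actual vocabulary for this model is whatever ships with wav2vec2-lao-processor.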

How to Use

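from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC
import torch
import torchaudio

# Replace "YourUsername" with the account that hosts the processor and model
processor = Wav2Vec2Processor.from_pretrained("YourUsername/wav2vec2-lao-processor")
model = Wav2Vec2ForCTC.from_pretrained("YourUsername/wav2vec2-lao-finetuned")

# Load audio; torchaudio returns a (channels, samples) tensor
waveform, sample_rate = torchaudio.load("your_audio.wav")

# Downmix to mono and resample to the rate the processor expects (16 kHz for XLS-R models)
waveform = waveform.mean(dim=0)
target_rate = processor.feature_extractor.sampling_rate
if sample_rate != target_rate:
    waveform = torchaudio.functional.resample(waveform, sample_rate, target_rate)

# Preprocess
inputs = processor(waveform.numpy(), sampling_rate=target_rate, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Decode: take the most likely token per frame, then let the tokenizer collapse the CTC output
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)
print(transcription[0])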
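
To make the decode step less opaque, here is a minimal, dependency-free sketch of greedy CTC decoding, which is roughly what the argmax followed by `batch_decode` does: collapse consecutive repeated ids, then drop the blank token. The function name `ctc_greedy_decode`, the toy id-to-character table, and the blank id of 0 are assumptions for illustration. The real tokenizer also handles word delimiters and special tokens.

```python
def ctc_greedy_decode(ids, id_to_char, blank_id):
    """Collapse repeated ids and remove blanks (greedy CTC decoding)."""
    out = []
    prev = None
    for i in ids:
        # Emit a character only when the id changes and is not the blank.
        if i != prev and i != blank_id:
            out.append(id_to_char[i])
        prev = i
    return "".join(out)

# Toy vocabulary: id 0 plays the role of the CTC blank (padding) token.
id_to_char = {0: "<pad>", 1: "ກ", 2: "າ"}

# Per-frame argmax ids: the blank between the two 1s keeps both characters.
frames = [1, 1, 0, 1, 2, 2, 0]
print(ctc_greedy_decode(frames, id_to_char, blank_id=0))
```

The blank token is what lets CTC output genuinely repeated characters: without a blank between them, adjacent identical ids are merged into one.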