---
library_name: transformers
tags: []
---
# Audio segmentation powered by speaker diarization
```bash
git clone https://github.com/nguyenvulebinh/audio-seg-diarization.git
cd audio-seg-diarization && pip install -r requirements.txt
```
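```python
from src.pyanet.pyanet_model import PyanNet
from src.utils import segmentor
import torch
import torchaudio

# Load the pretrained segmentation model and switch it to inference mode
segmentation_model = PyanNet.from_pretrained("nguyenvulebinh/audio-seg-diarization").eval()
if torch.cuda.is_available():
    segmentation_model = segmentation_model.cuda()

# torchaudio.load returns a (channels, samples) tensor and the sample rate
wav_path = "./resource/example.wav"
wav, rate = torchaudio.load(wav_path)

# Each entry has 'start', 'end' and 'segments' keys, as in the sample output below
segments = segmentor(segmentation_model, wav, max_duration=25)
# [{'start': 9568.527218750001, 'end': 9572.66159375, 'segments': [(9568.527218750001, 9572.66159375)]}]

# Slice the first channel into one waveform per segment (start/end are multiplied by the sample rate)
segments_wavs = [wav[0, int(seg['start'] * rate):int(seg['end'] * rate)] for seg in segments]
```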
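The slicing step above can be made a little more defensive. The following is a minimal sketch, not part of the repository: `segment_sample_ranges` is a hypothetical helper, and it assumes that `start` and `end` are times in seconds, as the multiplication by `rate` in the example suggests. It converts each segment to a pair of sample indices, clamps them to the length of the audio, and skips segments that end up empty.

```python
def segment_sample_ranges(segments, rate, num_samples):
    """Convert segment times to (start, end) sample index pairs.

    Hypothetical helper: assumes each segment is a dict whose 'start' and
    'end' values are in seconds, matching the sample output shown above.
    """
    ranges = []
    for seg in segments:
        # Clamp both ends so the slice never runs outside the waveform
        start = max(0, int(seg["start"] * rate))
        end = min(num_samples, int(seg["end"] * rate))
        # Drop segments that fall entirely outside the audio
        if end > start:
            ranges.append((start, end))
    return ranges


# Illustrative values only: 3 seconds of audio sampled at 16 kHz
example_segments = [
    {"start": 1.5, "end": 2.0},
    {"start": 2.5, "end": 4.0},  # runs past the end, so it gets clamped
    {"start": 5.0, "end": 6.0},  # entirely outside, so it gets skipped
]
example_ranges = segment_sample_ranges(example_segments, rate=16000, num_samples=48000)
```

With the tensors from the example, the ranges could then be applied as `[wav[0, s:e] for s, e in segment_sample_ranges(segments, rate, wav.shape[1])]`.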