---
library_name: transformers
tags: []
---

# Audio segmentation powered by speaker diarization

Clone the repository and install its dependencies:

```bash
git clone https://github.com/nguyenvulebinh/audio-seg-diarization.git
cd audio-seg-diarization && pip install -r requirements.txt
```

Load the pretrained model and segment an audio file:

```python
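from src.pyanet.pyanet_model import PyanNet
from src.utils import segmentor
import torch
import torchaudio

# Load the pretrained segmentation model in inference mode.
segmentation_model = PyanNet.from_pretrained("nguyenvulebinh/audio-seg-diarization").eval()

# Move the model to the GPU when one is available.
if torch.cuda.is_available():
    segmentation_model = segmentation_model.cuda()

# torchaudio.load returns a (channels, samples) tensor and the sample rate.
wav_path = "./resource/example.wav"
wav, rate = torchaudio.load(wav_path)

segments = segmentor(segmentation_model, wav, max_duration=25)

# Example output (times in seconds):
# [{'start': 9568.527218750001, 'end': 9572.66159375, 'segments': [(9568.527218750001, 9572.66159375)]}]

# Cut each segment out of the first channel by converting seconds to sample indices.
segments_wavs = [wav[0, int(seg['start'] * rate):int(seg['end'] * rate)] for seg in segments]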
```
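
The slicing step above converts each segment's `start` and `end` (in seconds) to sample indices by multiplying by the sample rate. As a minimal sketch of that conversion, the helper below also clamps the indices to the length of the waveform. The function name `segment_sample_range` and its parameters are hypothetical and not part of the repository; it only assumes segments shaped like the example output shown in the usage snippet.

```python
def segment_sample_range(seg, rate, num_samples):
    """Convert a segment's start/end times (seconds) to clamped sample indices.

    Hypothetical helper, not part of the audio-seg-diarization repository.
    """
    # Seconds * samples-per-second gives a sample index; never go below 0.
    start = max(0, int(seg["start"] * rate))
    # Never run past the end of the waveform.
    end = min(num_samples, int(seg["end"] * rate))
    # Guarantee end >= start so the resulting slice is empty rather than reversed.
    return start, max(start, end)


example = {"start": 1.5, "end": 2.25}
print(segment_sample_range(example, rate=16000, num_samples=30000))
# prints (24000, 30000)
```

With the variables from the usage snippet, this could be applied as `start, end = segment_sample_range(seg, rate, wav.shape[1])` followed by `wav[0, start:end]`.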