idnavid / spkr_diarization

Alveo Speaker Diarization project - Navid's version


Speaker diarization component for the Alveo project. The role of the diarization tool is to segment long speech files into smaller chunks. The output labels will be used as a benchmark for human transcribers.

Navid Shokouhi July 2017

Packages:

  • Spro (for feature extraction)
  • AudioSeg (Diarization binaries)
  • Python:
    • numpy, scipy

Installation guide:

  • Installing Spro 4.0
    cd spro-directory
    ./configure
    make 
    make install 
    

NOTE: when installing on Mac, use Spro 5.0

  • Installing AudioSeg:
    cd audioseg-directory
    ./configure --with-spro=[path-to-spro-directory]
    make
    make install
    

diar.py

The main module is diar.py, which also contains an example script. To use it from Python, import diar and call diar.diarization(root_dir, wavname, ubmname, out_dir), where:

  root_dir: root directory of experiment. 
  wavname: full path to wave file on disk. 
  ubmname: full path to pretrained UBM on disk. 
  out_dir: output directory, to store intermediate files. 
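
A minimal sketch of calling the function with the four arguments listed above. The wrapper name `run_diarization`, its `diarize` parameter, and the up-front path checks are assumptions added for illustration and are not part of diar.py. It is also an assumption that out_dir may need to be created before the call, since this README does not say whether diar.py creates it.

```python
import os


def run_diarization(root_dir, wavname, ubmname, out_dir, diarize=None):
    """Check the inputs, then call diar.diarization with the documented arguments.

    `diarize` is a hypothetical hook that lets a stand-in callable be
    supplied; by default the real diar.diarization is used.
    """
    # wavname and ubmname are documented as full paths to files on disk,
    # so fail early if either one is missing.
    for path in (wavname, ubmname):
        if not os.path.isfile(path):
            raise FileNotFoundError(path)

    # out_dir stores intermediate files. Creating it here is an assumption.
    os.makedirs(out_dir, exist_ok=True)

    if diarize is None:
        # Imported lazily so the wrapper loads even when diar.py is not on the path.
        import diar
        diarize = diar.diarization

    return diarize(root_dir, wavname, ubmname, out_dir)
```

With diar.py on the Python path, a call would look like `run_diarization(root_dir, wavname, ubmname, out_dir)`, with each argument pointing at your own experiment directory, wave file, pretrained UBM, and output directory.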

Examples:

A working example for Austalk data is available in experiments.
