yanchaomars / jsalt2019-diadet

Repository of recipes for the JSALT2019 workshop on "Speaker Detection in Adverse Scenarios with a Single Microphone"

Cloning the repo

  • To clone the repo, run:
git clone --recursive https://github.com/jsalt2019-diadet/jsalt2019-diadet.git
  • The --recursive option also downloads the dependencies that are registered as submodules:

    • hyperion: Python code for the speaker detection back-end
  • If you want to update the submodules to their latest commit, run

cd jsalt2019-diadet
git submodule sync
git submodule update --init --recursive --remote
  • Dependencies are downloaded in
jsalt2019-diadet/tools
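
A quick way to confirm that the recursive clone populated the submodules is to inspect the output of git submodule status, which prefixes each submodule that has not been initialized with a "-". The sketch below assumes that output format; list_uninitialized is a hypothetical helper name and is not part of the repo.

```shell
# list_uninitialized: hypothetical helper, not part of the repo.
# Reads `git submodule status` output on stdin and prints the path of
# every submodule whose status line starts with "-" (not initialized).
list_uninitialized() {
  # Field 1 is the (possibly prefixed) commit hash, field 2 is the path.
  awk '/^-/ { print $2 }'
}
```

From inside jsalt2019-diadet it could be used as git submodule status --recursive | list_uninitialized. Empty output would suggest that every submodule is checked out.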

Other Dependencies:

  • The recipes also depend on Anaconda3.5, Kaldi, cuDNN, etc.

  • Recommended: use the versions of the dependencies that are preinstalled on the grid, so that each person does not have to maintain their own copy.

    • To create links to preinstalled kaldi, anaconda and cudnn, run:
    cd jsalt2019-diadet/
    ./make_clsp_links.sh
    • The linked Anaconda installation provides several environments:
      • base: numpy, h5py, pandas, etc.
      • tensorflow1.8g_cpu: TensorFlow 1.8 for CPU
      • tensorflow1.8g_gpu: TensorFlow 1.8 for GPU
      • pytorch1.0_cuda9.0: PyTorch 1.0 with CUDA 9.0
      • pyannote: Python 3.6 with pyannote-metrics installed
  • Anaconda3.5:

    • Make a link to your anaconda installation in the tools directory:
    cd jsalt2019-diadet/tools/anaconda
    ln -s <your-anaconda-3.5> anaconda3.5
    • Or follow the instructions in jsalt2019-diadet/tools/anaconda/full_install.sh to install Anaconda from scratch
  • Kaldi speech recognition toolkit

    • Make a link to an existing Kaldi installation:
    cd jsalt2019-diadet/tools/kaldi
    ln -s <your-kaldi> kaldi
    • Or follow the instructions in jsalt2019-diadet/tools/anaconda/install_kaldi.sh to install Kaldi from scratch
  • cuDNN: TensorFlow and PyTorch each need a compatible version of cuDNN

    • Make a link to an existing cuDNN version that matches the requirements of your TensorFlow or PyTorch, e.g.:
    cd jsalt2019-diadet/tools/cudnn
    # cuDNN v7.4 for CUDA 9.0, needed by PyTorch 1.0
    ln -s /home/janto/usr/local/cudnn-9.0-v7.4 cudnn-9.0-v7.4
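
After creating the links, it can help to check that each one resolves before running a recipe. The following is a minimal sketch, assuming the link names used in the manual steps above; check_tool_links is a hypothetical helper and is not provided by the repo.

```shell
# check_tool_links: hypothetical helper, not part of the repo.
# Usage: check_tool_links ROOT PATH...
# Prints one status line per PATH (relative to ROOT) and returns
# non-zero if any of them is missing.
check_tool_links() {
  root="$1"
  shift
  missing=0
  for path in "$@"; do
    # -e follows symlinks, so a dangling link is reported as missing.
    if [ -e "$root/$path" ]; then
      echo "ok       $path"
    else
      echo "MISSING  $path"
      missing=1
    fi
  done
  return "$missing"
}
```

For example: check_tool_links jsalt2019-diadet tools/anaconda/anaconda3.5 tools/kaldi/kaldi tools/cudnn/cudnn-9.0-v7.4. Adjust the paths if your links are named differently.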

Directory structure:

  • The directory structure of the repo looks like this:
./jsalt2019-diadet
./jsalt2019-diadet/tools
./jsalt2019-diadet/tools/anaconda
./jsalt2019-diadet/tools/anaconda/anaconda3
./jsalt2019-diadet/tools/cudnn
./jsalt2019-diadet/tools/cudnn/cudnn-9.0-v7.4
./jsalt2019-diadet/tools/kaldi
./jsalt2019-diadet/tools/kaldi/kaldi
./jsalt2019-diadet/tools/hyperion
./jsalt2019-diadet/tools/hyperion/hyperion
./jsalt2019-diadet/tools/speech_denoising_tools
./jsalt2019-diadet/egs
./jsalt2019-diadet/egs/jsalt2019-diadet
./jsalt2019-diadet/egs/jsalt2019-diadet/v1
./jsalt2019-diadet/src
  • Directories:
    • tools: contains external repos and tools like kaldi, python, pyannote, hyperion, cudnn, etc.
    • src: holds code written specifically for this repo.
      • src/kaldi_augmentation: some scripts to perform data augmentation using the wav-reverberate kaldi tool
    • egs: contains the recipes
      • egs/jsalt2019-diadet: recipe for speaker diarization/detection/tracking for all datasets that we use in the workshop.
        • v1: Version 1 is based on kaldi x-vectors
      • egs/sitw_noisy: recipe for SITW with added noise and reverberation in the dev/eval test sets. Used to measure the performance of enhancement methods across noise types, noise levels, and RT60 reverberation times.
        • v1: Based on kaldi x-vectors.
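
Given the egs/<recipe>/<version> layout described above, the available recipe versions can be enumerated with a small loop. This is only a sketch under that layout assumption; list_recipes is a hypothetical helper name and is not part of the repo.

```shell
# list_recipes: hypothetical helper, not part of the repo.
# Usage: list_recipes [ROOT]
# Prints every egs/<recipe>/v* directory under ROOT as "<recipe>/<version>".
list_recipes() {
  root="${1:-.}"
  for dir in "$root"/egs/*/v*/; do
    # When nothing matches, the glob stays literal; skip it.
    [ -d "$dir" ] || continue
    dir="${dir%/}"
    echo "${dir#"$root"/egs/}"
  done
}
```

Running list_recipes jsalt2019-diadet from the parent directory of the clone prints one line per recipe version found.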

License: Apache License 2.0

