nik-fedorov / dla_hw2

Speaker separation pipeline

Report

All details about this homework can be found in the wandb report.

Description

This is a repository containing a convenient pipeline for training speaker separation models.

Advantages of this repo:

  • the experiment configuration lives in a single JSON file, so changing an experiment means editing only that file
  • clear code structure (all pipeline components are in the ss folder)
  • ready-made scripts for training and evaluating models
  • a downloadable pretrained checkpoint
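
Because an experiment is described by one JSON file, variants of a run can be generated programmatically instead of edited by hand. The following is a minimal sketch, assuming the config is a nested JSON object; the keys `optimizer`, `args` and `lr` are hypothetical and are not taken from this repository's configs.

```python
import copy
import json
from pathlib import Path


def override(config, dotted_key, value):
    """Return a copy of config with the nested key 'a.b.c' set to value."""
    result = copy.deepcopy(config)
    node = result
    *parents, leaf = dotted_key.split(".")
    for name in parents:
        # Create intermediate objects if the key path does not exist yet.
        node = node.setdefault(name, {})
    node[leaf] = value
    return result


def write_variant(src_path, dst_path, overrides):
    """Load a JSON config, apply {dotted_key: value} overrides, and save it."""
    config = json.loads(Path(src_path).read_text())
    for key, value in overrides.items():
        config = override(config, key, value)
    Path(dst_path).write_text(json.dumps(config, indent=2))
    return config


if __name__ == "__main__":
    # Hypothetical example config; the real configs in this repo may differ.
    base = {"optimizer": {"type": "Adam", "args": {"lr": 1e-3}}}
    print(override(base, "optimizer.args.lr", 3e-4))
```

The file written by `write_variant` can then be passed to the training or evaluation script through the `-c` option.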

Installation guide

To set up the environment for this repository, run the following command in your terminal (with your virtual environment activated):

pip install -r ./requirements.txt

Evaluate model

To download my best checkpoint, run:

python default_test_model/download_best_ckpt.py

If you are interested in how I obtained this checkpoint, see the wandb report.

You can evaluate a model using the test.py script. Here is an example command that runs my best checkpoint with the default test config:

python test.py \
  -c default_test_model/config.json \
  -r default_test_model/checkpoint.pth \
  -t test_data \
  -o output_dir

After the command finishes, output_dir will contain the audio files with separated speech and a metrics.json file with the metrics.
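
To inspect the results without opening the file by hand, metrics.json can be read with a few lines of Python. This is a sketch only: it assumes metrics.json holds a flat JSON object mapping metric names to values, which is not confirmed here, and the helper names `load_metrics` and `format_metrics` are illustrative.

```python
import json
from pathlib import Path


def load_metrics(output_dir):
    """Read metrics.json from the evaluation output directory."""
    metrics_path = Path(output_dir) / "metrics.json"
    with metrics_path.open() as f:
        return json.load(f)


def format_metrics(metrics):
    """Render a flat {name: value} mapping as one 'name: value' line per metric."""
    lines = []
    for name, value in sorted(metrics.items()):
        # Floats are shown with four decimals; other values are printed as-is.
        shown = f"{value:.4f}" if isinstance(value, float) else str(value)
        lines.append(f"{name}: {shown}")
    return "\n".join(lines)


if __name__ == "__main__":
    out = Path("output_dir")  # the directory passed to test.py via -o
    if (out / "metrics.json").exists():
        print(format_metrics(load_metrics(out)))
```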

Training

Use train.py for training. Example command to launch training from scratch:

python train.py -c hw_asr/configs/config_librispeech.json

To fine-tune from a checkpoint, use the -r option to pass the path to the checkpoint file:

python train.py \
  -c hw_asr/configs/config_librispeech.json \
  -r saved/models/<exp name>/<run name>/checkpoint.pth
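
When several runs have accumulated, picking the checkpoint to resume from can be automated. The sketch below assumes checkpoints are stored as .pth files under saved/models/<exp name>/<run name>/, as the command above suggests; the helper `latest_checkpoint` is hypothetical and not part of this repository.

```python
from pathlib import Path


def latest_checkpoint(root="saved/models"):
    """Return the most recently modified .pth file two levels below root, or None."""
    # Pattern mirrors the assumed layout: <exp name>/<run name>/<file>.pth
    candidates = list(Path(root).glob("*/*/*.pth"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    ckpt = latest_checkpoint()
    print(ckpt if ckpt is not None else "no checkpoints found")
```

The printed path can be passed to train.py through the -r option.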

About

License: MIT License