WenwanChen / Random-delay-LibriMix

Create a new overlapped Speech dataset based on LibriSpeech

Random-delay-LibriMix

Edited by zhyin on Dec 23, 2020

  • Description : Create a new overlapped Speech dataset based on LibriSpeech
  • Principle:
  1. Set the loudness of each voice. A linear transformation is used in this process.

  2. Set the delay between voices randomly. The min_delay is 0.5 sec, and the max_delay is the length of the last voice. The mixing process for three voices can be expressed as:

                ---------------------               [voice1]    
                |<-0.5sec->|-------------           [voice2]    
                |<-- max_delay -->|---------------- [voice3]     
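The two steps above can be sketched in plain Python. This is a minimal illustration, not the repository's implementation: the names `scale` and `random_delay_mix`, the 16 kHz sample rate, and the reading of "the last voice" as the previous voice are all assumptions.

```python
import random

SAMPLE_RATE = 16000   # assumption: 16 kHz, as suggested by the wav16k output folder
MIN_DELAY_SEC = 0.5   # minimum delay stated in the text


def scale(voice, gain):
    """Linear loudness change: multiply every sample by a constant gain."""
    return [sample * gain for sample in voice]


def random_delay_mix(voices, gains, rng):
    """Mix voices so that each one starts after the previous one by a random delay.

    The delay is drawn between 0.5 s and the length of the previous voice
    (an assumed reading of max_delay). Returns (mixture, start offsets in samples).
    """
    min_delay = int(MIN_DELAY_SEC * SAMPLE_RATE)
    starts = [0]
    for previous in voices[:-1]:
        # Guard against voices shorter than the minimum delay.
        max_delay = max(min_delay, len(previous))
        starts.append(starts[-1] + rng.randint(min_delay, max_delay))

    total = max(start + len(voice) for start, voice in zip(starts, voices))
    mixture = [0.0] * total
    for start, voice, gain in zip(starts, voices, gains):
        for i, sample in enumerate(scale(voice, gain)):
            mixture[start + i] += sample
    return mixture, starts


# Three constant-valued dummy "voices" of 1.0 s, 1.5 s and 0.75 s.
rng = random.Random(0)
voices = [[1.0] * 16000, [1.0] * 24000, [1.0] * 12000]
mixture, starts = random_delay_mix(voices, [0.5, 0.8, 1.0], rng)
print(starts, len(mixture))
```

Passing a seeded `random.Random` keeps the drawn delays reproducible, which is convenient when the delays also have to be written to a metadata file.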
    
  • Use:
    • [1] Description
      Create the metadata for a subset of LibriSpeech.
      [speaker_ID]+[sex]+[subset]+[length]+[origin_path]

             python create_librispeech_metadata.py  --librispeech_dir 
      
    • Example

      python create_librispeech_metadata.py  --librispeech_dir D:\OSR\2020-12\LibriMix\scripts\LibriSpeech
      
    • Result

       D:\OSR\2020-12\LibriMix\scripts\LibriSpeech\metadata\test-clean.csv
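As a rough illustration of the row format described in step [1], the sketch below builds one metadata row and serializes it as CSV. It is not the script's actual code: the helpers `build_row` and `rows_to_csv` are hypothetical, the length is assumed to be a sample count, and the example values are made up. The only outside fact relied on is the standard LibriSpeech layout `subset/speaker/chapter/utterance.flac`.

```python
import csv
import io
from pathlib import PurePosixPath

# Column order taken from the description in step [1].
FIELDS = ["speaker_ID", "sex", "subset", "length", "origin_path"]


def build_row(relative_path, n_samples, speaker_sex):
    """Derive one metadata row from a path relative to the LibriSpeech root."""
    parts = PurePosixPath(relative_path).parts
    subset, speaker_id = parts[0], parts[1]
    return {
        "speaker_ID": speaker_id,
        "sex": speaker_sex[speaker_id],
        "subset": subset,
        "length": n_samples,          # assumed to be a sample count
        "origin_path": relative_path,
    }


def rows_to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up values for illustration only, not real dataset entries.
row = build_row("test-clean/0001/0002/0001-0002-0000.flac", 160000, {"0001": "F"})
print(rows_to_csv([row]))
```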
      
    • [2] Description
      Create the metadata for the overlapped speech
      mixture-[mixture_ID]+[source_x_path]+[source_x_gain] for x in range(n_src)
      info-[mixture_ID]+[speaker_x_ID]+[speaker_x_sex] for x in range(n_src)

       python create_librimix_metadata.py --librispeech_dir --librispeech_md_dir --n_src
      
    • Example:

      python create_librimix_metadata.py --librispeech_dir D:\OSR\2020-12\LibriMix\scripts\LibriSpeech --librispeech_md_dir D:\OSR\2020-12\LibriMix\scripts\LibriSpeech\metadata --n_src 3
      
    • Result:

      D:\OSR\2020-12\LibriMix\scripts\Libri3Mix_metadata\libri3mix_test-clean.csv
      D:\OSR\2020-12\LibriMix\scripts\Libri3Mix_metadata\libri3mix_test-clean_info.csv
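To make the "for x in range(n_src)" notation in step [2] concrete, the sketch below expands it into the column headers of the two CSV files. The function names are hypothetical, and numbering the sources from 1 (as the upstream LibriMix scripts do) is an assumption about this repository's output.

```python
def mixture_columns(n_src):
    """Header of the mixture file: one path and one gain per source."""
    cols = ["mixture_ID"]
    for x in range(n_src):
        # Assumption: sources are numbered from 1 in the column names.
        cols += [f"source_{x + 1}_path", f"source_{x + 1}_gain"]
    return cols


def info_columns(n_src):
    """Header of the info file: one speaker ID and one sex per source."""
    cols = ["mixture_ID"]
    for x in range(n_src):
        cols += [f"speaker_{x + 1}_ID", f"speaker_{x + 1}_sex"]
    return cols


print(mixture_columns(3))
print(info_columns(3))
```

With `n_src = 3`, each file therefore has seven columns: the shared `mixture_ID` plus two per source.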
      
    • [3] Description
      Create 3-speaker overlapped speech
      Create the metadata for this overlapped speech
      [mixture_ID]+[mixture_path]+[source_x_path]+[length]+[source_x_delay] for x in range(n_src)

       python create_libri2mix_random_delay.py --librispeech_dir --metadata_dir --n_src --modes 
      
    • Example1:

      D:\OSR\2020-12\LibriMix\scripts>python create_libri2mix_random_delay.py --librispeech_dir D:\OSR\2020-12\LibriMix\scripts\LibriSpeech --metadata_dir D:\OSR\2020-12\LibriMix\scripts\Libri3Mix_metadata --n_src 2 --modes move
      
    • Example2:

      D:\OSR\2020-12\LibriMix\scripts>python create_libri2mix_random_delay.py --librispeech_dir D:\OSR\2020-12\LibriMix\scripts\LibriSpeech --metadata_dir D:\OSR\2020-12\LibriMix\scripts\Libri3Mix_metadata --n_src 3 --modes move
      
    • Result:

      D:\OSR\2020-12\LibriMix\scripts\Libri2Mix-randomdelay\wav16k\move\test\mix_clean\mixture_ID.wav
      D:\OSR\2020-12\LibriMix\scripts\Libri2Mix-randomdelay\wav16k\move\metadata\mixture_test_mix_clean.csv
      
