AkojimaSLP / Neural-mask-estimation

Dataset

dreamibor opened this issue

Hi, is there a way to create the training dataset? I mean, what approach do you take to get separate speech and noise data?

Hi, thank you for your question.

  1. How the training data is created
    Training data is generated by choosing audio from ./dataset/train/noise/ and ./dataset/train/speech/* respectively. The two signals are mixed at a randomly chosen SNR and reverberation time. In the script "train.py", the simulated speech is generated on the fly without writing files to the HDD (with a large amount of training data, the disk capacity would be insufficient).

  2. Separate speech and noise data
    As you know, this approach needs a parallel corpus (separate noise and speech recordings). Research often uses the CHiME corpus.
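
For illustration, the mixing described in item 1 could be sketched roughly as below. This is not the code from train.py. The function names, the SNR and reverberation-time ranges, and the toy impulse response (exponentially decaying white noise) are all assumptions made for the example; a real setup would more likely use measured or simulated room impulse responses.

```python
import numpy as np

def synthetic_rir(rt60, sr, rng):
    """Toy room impulse response (assumption, not the repository's method):
    white noise whose amplitude envelope reaches -60 dB at t = rt60 seconds."""
    n = int(rt60 * sr)
    t = np.arange(n) / sr
    envelope = np.exp(-3.0 * np.log(10.0) * t / rt60)
    rir = rng.standard_normal(n) * envelope
    return rir / np.max(np.abs(rir))

def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so that speech power / noise power equals snr_db, then add."""
    # Repeat or truncate the noise so it matches the speech length.
    noise = np.resize(noise, len(speech))
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    scaled_noise = gain * noise
    return speech + scaled_noise, scaled_noise

def simulate_example(speech, noise, sr, rng,
                     snr_range=(0.0, 20.0), rt60_range=(0.2, 0.6)):
    """Draw a random SNR and reverberation time and build one training mixture
    in memory. The ranges are hypothetical defaults."""
    snr_db = rng.uniform(*snr_range)
    rt60 = rng.uniform(*rt60_range)
    rir = synthetic_rir(rt60, sr, rng)
    # Reverberate the speech and keep the original length.
    reverberant = np.convolve(speech, rir)[:len(speech)]
    mixture, scaled_noise = mix_at_snr(reverberant, noise, snr_db)
    return mixture, reverberant, scaled_noise, snr_db, rt60
```

Because everything is returned as arrays, a training loop can call `simulate_example` for each batch and never write the mixtures to disk, which matches the on-the-fly generation described above. The separate `reverberant` and `scaled_noise` outputs are the kind of parallel signals that item 2 refers to.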

Regards,

Thank you for your response! I think your answer solved my problem and I will close the issue.