How to split the mel spectrogram in multi-band WaveRNN?

Question

Shijie-Liu007 opened this issue 4 years ago

Hey all!

I want to implement multi-band WaveRNN based on the fatchord version. During training, I split the audio samples into 4 subbands using an analysis filter, but how should I split the mel spectrogram so that it corresponds to the audio subbands? Can I simply divide it into four parts in order? Opinions and ideas about multi-band WaveRNN would be greatly appreciated!

Reference: DurIAN: Duration Informed Attention Network For Multimodal Synthesis https://arxiv.org/abs/1909.01700

z.q.mao · Answer 1 · Mon Dec 21 2020 11:34:54 GMT+0800 (China Standard Time)

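@Shijie-Liu007 I don't think the mel spectrogram needs to be split. Previously one frame corresponded to 256 audio samples; after switching to 4 bands, one frame only needs to correspond to 64 samples. That's my understanding: the mel spectrogram is the same input as before, and only the upsampling factor becomes 1/4 of the original.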
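To make the arithmetic concrete, here is a minimal sketch of that idea. It assumes a hop length of 256 and 4 bands as stated above; the function name `upsample_mel`, the 80 mel bins, and the nearest-neighbour repetition (standing in for whatever upsampling network the model actually uses) are illustrative assumptions, not part of the fatchord code.

```python
import numpy as np

def upsample_mel(mel, hop_length=256, n_bands=4):
    """Repeat each mel frame so it lines up with the sub-band sample rate.

    mel: array of shape (n_frames, n_mels), the unsplit mel spectrogram.
    Returns an array of shape (n_frames * hop_length // n_bands, n_mels).
    """
    if hop_length % n_bands != 0:
        raise ValueError("hop_length must be divisible by n_bands")
    band_hop = hop_length // n_bands  # 256 // 4 = 64 samples per band per frame
    # Nearest-neighbour upsampling: a stand-in for a learned upsampling network.
    return np.repeat(mel, band_hop, axis=0)

n_frames, n_mels = 10, 80  # hypothetical sizes for illustration
mel = np.random.randn(n_frames, n_mels).astype(np.float32)
cond = upsample_mel(mel)

# After decimation by 4, each sub-band signal has n_frames * 64 samples,
# so the same conditioning sequence can be shared by all 4 bands.
subbands = np.zeros((4, n_frames * 64), dtype=np.float32)  # placeholder sub-band audio
print(cond.shape, subbands.shape)
```

The full mel spectrogram is kept intact along the frequency axis; only the time-axis upsampling factor changes from 256 to 64.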

Shijie-Liu007 · Answer 2 · Mon Dec 21 2020 13:49:30 GMT+0800 (China Standard Time)

Thank you very much for the reply! I'll try it the way you suggested first. Thanks again!