fatchord / WaveRNN

WaveRNN Vocoder + TTS

Home Page: https://fatchord.github.io/model_outputs/

Questions about upsample network

kkp15 opened this issue

Hello, I've been dabbling with your version of WaveRNN.
One thing I was curious about is the padding scheme you used for up_layers, shown below.
Could you explain why you padded only on the right side?
Thank you.

    self.up_layers = nn.ModuleList()
    for scale in upsample_scales:
        k_size = (1, scale * 2 + 1)
        padding = (0, scale)
        stretch = Stretch2d(scale, 1)
        conv = nn.Conv2d(1, 1, kernel_size=k_size, padding=padding, bias=False)
        conv.weight.data.fill_(1. / k_size[1])
        self.up_layers.append(stretch)
        self.up_layers.append(conv)

@kkp15 Hi there. It's a 2D convolution, and the way PyTorch implements it (or did back when I wrote that code), the first element of the padding tuple applies to the top and bottom, while the second element, 'scale', applies to both left and right. So the padding is symmetric along the width, not right-only. Hope that clears it up.
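
For anyone who wants to check this without PyTorch, here is a minimal pure-Python sketch of what one row of that conv does, assuming stride 1 and zero padding of 'scale' on both sides. The helper names `conv_out_width` and `box_filter_same` are made up for illustration and are not part of the repo.

```python
def conv_out_width(in_width, kernel, pad, stride=1):
    # Standard output-size formula for a convolution with the same
    # padding applied on both sides of the axis.
    return (in_width + 2 * pad - kernel) // stride + 1


def box_filter_same(row, scale):
    # Mimics one row of the averaging conv: kernel width 2*scale + 1,
    # every weight equal to 1/kernel, zero padding of `scale` on the
    # left AND on the right.
    kernel = 2 * scale + 1
    padded = [0.0] * scale + list(row) + [0.0] * scale
    return [sum(padded[i:i + kernel]) / kernel for i in range(len(row))]


# A centered impulse stays centered after filtering, which would not
# happen if the padding were applied on one side only.
impulse = [0.0, 0.0, 1.0, 0.0, 0.0]
smoothed = box_filter_same(impulse, scale=1)
print(smoothed)

# With kernel = 2*scale + 1 and pad = scale, the width is unchanged.
print(conv_out_width(in_width=5, kernel=3, pad=1))
```

With kernel width 2*scale + 1 and padding 'scale' per side, the output width equals the input width, so the conv only smooths the stretched features and does not shift or resize them.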