santi-pdp / pase

Problem Agnostic Speech Encoder

Trying to load the pretrained model

MittalShruti opened this issue · comments

I ran the following code in Google Colab (CUDA 10.0):

from pase.models.frontend import wf_builder
pase = wf_builder('cfg/frontend/PASE+.cfg').eval()
pase.load_pretrained('FE_e199.ckpt', load_last=True, verbose=True)

# Now we can forward waveforms as Torch tensors
import torch
x = torch.randn(1, 1, 100000) # example with random noise to check shape
# y size will be (1, 256, 625), which are 625 frames of 256 dims each
y = pase(x)

I tried both pase(x) and pase(x.cuda()).

For pase(x) I get the following error:

AssertionError                            Traceback (most recent call last)
<ipython-input-11-eae0f995c36c> in <module>()
      2 x = torch.randn(1, 1, 100000) # example with random noise to check shape
      3 # y size will be (1, 256, 625), which are 625 frames of 256 dims each
----> 4 y = pase(x)

7 frames
/usr/local/lib/python3.6/dist-packages/torchqrnn/forget_mult.py in forward(self, f, x, hidden_init, use_cuda)
    173         use_cuda = use_cuda and torch.cuda.is_available()
    174         # Ensure the user is aware when ForgetMult is not GPU version as it's far faster
--> 175         if use_cuda: assert f.is_cuda and x.is_cuda, 'GPU ForgetMult with fast element-wise CUDA kernel requested but tensors not on GPU'
    176         ###
    177         # Avoiding 'RuntimeError: expected a Variable argument, but got NoneType' when hidden_init is None

AssertionError: GPU ForgetMult with fast element-wise CUDA kernel requested but tensors not on GPU

For pase(x.cuda()) I get the following error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-12-5f65b8d305f3> in <module>()
      2 x = torch.randn(1, 1, 100000) # example with random noise to check shape
      3 # y size will be (1, 256, 625), which are 625 frames of 256 dims each
----> 4 y = pase(x.cuda())

5 frames
/content/pase/pase/models/modules.py in forward(self, waveforms)
    900         band=(high-low)[:,0]
    901 
--> 902         f_times_t_low = torch.matmul(low, self.n_)
    903         f_times_t_high = torch.matmul(high, self.n_)
    904                 # Equivalent of Eq.4 of the reference paper (SPEAKER RECOGNITION FROM RAW WAVEFORM WITH SINCNET).

RuntimeError: Expected object of device type cuda but got device type cpu for argument #1 'self' in call to _th_mm

Any suggestions?

Hi Shruti,

On top of sending the input tensor to the GPU (as you do with x.cuda()), you also need to move the PASE parameters to the same backend by calling pase.cuda().

As such, your code should look something like the snippet below. Let us know if it helps.

from pase.models.frontend import wf_builder
pase = wf_builder('cfg/frontend/PASE+.cfg').eval()
pase.load_pretrained('FE_e199.ckpt', load_last=True, verbose=True)

pase.cuda()

# Now we can forward waveforms as Torch tensors
import torch
x = torch.randn(1, 1, 100000) # example with random noise to check shape
# y size will be (1, 256, 625), which are 625 frames of 256 dims each
y = pase(x.cuda(), device='cuda')
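A more portable way to express the same fix is to pick the device once at runtime, so the script also runs on CPU-only machines. The sketch below uses a hypothetical stand-in nn.Module in place of the PASE frontend (which needs the repo and checkpoint installed); the pattern is identical: move the module and every input batch to the same device before the forward pass.

```python
import torch
import torch.nn as nn

# Choose the backend once; model parameters and inputs must live on the
# same device, which is exactly what the two errors above complain about.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Hypothetical stand-in encoder: a strided 1-D conv that, like the PASE
# frontend, maps a (1, 1, 100000) waveform to 625 frames of 256 dims.
model = nn.Conv1d(1, 256, kernel_size=160, stride=160).eval().to(device)

x = torch.randn(1, 1, 100000).to(device)  # input moved to the same device
with torch.no_grad():
    y = model(x)

print(y.shape)  # torch.Size([1, 256, 625])
```

With this pattern, no .cuda() calls are scattered through the code, and the same script works whether or not a GPU is available.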

Thanks @pswietojanski, this is working!