'nn.init.xavier_uniform is deprecated' and 'size mismatch, m1: [128 x 384], m2: [618 x 512]'
FrankPSch opened this issue · comments
Hi there,
a very readable and instructive contribution, thanks for sharing! In which environment does this run successfully?
I am trying to run 'GTC_2018_Lab' and 'GTC_2018_Lab-solutions' with the versions below; both result in the following two errors.
My environment is Win10 with Anaconda. To troubleshoot, I tried USE_CUDA=True / False, Python 3.6 / 3.7, and CUDA toolkit 10 / 9. Then I downgraded from PyTorch 1.0.1 using 'conda install pytorch=0.4.1 cuda90 -c pytorch'. I did not find earlier PyTorch versions via Conda or compiled for Windows on the PyTorch website.
To address the second error, I also changed the shape of the first and last layer so they match, but the error persisted.
Thank you!
First error (I deleted some lines in the middle):
[LOG 20190329-05:48:27] decoder architecture:
decoder(
(decoder_L1): Linear(in_features=3, out_features=4, bias=True)
(decoder_R1): LeakyReLU(negative_slope=0.4, inplace)
(decoder_L2): Linear(in_features=4, out_features=8, bias=True)
(decoder_R2): LeakyReLU(negative_slope=0.4, inplace)
(decoder_L3): Linear(in_features=8, out_features=16, bias=True)
(decoder_R3): LeakyReLU(negative_slope=0.4, inplace)
(decoder_L4): Linear(in_features=16, out_features=32, bias=True)
(decoder_R4): LeakyReLU(negative_slope=0.4, inplace)
(decoder_L5): Linear(in_features=32, out_features=64, bias=True)
(decoder_R5): LeakyReLU(negative_slope=0.4, inplace)
(decoder_L6): Linear(in_features=64, out_features=128, bias=True)
(decoder_R6): LeakyReLU(negative_slope=0.4, inplace)
(decoder_L7): Linear(in_features=128, out_features=256, bias=True)
(decoder_R7): LeakyReLU(negative_slope=0.4, inplace)
(decoder_L8): Linear(in_features=256, out_features=512, bias=True)
(decoder_R8): LeakyReLU(negative_slope=0.4, inplace)
(decoder_L9): Linear(in_features=512, out_features=618, bias=True)
(decoder_R9): LeakyReLU(negative_slope=0.4, inplace)
(dropout): Dropout(p=0.0, inplace)
)
C:\Users\frank\.conda\envs\py36_cuda9\lib\site-packages\ipykernel_launcher.py:10: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
# Remove the CWD from sys.path while we load stuff.
C:\Users\frank\.conda\envs\py36_cuda9\lib\site-packages\ipykernel_launcher.py:15: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
from ipykernel import kernelapp as app
C:\Users\frank\.conda\envs\py36_cuda9\lib\site-packages\ipykernel_launcher.py:20: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
C:\Users\frank\.conda\envs\py36_cuda9\lib\site-packages\ipykernel_launcher.py:25: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
...
C:\Users\frank\.conda\envs\py36torch\lib\site-packages\ipykernel_launcher.py:46: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
C:\Users\frank\.conda\envs\py36torch\lib\site-packages\ipykernel_launcher.py:51: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
Second error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-32-7d9371378df5> in <module>
37
38 # run forward pass
---> 39 z_representation = encoder_train(mini_batch_torch) # encode mini-batch data
40 mini_batch_reconstruction = decoder_train(z_representation) # decode mini-batch data
41
~\.conda\envs\py36_cuda9\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
487 result = self._slow_forward(*input, **kwargs)
488 else:
--> 489 result = self.forward(*input, **kwargs)
490 for hook in self._forward_hooks.values():
491 hook_result = hook(self, input, result)
<ipython-input-23-0afc78290eab> in forward(self, x)
57
58 # define forward pass through the network
---> 59 x = self.encoder_R1(self.dropout(self.encoder_L1(x)))
60 x = self.encoder_R2(self.dropout(self.encoder_L2(x)))
61 x = self.encoder_R3(self.dropout(self.encoder_L3(x)))
~\.conda\envs\py36_cuda9\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
487 result = self._slow_forward(*input, **kwargs)
488 else:
--> 489 result = self.forward(*input, **kwargs)
490 for hook in self._forward_hooks.values():
491 hook_result = hook(self, input, result)
~\.conda\envs\py36_cuda9\lib\site-packages\torch\nn\modules\linear.py in forward(self, input)
65 @weak_script_method
66 def forward(self, input):
---> 67 return F.linear(input, self.weight, self.bias)
68
69 def extra_repr(self):
~\.conda\envs\py36_cuda9\lib\site-packages\torch\nn\functional.py in linear(input, weight, bias)
1350 if input.dim() == 2 and bias is not None:
1351 # fused op is marginally faster
-> 1352 ret = torch.addmm(torch.jit._unwrap_optional(bias), input, weight.t())
1353 else:
1354 output = input.matmul(weight.t())
RuntimeError: size mismatch, m1: [128 x 384], m2: [618 x 512] at c:\a\w\1\s\tmp_conda_3.6_104352\conda\conda-bld\pytorch_1550400396997\work\aten\src\thc\generic/THCTensorMathBlas.cu:266
The contributor of this repository has resolved my errors. Here are the causes and the fixes:
(1) The PyTorch library has evolved since the creation of the Lab and some of the function names are deprecated. This caused the warning regarding the Xavier initialization of the network parameters ("UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_"). To solve this, the Lab was migrated to the current PyTorch version 1.0.1.post2 (Python version: 3.7.0).
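The deprecated-name fix is mechanical: since PyTorch 0.4, the in-place initialization functions carry a trailing underscore. A minimal sketch of the change (the layer shape here is just the first decoder layer from the log above, not the Lab's exact init code):

```python
import torch.nn as nn

# one of the Lab's linear layers, e.g. decoder_L1: Linear(3 -> 4)
layer = nn.Linear(in_features=3, out_features=4, bias=True)

# old (emits "UserWarning: nn.init.xavier_uniform is now deprecated ..."):
#   nn.init.xavier_uniform(layer.weight)
# new (trailing underscore marks the in-place variant, no warning):
nn.init.xavier_uniform_(layer.weight)
nn.init.constant_(layer.bias, 0.0)
```

The functions are otherwise equivalent; only the deprecated alias triggers the UserWarning seen in the log.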
(2) Together with NVIDIA, the Lab was designed in such a way that the tasks of Chapter 3 had to be solved successfully first (i.e. the additional one-hot encoding of the two booking attributes "WAERS" and "BUKRS") before the network training could be started. If these two attributes are not encoded accordingly, the dimensionality of the input vectors is only 384 and not the 618 dimensions required by the network architecture. To solve this, the dimensionality of the input vector is now defined dynamically within the network architecture definition, as a function of the dimensionality of the dataset.
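The dynamic-dimensionality fix can be sketched as follows. This is an illustration, not the Lab's actual code: the variable names and the placeholder data are hypothetical; the point is that the first encoder layer takes its in_features from the encoded dataset instead of a hard-coded 618:

```python
import torch
import torch.nn as nn

# placeholder for the one-hot encoded journal entry data; in the Lab this
# would be the tensor produced after encoding "WAERS" and "BUKRS" in Chapter 3
encoded_data = torch.randn(128, 618)

# derive the input dimensionality from the data: 618 if both attributes were
# encoded, only 384 if that step was skipped
input_dim = encoded_data.shape[1]

# first encoder layer now always matches the data, so no size mismatch like
# "m1: [128 x 384], m2: [618 x 512]" can occur
encoder_L1 = nn.Linear(in_features=input_dim, out_features=512, bias=True)
z = encoder_L1(encoded_data)  # forward pass succeeds regardless of encoding
```

With a hard-coded 618, feeding the 384-dimensional un-encoded data produces exactly the addmm size mismatch shown in the traceback above.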
Thanks a lot to the authors for this nice example!