RuntimeError: Given groups=1, weight of size [64, 1, 3, 3], expected input[8, 3, 256, 256] to have 1 channels, but got 3 channels instead
catherineyeh opened this issue · comments
Hi! I'm running into trouble similar to issue #156.
I am trying to train a pSp encoder for the super-resolution task with my own dataset (256x256, greyscale).
Here are my parameters:
{'batch_size': 8,
'board_interval': 50,
'checkpoint_path': None,
'dataset_type': 'mydataset_type',
'encoder_type': 'GradualStyleEncoder',
'exp_dir': 'exp',
'id_lambda': 0.0,
'image_interval': 100,
'input_nc': 1,
'l2_lambda': 1.0,
'l2_lambda_crop': 0,
'label_nc': 1,
'learn_in_w': False,
'learning_rate': 0.0001,
'lpips_lambda': 0.8,
'lpips_lambda_crop': 0,
'max_steps': 500000,
'moco_lambda': 0,
'optim_name': 'ranger',
'output_size': 1024,
'resize_factors': '1,2,4,8',
'save_interval': 5000,
'start_from_latent_avg': True,
'stylegan_weights': 'pretrained_models/exported25000pkl.pt',
'test_batch_size': 8,
'test_workers': 8,
'train_decoder': False,
'val_interval': 2500,
'w_norm_lambda': 0.005,
'workers': 8}
After it tries to load my custom dataset, I get this traceback:
Traceback (most recent call last):
File "scripts/train.py", line 32, in <module>
main()
File "scripts/train.py", line 28, in main
coach.train()
File "./training/coach.py", line 83, in train
y_hat, latent = self.net.forward(x, return_latents=True)
File "./models/psp.py", line 92, in forward
codes = self.encoder(x)
File "/home/cyeh/psp/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "./models/encoders/psp_encoders.py", line 91, in forward
x = self.input_layer(x)
File "/home/cyeh/psp/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/cyeh/psp/lib/python3.6/site-packages/torch/nn/modules/container.py", line 117, in forward
input = module(input)
File "/home/cyeh/psp/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/cyeh/psp/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 423, in forward
return self._conv_forward(input, self.weight)
File "/home/cyeh/psp/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 420, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [64, 1, 3, 3], expected input[8, 3, 256, 256] to have 1 channels, but got 3 channels instead
I specified input_nc to be 1 since I am using greyscale images as inputs, but based on the error message it seems the model is receiving input of size [8, 3, 256, 256]. I'm not sure what the root cause is.
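For readers decoding the error message: a Conv2d weight in PyTorch has shape [out_channels, in_channels / groups, kernel_h, kernel_w], so a weight of size [64, 1, 3, 3] with groups=1 expects 1 input channel, while the batch [8, 3, 256, 256] supplies 3. The following is a minimal pure-Python sketch of that shape check. The helper name check_conv2d_input is hypothetical and is not part of PyTorch or this repo.

```python
def check_conv2d_input(input_shape, weight_shape, groups=1):
    """Compare an input batch shape against a Conv2d weight shape.

    input_shape:  (batch, channels, height, width)
    weight_shape: (out_channels, in_channels // groups, kernel_h, kernel_w)
    Returns (matches, expected_channels, actual_channels).
    """
    _, actual_channels, _, _ = input_shape
    _, in_per_group, _, _ = weight_shape
    expected_channels = in_per_group * groups
    return actual_channels == expected_channels, expected_channels, actual_channels


# Shapes taken from the error message above.
print(check_conv2d_input((8, 3, 256, 256), (64, 1, 3, 3)))  # (False, 1, 3)
# What the encoder expects when input_nc is 1.
print(check_conv2d_input((8, 1, 256, 256), (64, 1, 3, 3)))  # (True, 1, 1)
```

So the encoder was built correctly for one channel; the mismatch comes from whatever produces the batch.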
It seems like your parameters are correct for grayscale images.
Could you try running in debug and putting a breakpoint here:
pixel2style2pixel/training/coach.py
Line 91 in 334f45e
Let's see what the dimension of x is, to check whether something in the dataset is not working as expected.
Thanks for the reply!
This was the dimension of x
torch.Size([8, 3, 256, 256])
It seems like the dataset has 3 channels... but upon checking the image information it looks like it is in greyscale...
I'll try with the rgb settings and see if any issue arises. Thanks!
Is the setting for RGB 'input_nc': 3?
I wouldn't give up just yet on training with grayscale images.
Try putting a breakpoint in the __getitem__ function of the dataset and checking how PIL is reading the images.
pixel2style2pixel/datasets/images_dataset.py
Lines 18 to 33 in 334f45e
I think I see the problem, actually: the dataset code is converting to_im to RGB. You could try doing something like:
to_im = Image.open(to_path)
to_im = to_im.convert('RGB') if self.opts.label_nc == 0 else to_im.convert('L')
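This also explains why a file that is greyscale on disk can arrive as 3 channels: PIL's convert('RGB') copies the single band into three. Here is a small sketch showing the effect and a channel-count-based conversion. The helper convert_to_channels is hypothetical (it keys on the desired number of channels, which differs slightly from the label_nc check suggested above), and the blank image stands in for a real file.

```python
from PIL import Image


def convert_to_channels(image, n_channels):
    """Convert a PIL image to 1 channel ('L') or 3 channels ('RGB')."""
    if n_channels == 1:
        return image.convert('L')
    if n_channels == 3:
        return image.convert('RGB')
    raise ValueError("n_channels must be 1 or 3, got %r" % (n_channels,))


grey = Image.new('L', (256, 256))   # stand-in for a greyscale file on disk
as_rgb = grey.convert('RGB')        # what an unconditional RGB conversion does

print(grey.mode, grey.getbands())       # L ('L',)
print(as_rgb.mode, as_rgb.getbands())   # RGB ('R', 'G', 'B')
print(convert_to_channels(as_rgb, 1).getbands())  # ('L',)
```

Any later transforms that assume three channels (for example, per-channel normalization) would need matching single-channel values.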
Thanks! Your suggestion, plus modifying the transformations, solved the problem! Now I'm getting a different error:
Traceback (most recent call last):
File "scripts/train.py", line 32, in <module>
main()
File "scripts/train.py", line 28, in main
coach.train()
File "./training/coach.py", line 85, in train
y_hat, latent = self.net.forward(x, return_latents=True)
File "./models/psp.py", line 98, in forward
codes = codes + self.latent_avg.repeat(codes.shape[0], 1, 1)
RuntimeError: The size of tensor a (14) must match the size of tensor b (18) at non-singleton dimension 1
I tried removing the "--start_from_latent_avg" flag, but then got:
./training/ranger.py:123: UserWarning: This overload of addcmul_ is deprecated:
addcmul_(Number value, Tensor tensor1, Tensor tensor2)
Consider using one of the following signatures instead:
addcmul_(Tensor tensor1, Tensor tensor2, *, Number value) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:882.)
exp_avg_sq.mul_(beta2).addcmul_(1 - beta2, grad, grad)
Traceback (most recent call last):
File "/home/cyeh/psp/lib/python3.6/site-packages/PIL/Image.py", line 2680, in fromarray
mode, rawmode = _fromarray_typemap[typekey]
KeyError: ((1, 1, 1), '|u1')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "scripts/train.py", line 32, in <module>
main()
File "scripts/train.py", line 28, in main
coach.train()
File "./training/coach.py", line 94, in train
self.parse_and_log_images(id_logs, x, y, y_hat, title='images/train/faces')
File "./training/coach.py", line 241, in parse_and_log_images
'target_face': common.tensor2im(y[i]),
File "./utils/common.py", line 23, in tensor2im
return Image.fromarray(var.astype('uint8'))
File "/home/cyeh/psp/lib/python3.6/site-packages/PIL/Image.py", line 2682, in fromarray
raise TypeError("Cannot handle this data type: %s, %s" % typekey)
TypeError: Cannot handle this data type: (1, 1, 1), |u1
Regarding the first error,
RuntimeError: The size of tensor a (14) must match the size of tensor b (18) at non-singleton dimension 1
This seems to happen because you are using a generator that has 14 latent codes, but you specified an output size of 1024. Did you maybe mean to set --output_size=256?
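The 14 versus 18 in the error lines up with how StyleGAN2 generators scale: the number of latent codes is 2 * log2(resolution) - 2, which gives 14 at 256 and 18 at 1024. A minimal sketch of that relationship follows. The function name n_latent_codes is hypothetical, and the formula is assumed to apply to the custom generator used here.

```python
def n_latent_codes(output_size):
    """Number of W+ latent codes for a StyleGAN2 generator of this resolution."""
    if output_size < 4 or output_size & (output_size - 1) != 0:
        raise ValueError("output_size must be a power of two >= 4")
    log_size = output_size.bit_length() - 1  # exact log2 for powers of two
    return 2 * log_size - 2


for size in (256, 512, 1024):
    print(size, n_latent_codes(size))
# 256 14
# 512 16
# 1024 18
```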
Regarding the other errors related to PIL, these are pretty common issues. My best advice is to run your code in debug, see where it fails, and try to Google the errors to find the correct fix. The issues are most likely caused by your attempt to adapt the code to work with grayscale images. Nevertheless, the solution should be quite simple to find.
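As a pointer for that search: the key (1, 1, 1), |u1 in the TypeError is what Image.fromarray reports for a uint8 array whose last axis has length 1 (height x width x 1). One common fix, sketched here on the assumption that tensor2im ends up with such an array for greyscale data, is to drop that axis before calling Image.fromarray. The helper name array_to_image is hypothetical and is not the repo's code.

```python
import numpy as np
from PIL import Image


def array_to_image(arr):
    """Build a PIL image from an H x W x C array with values in [0, 255]."""
    arr = np.asarray(arr).astype('uint8')
    # Squeeze a trailing single-channel axis so PIL receives a 2-D array
    # and creates a mode 'L' image.
    if arr.ndim == 3 and arr.shape[2] == 1:
        arr = arr[:, :, 0]
    return Image.fromarray(arr)


grey = np.zeros((256, 256, 1))  # single-channel array, as for a greyscale tensor
rgb = np.zeros((256, 256, 3))   # three-channel array, the case the code assumed

print(array_to_image(grey).mode)  # L
print(array_to_image(rgb).mode)   # RGB
```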
Adjusting the output_size resolved the issue, thanks a lot!
I'll close this issue for now.