williamyang1991 / VToonify

[SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer


out = torch.cat([f_G, abs(f_G-f_E)], dim=1) RuntimeError: The size of tensor a (126) must match the size of tensor b (125) at non-singleton dimension 3

yaohwang opened this issue · comments

Traceback (most recent call last):
  File "/VToonify/style_transfer.py", line 226, in <module>
    y_tilde = vtoonify(inputs, s_w.repeat(inputs.size(0), 1, 1), d_s = args.style_degree)
  File "/root/miniconda3/envs/python-app/lib/python3.9/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/VToonify/model/vtoonify.py", line 258, in forward
    out, m_E = self.fusion_out[fusion_index](out, f_E, d_s)
  File "/root/miniconda3/envs/python-app/lib/python3.9/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/VToonify/model/vtoonify.py", line 125, in forward
    out = torch.cat([f_G, abs(f_G-f_E)], dim=1)
RuntimeError: The size of tensor a (126) must match the size of tensor b (125) at non-singleton dimension 3

Make sure the width and height of your input image are divisible by 8.
If the problem persists, make sure the width and height are divisible by 16.
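
For example, you could resize the image yourself before running style_transfer.py, roughly like this (just a sketch using OpenCV; the file paths are placeholders, not files from the repo):

import cv2

# Rough sketch: round both dimensions down to a multiple of 8 and resize.
img = cv2.imread('./data/your_photo.jpg')
h, w = img.shape[:2]
new_w, new_h = w // 8 * 8, h // 8 * 8
if (new_w, new_h) != (w, h):
    img = cv2.resize(img, (new_w, new_h), interpolation=cv2.INTER_AREA)
cv2.imwrite('./data/your_photo_div8.jpg', img)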

That makes sense, but I'm using the example command and image, so I think it should work.

python style_transfer.py --content ./data/038648.jpg \
       --scale_image --backbone toonify \
       --ckpt ./checkpoint/vtoonify_t_arcane/vtoonify.pt \
       --padding 600 600 600 600     # use large padding to avoid cropping the image

I fixed it with a small change, maybe not the best way. Anyway, I can open a pull request if that's OK.
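
The change was roughly like the following, inside the fusion forward in model/vtoonify.py (just a sketch of my workaround, not necessarily the right fix upstream): crop both feature maps to a common spatial size before concatenating.

# Sketch of the workaround (inside the fusion forward in model/vtoonify.py,
# where torch is already imported): crop f_G and f_E to the smaller common
# spatial size so a 1-pixel mismatch (126 vs 125) no longer breaks torch.cat.
h = min(f_G.size(2), f_E.size(2))
w = min(f_G.size(3), f_E.size(3))
f_G = f_G[:, :, :h, :w]
f_E = f_E[:, :, :h, :w]
out = torch.cat([f_G, abs(f_G - f_E)], dim=1)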

It is confusing, because I have the following code to make sure the image size is divisible by 8,
and the example command has no issue on my side.

VToonify/util.py, lines 184 to 187 in cf993aa:

left = max(round(center[0] - padding[0]), 0) // 8 * 8
right = min(round(center[0] + padding[1]), w) // 8 * 8
top = max(round(center[1] - padding[2]), 0) // 8 * 8
bottom = min(round(center[1] + padding[3]), h) // 8 * 8
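
Since each boundary is rounded down to a multiple of 8, the resulting crop width and height are also multiples of 8. For example (hypothetical numbers, just to illustrate the rounding above):

left  = max(round(712.3 - 600), 0) // 8 * 8      # 112
right = min(round(712.3 + 600), 1280) // 8 * 8   # 1280
print((right - left) % 8)                        # 0 -> crop width is divisible by 8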

Yeah, that's true, which is why I've been confused too.

It runs fine with the following example command and image:

python style_transfer.py --content ./data/038648.jpg \
       --scale_image --style_id 77 --style_degree 0.5 \
       --ckpt ./checkpoint/vtoonify_d_arcane/vtoonify_s_d.pt \
       --padding 600 600 600 600     # use large padding to avoid cropping the image

but the earlier command just doesn't work as expected.

I'll dig into it later and find out why.

I made a mistake: I was actually running

python style_transfer.py

with the default params and the image ./data/077436.jpg.
But I think it would be better to handle this kind of situation, since "image size must be divisible by 8" is really a hard limit.
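
For instance, the hard limit could be relaxed by padding the input tensor up to the next multiple of 8 before the forward pass (just a rough sketch, not code from the repo; the helper name is made up):

import torch.nn.functional as F

def pad_to_multiple_of_8(x):
    # Hypothetical helper: x is an image tensor of shape (B, C, H, W);
    # reflection-pad the right/bottom edges so H and W become multiples of 8.
    h, w = x.shape[2], x.shape[3]
    pad_h = (8 - h % 8) % 8
    pad_w = (8 - w % 8) % 8
    return F.pad(x, (0, pad_w, 0, pad_h), mode='reflect')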