JingLi513 / Audio2Gestures

Audio2Motion Official implementation for Audio2Motion: Generating Diverse Gestures from Speech with Conditional Variational Autoencoders.

Hello. Having a problem with your code. Bug in SMPL model:

armored-guitar opened this issue

File "/Users/alexandrrezanov/inworld/gestures/audio2gestures/model.py", line 171, in __init__
    self.smpl_model = SMPLXModel(args.smpl_path)
File "/Users/alexandrrezanov/inworld/gestures/audio2gestures/model.py", line 44, in __init__
    meta_betas, data_struct["shapedirs"]
File "/Users/alexandrrezanov/inworld/gestures/audio2gestures/utils.py", line 659, in blend_shapes
    blend_shape = torch.einsum("bl,mkl->bmk", [betas, shape_disps])
File "/Users/alexandrrezanov/opt/anaconda3/envs/audio2gesture3.7/lib/python3.7/site-packages/torch/functional.py", line 297, in einsum
    return einsum(equation, *_operands)
File "/Users/alexandrrezanov/opt/anaconda3/envs/audio2gesture3.7/lib/python3.7/site-packages/torch/functional.py", line 299, in einsum
    return _VF.einsum(equation, operands) # type: ignore[attr-defined]
RuntimeError: einsum(): operands do not broadcast with remapped shapes [original->remapped]: [1, 20]->[1, 1, 1, 20] [10475, 3, 400]->[1, 10475, 3, 400]

Can you please provide a .pkl file, or at least some kind of README?

As far as I understand, the betas shouldn't be zeros.

I will provide a README in the next few days. The problem is that you are using a newer version of the SMPL model, whose meta_betas shape should be 400 instead of 20.
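
For anyone hitting the same error: the einsum string "bl,mkl->bmk" only works when the last axis of betas matches the last axis of shapedirs, which is why [1, 20] against [10475, 3, 400] fails. Below is a minimal sketch of one way to avoid hard-coding the size, using NumPy instead of torch and a small random array as a stand-in for data_struct["shapedirs"]. The helper zero_betas_like is hypothetical and not part of this repository, and this is not necessarily how the author intends to fix it.

```python
import numpy as np


def blend_shapes(betas, shape_disps):
    """Per-vertex offsets: betas is (B, L), shape_disps is (V, 3, L)."""
    if betas.shape[-1] != shape_disps.shape[-1]:
        # Fail early with a readable message instead of the einsum error.
        raise ValueError(
            f"betas has {betas.shape[-1]} coefficients but the model file "
            f"provides {shape_disps.shape[-1]} shape directions"
        )
    return np.einsum("bl,mkl->bmk", betas, shape_disps)


def zero_betas_like(shape_disps, batch_size=1):
    """Hypothetical helper: size the betas from the loaded shapedirs."""
    return np.zeros((batch_size, shape_disps.shape[-1]), dtype=shape_disps.dtype)


# Stand-in for data_struct["shapedirs"]; 8 vertices instead of 10475 to keep it light.
shapedirs = np.random.default_rng(0).normal(size=(8, 3, 400)).astype(np.float32)

meta_betas = zero_betas_like(shapedirs)        # shape (1, 400), not (1, 20)
offsets = blend_shapes(meta_betas, shapedirs)  # shape (1, 8, 3)
print(offsets.shape)
```

Whether all-zero betas are appropriate for this model is a separate question; the sketch only addresses the shape mismatch.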

Yeah, but I tried downloading an older version and it had 300 betas. Can you please also provide code for training on the speech2gesture dataset?

@JingLi513 can you please also describe how you convert the speech2gesture skeleton to the SMPL-X model?

Hello, there is neither the speech_data folder nor the .h5 file in the project I downloaded. How did you get them?

I trained it on another dataset.

Hello, I want to know how the audio data is stored in the .h5 file. The fbx2hdf.py script provided by the author does not seem to store any audio data in it.

Hello, I have encountered the same problem. Have you solved it? Can you help me?

Hi, has this been solved, and if so, how?

I solved it by replacing SMPLX_NEUTRAL.npz (108.8 MB) with SMPLX_NEUTRAL_2020.npz (167.3 MB).
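
To check which variant of the model file you have before running the code, you can look at the last axis of its shapedirs array. This is a rough sketch that assumes the .npz stores the array under the key "shapedirs", as the data_struct["shapedirs"] lookup in the traceback suggests. The function name count_shape_dirs is made up for illustration, and the demo runs on a tiny synthetic file rather than a real SMPL-X download.

```python
import os
import tempfile

import numpy as np


def count_shape_dirs(npz_path):
    """Return the size of the last axis of the 'shapedirs' array in an .npz file."""
    with np.load(npz_path) as data:
        return data["shapedirs"].shape[-1]


# Demo on a small synthetic file; point npz_path at your own model file instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_model.npz")
    np.savez(demo_path, shapedirs=np.zeros((8, 3, 400), dtype=np.float32))
    n_dirs = count_shape_dirs(demo_path)

print(n_dirs)
```

If the number printed for your file differs from the size of the betas the code builds, you will get the einsum error shown at the top of this issue.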