RussellSB / tt-vae-gan

Timbre transfer with variational autoencoding and cycle-consistent adversarial networks. Able to transfer the timbre of an audio source to that of another.


KeyError: 2

dvpg0 opened this issue · comments

commented

Hi,

Great work on this! I am trying to replicate it on my local machine, but I am running into an issue when training the model. Could you please advise what might be causing this error?

```
Traceback (most recent call last):
  File "..\tt-vae-gan\voice_conversion\src\train.py", line 275, in <module>
    train_global()
  File "..\tt-vae-gan\voice_conversion\src\train.py", line 255, in train_global
    losses = train_local(i, epoch, batch, pair[0], pair[1], losses)
  File "..\tt-vae-gan\voice_conversion\src\train.py", line 139, in train_local
    X1 = Variable(batch[id_1].type(Tensor))
KeyError: 2
```

I have tried playing with n_epochs, but it seems to fail at the very first epoch, as shown below:

```
Namespace(epoch=0, n_epochs=2, model_name='test_1', dataset='../data/data_flickr', n_spkrs=4, batch_size=4, lr=0.0001, b1=0.5, b2=0.999, decay_epoch=1, n_cpu=6, img_height=128, img_width=128, channels=1, plot_interval=1, checkpoint_interval=2, n_downsample=2, dim=32)
..\ttvaegan\lib\site-packages\torch\optim\adam.py:48: UserWarning: optimizer contains a parameter group with duplicate parameters; in future, this will cause an error; see github.com/pytorch/pytorch/issues/40967 for more information
```

I am running this on an NVIDIA RTX 3000 with 6 GB of dedicated memory. Could it be a hardware limitation? It fails exactly when GPU memory usage reaches around 6 GB.

Best,

How many speakers (n_spkrs) are you actually using? My suspicion is that you are using 2 while it is set to 4 by default at the moment (probably because I was experimenting in a many-to-many context at one point). Because of that, I think the training loop is trying to access a speaker index that doesn't exist.
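For what it's worth, the mismatch can be reproduced without the repo: with n_spkrs left at its default of 4 but only two speakers' worth of data, a pairwise loop over speaker indices eventually looks up a key that isn't there. A minimal sketch (the batch/loop names here are illustrative assumptions, not the actual train.py code):

```python
from itertools import permutations

# Hypothetical sketch of the failure: the dataloader yields a batch
# keyed by speaker index, so a 2-speaker dataset only provides keys 0 and 1.
batch = {0: "spkr0_audio", 1: "spkr1_audio"}

n_spkrs = 4   # the script's default
missing_key = None

try:
    # the global training step loops over every ordered speaker pair
    for id_1, id_2 in permutations(range(n_spkrs), 2):
        X1, X2 = batch[id_1], batch[id_2]
except KeyError as e:
    missing_key = e.args[0]

print(f"KeyError: {missing_key}")  # first out-of-range speaker index is 2

# Passing --n_spkrs 2 keeps the pair loop within the dataset's keys:
for id_1, id_2 in permutations(range(2), 2):
    X1, X2 = batch[id_1], batch[id_2]  # no error
```

The first failing pair is (0, 2), which matches the `KeyError: 2` in the traceback above.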

I updated the readme to override the default. Try overriding the default value from the command line by adding --n_spkrs 2