codes concatenation
Beinabih opened this issue · comments
Hi,
Thank you for your code. It has been very helpful to me in writing my own trainer.
I think the line
codes = torch.cat([audio, audio], dim=0)
(here)
should be
codes = torch.cat([codes, codes], dim=0)
otherwise you won't use your encoded codebooks, right? :)
kind regards,
Jonas
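For context, here is a minimal sketch of the concatenation in question (the tensor shapes are stand-ins; the surrounding training loop is assumed):

```python
import torch

# Stand-in for the encoder output: codebook tokens of shape (B, K, T)
# (batch, num_codebooks, timesteps). Values are illustrative.
codes = torch.randint(0, 1024, (1, 4, 100))

# Buggy version: concatenates the raw audio instead of the codes,
# so the encoded codebooks would never be used.
# codes = torch.cat([audio, audio], dim=0)

# Suggested fix: duplicate the codes along the batch dimension.
codes = torch.cat([codes, codes], dim=0)
print(codes.shape)  # torch.Size([2, 4, 100]) — batch dimension doubled
```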
um... no...
The codebook is obtained via preprocess_audio,
which takes both the model and the audio waveform as inputs, encodes the waveform with Encodec (here called compression_model), and returns the codes.
Here is where we call that function with the model and the audio waveform (tensor) as arguments and get the codebook in return:
Line 139 in 5fd56f0
Note that the codebook here is just one batch; we then concatenate it to match the shape of the condition_tensors.
Feel free to correct me if I'm wrong, because the trainer does not work for anything outside overfitting at the moment.
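Roughly, the flow described above can be sketched like this (the exact signature and attribute names in the trainer may differ — only preprocess_audio and compression_model come from the thread, the rest are assumptions):

```python
import torch

def preprocess_audio(audio_waveform, model):
    """Encode a waveform into codebook tokens with Encodec.

    Sketch of the flow described above, not the trainer's exact code:
    compression_model is the Encodec model held by the MusicGen wrapper.
    """
    wav = audio_waveform.unsqueeze(0)  # add a batch dimension: (1, C, T)
    with torch.no_grad():
        codes, scale = model.compression_model.encode(wav)
    return codes  # shape (1, K, T'): a single batch of codebook tokens

# Later, the single-batch codes are duplicated along dim 0 so they
# match the batch size of condition_tensors:
# codes = torch.cat([codes, codes], dim=0)
```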
Ah, sorry for the confusion, you're totally right...
I moved the model.compression_model.encode(wav)
out of the
process_audio
function since I am using the Audiocraft Dataloader. I think I hallucinated some of my code in your trainer.py.