For line 675 of soundstorm.py.
0417keito opened this issue
Why is this?
Shouldn't this part be the following?
all_mask_num_tokens = all_mask_num_tokens if q < num_full_sampling_levels else torch.zeros((1, batch_size), dtype = torch.long, device = device)
Thank you very much for pointing out this issue. I have made the necessary corrections. This mistake appears to have caused the audio quality to deteriorate as the number of iterations increased. I really appreciate your feedback!
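For readers following the thread, the condition proposed in the issue can be sketched in isolation. This is only an illustration, not the repository's actual code: the helper name `select_mask_schedule` is hypothetical, and the schedule tensor is assumed to have shape `(steps, batch_size)`, with one row of mask counts per decoding iteration.

```python
import torch

def select_mask_schedule(all_mask_num_tokens, q, num_full_sampling_levels,
                         batch_size, device=None):
    """Hypothetical helper mirroring the one-line condition from the issue."""
    if q < num_full_sampling_levels:
        # Levels below the threshold keep the existing multi-step schedule.
        return all_mask_num_tokens
    # Remaining levels get a single row of zeros, shape (1, batch_size).
    return torch.zeros((1, batch_size), dtype=torch.long, device=device)

# Assumed toy schedule: 3 iterations, batch of 2.
schedule = torch.tensor([[6, 6], [3, 3], [0, 0]])

for q in range(3):
    level_schedule = select_mask_schedule(
        schedule, q, num_full_sampling_levels=1, batch_size=2
    )
    # The number of rows is the number of iterations a loop over it would run.
    print(q, level_schedule.shape[0])
```

With `num_full_sampling_levels=1`, only level 0 keeps the three-row schedule in this toy setup. Levels 1 and 2 receive a single all-zero row, so a loop over the schedule would run once for each of them.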