ZhangXInFD / soundstorm-speechtokenizer

Implementation of SoundStorm built upon SpeechTokenizer.

For line 675 of soundstorm.py.

0417keito opened this issue:

https://github.com/ZhangXInFD/soundstorm-speechtokenizer/blob/main/soundstorm_speechtokenizer/soundstorm.py#L675C8

Why is this line written the way it is?
Shouldn't this part be the following?

all_mask_num_tokens = all_mask_num_tokens if q < num_full_sampling_levels else torch.zeros((1, batch_size), dtype = torch.long, device = device)
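To make the intent of the proposed line easier to follow, here is a minimal sketch of the selection logic it expresses. It rests on assumptions, not on the repository's code: `all_mask_num_tokens` is assumed to hold one row of per-sample re-mask counts for each decoding step, plain Python lists stand in for the tensors, and `schedule_for_level` and the example numbers are hypothetical.

```python
def schedule_for_level(all_mask_num_tokens, q, num_full_sampling_levels, batch_size):
    """Pick the re-mask schedule for quantizer level q (hypothetical helper).

    all_mask_num_tokens: list of decoding steps, each a list with one
    re-mask count per sample (assumed layout: steps x batch).
    """
    if q < num_full_sampling_levels:
        # Levels below the threshold keep the full iterative schedule.
        return all_mask_num_tokens
    # Remaining levels get a single step that re-masks nothing,
    # mirroring torch.zeros((1, batch_size)) in the proposed line.
    return [[0] * batch_size]


# Hypothetical schedule: 3 decoding steps for a batch of 2 samples.
full_schedule = [[6, 6], [3, 3], [0, 0]]

for q in range(3):
    steps = schedule_for_level(full_schedule, q, num_full_sampling_levels=1, batch_size=2)
    print(f"level {q}: {len(steps)} step(s), schedule = {steps}")
```

Under these assumptions, only level 0 is decoded over several steps; every later level is filled in a single pass.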

Thank you very much for pointing out this issue. I have made the necessary corrections. This mistake seems to have caused the audio quality to deteriorate as the number of iterations increased. I really appreciate your feedback!