Inconsistent samples for multiple targets in SoundDataset
ilya16 opened this issue
When an audio clip is longer than max_length and multiple target sample rates are used, SoundDataset crops the audio at a different start position for each target:
audiolm-pytorch/audiolm_pytorch/data.py
Lines 86 to 97 in c65bb97
This affects the training data for CoarseTransformer, since the audio at each sample rate no longer covers the same segment.
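To make the symptom concrete, here is a minimal sketch of how per-target cropping can drift apart. It is not the code from data.py; the function names `independent_starts` and `shared_start` are hypothetical, and the sketch ignores resampling and only looks at where each crop begins.

```python
import random

def independent_starts(num_samples, max_length, num_targets, rng):
    # Hypothetical illustration: one random draw per target,
    # so each target can land on a different window of the audio.
    max_start = num_samples - max_length
    return [rng.randint(0, max_start) for _ in range(num_targets)]

def shared_start(num_samples, max_length, num_targets, rng):
    # One random draw reused for every target,
    # so all targets cover the same window.
    max_start = num_samples - max_length
    return [rng.randint(0, max_start)] * num_targets

rng = random.Random(0)
print(independent_starts(160_000, 16_000, 2, rng))  # one start per target
print(shared_start(160_000, 16_000, 2, rng))        # the same start repeated
```

With independent draws, two targets only line up by chance, which is the mismatch described above.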
@ilya16 yes indeed that does not seem right 😞
I decided to do the resampling + curtail / pad at the highest target sample freq first, and then resample that result to the rest of the target sample freqs
want to see if that addresses the issue?
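A rough sketch of the strategy described above, assuming plain Python lists as waveforms and `max_length` counted in samples at the highest target rate. The helper names (`resample_linear`, `crop_or_pad`, `sample_aligned`) are hypothetical and this is not the repository's implementation; the naive linear interpolation is only a stand-in for a proper resampler.

```python
import random

def resample_linear(wave, orig_freq, new_freq):
    # Naive linear-interpolation resampler (stand-in for a real one).
    if orig_freq == new_freq:
        return list(wave)
    new_len = int(len(wave) * new_freq / orig_freq)
    out = []
    for i in range(new_len):
        pos = i * orig_freq / new_freq
        lo = int(pos)
        hi = min(lo + 1, len(wave) - 1)
        frac = pos - lo
        out.append(wave[lo] * (1 - frac) + wave[hi] * frac)
    return out

def crop_or_pad(wave, max_length, rng):
    # Random crop if too long, zero-pad on the right if too short.
    if len(wave) > max_length:
        start = rng.randint(0, len(wave) - max_length)
        return wave[start:start + max_length]
    return list(wave) + [0.0] * (max_length - len(wave))

def sample_aligned(wave, sample_hz, target_hz, max_length, rng):
    # 1. Resample once to the highest target rate.
    highest = max(target_hz)
    wave = resample_linear(wave, sample_hz, highest)
    # 2. Crop or pad exactly once, so there is a single start position.
    wave = crop_or_pad(wave, max_length, rng)
    # 3. Derive every target from that one cropped segment.
    return [resample_linear(wave, highest, hz) for hz in target_hz]

ramp = [float(i) for i in range(9600)]  # toy signal at 48 kHz
low, high = sample_aligned(ramp, 48_000, (16_000, 24_000), 2400, random.Random(0))
print(len(low), len(high), low[0] == high[0])
```

Because the crop happens only once, every target starts at the same point in time, whichever start position the random draw picks.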
@lucidrains looks good!