dataset.py: assert error with codes.size(-1)=48960, sum(duration)=38678
lileishitou opened this issue · comments
dataset.py check:
assert abs(codes.size(-1) - sum(duration)) < 3, (codes.size(-1), sum(duration), filename)
assert abs(audio.shape[1]-lmin * self.hop_length) < 3 * self.hop_length
Why does dataset.py check the encoded length against the summed duration?
The error is probably caused by a bad alignment. Please check in the TextGrid file that the silent phones ("sp", "spn", "sil") are not empty or "". The durations and the spec length must match for the model to converge.
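To see how far off an utterance is before the assert fires, the comparison can be reproduced outside the dataset class. This is only a minimal sketch, not the code in dataset.py: the interval tuples, the helper names, and the sample rate and hop length in the usage note are assumptions made for illustration.

```python
def durations_in_frames(intervals, sample_rate, hop_length):
    """Convert (start_sec, end_sec, label) intervals to per-phone frame counts.

    Each boundary is rounded to a frame index first, so the counts sum to the
    frame index of the last boundary and rounding error does not accumulate.
    The (start, end, label) tuple layout is an assumption for this sketch.
    """
    frames_per_second = sample_rate / hop_length
    durations = []
    for start, end, _label in intervals:
        start_frame = round(start * frames_per_second)
        end_frame = round(end * frames_per_second)
        durations.append(end_frame - start_frame)
    return durations


def length_mismatch(num_frames, durations):
    """Absolute difference between the encoded length and the summed durations."""
    return abs(num_frames - sum(durations))
```

For example, with a hypothetical 16000 Hz sample rate and hop length of 320, `durations_in_frames([(0.0, 0.5, "sil"), (0.5, 1.0, "AH0"), (1.0, 2.0, "sp")], 16000, 320)` gives per-phone counts that sum to 100 frames. For the values in this issue, `length_mismatch(48960, [38678])` is 10282, far above the tolerance of 3 in the assert.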
I found that some TextGrid files do not contain silent phones (sil, sp, spn), while others do. I ran the MFA tool with "english_us_arpa english_us_arpa" as the model. Why do the generated TextGrid files differ?
I have added a check for empty silent phones. Please update to the latest code and reprocess the dataset to see whether any issues remain. I hope this helps.
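For anyone who wants to inspect their own alignments, a check of this kind might look roughly like the sketch below. It is not the code that was added to the repository: the function names, the (start_sec, end_sec, label) tuple layout, and the choice of "sp" as the fill label are all assumptions made for illustration.

```python
def find_empty_intervals(intervals):
    """Return (index, start, end) for every interval whose label is empty.

    `intervals` is assumed to be a list of (start_sec, end_sec, label) tuples
    read from the phone tier of a TextGrid file.
    """
    return [
        (index, start, end)
        for index, (start, end, label) in enumerate(intervals)
        if label.strip() == ""
    ]


def fill_empty_intervals(intervals, silence_label="sp"):
    """Replace empty labels with a silence label so no interval is dropped.

    The default "sp" is a hypothetical choice; use whichever silence token
    the rest of the preprocessing expects.
    """
    return [
        (start, end, label if label.strip() else silence_label)
        for start, end, label in intervals
    ]
```

Running `find_empty_intervals` over each utterance before preprocessing shows which files have unlabeled gaps, and `fill_empty_intervals` keeps those gaps in the duration total instead of silently losing them.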
Thanks, that helps a lot.