ZhangXInFD / SpeechTokenizer

This is the code for SpeechTokenizer, presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on the project home page.

Home Page: https://0nutation.github.io/SpeechTokenizer.github.io/

What is the input at inference time for encoding?

Edwardmark opened this issue

What is the input at inference time for encoding? I think only raw audio is the input, and no STFT or mel spectrogram is needed for inference. Is that right?

Yes, that is right.
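
For anyone landing here later, a minimal sketch of what "raw audio only" might look like in practice: the waveform is reduced to mono, resampled, and given a batch dimension, with no STFT or mel features computed. The helper name `prepare_raw_audio`, the 16 kHz target rate, and the `(batch, channel, samples)` layout are assumptions for illustration, not confirmed in this thread, and the linear interpolation is only a crude stand-in for a proper resampler.

```python
import numpy as np


def prepare_raw_audio(wav: np.ndarray, sr: int, target_sr: int) -> np.ndarray:
    """Turn a (channels, samples) or (samples,) waveform into a (1, 1, samples) batch.

    Hypothetical helper: no spectral features are computed, only the raw
    waveform is reshaped and (if needed) resampled.
    """
    wav = np.asarray(wav, dtype=np.float32)
    if wav.ndim == 1:
        wav = wav[None, :]           # add a channel axis
    wav = wav[:1, :]                 # keep the first channel only (mono)

    if sr != target_sr:
        # Crude linear-interpolation resampling (no anti-aliasing filter);
        # a real pipeline would use a dedicated resampler instead.
        n_out = int(round(wav.shape[1] * target_sr / sr))
        old_t = np.arange(wav.shape[1]) / sr
        new_t = np.arange(n_out) / target_sr
        wav = np.interp(new_t, old_t, wav[0]).astype(np.float32)[None, :]

    return wav[None, :, :]           # add a batch axis


# Synthetic 1-second stereo clip at 32 kHz stands in for a loaded audio file.
stereo = np.random.default_rng(0).standard_normal((2, 32000))
batch = prepare_raw_audio(stereo, sr=32000, target_sr=16000)  # 16 kHz is assumed
print(batch.shape)
```

The resulting array would then be converted to a tensor and handed directly to the model's encoding step; check the repository README for the exact call and the expected sample rate.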

@ZhangXInFD Thanks for your quick and helpful reply. Your work is really great!