What are the best practices for using this lib for STT?
Alex-Kopylov opened this issue
Alex commented
I have zero experience building STT models, so please advise me.
I'm using your open_stt (thanks!) with SeanNaren/deepspeech.pytorch to build an STT model. As you know, I must provide labels for training.
What is the intuition behind using string.punctuation together with both uppercase and lowercase letters? Should I provide this (below) as labels, or leave only the space and letters (e.g. lowercase only)?
# punctuation + space + rus
self.tgt_vocab = {token: i+5 for i, token in enumerate(punctuation + rus_letters + ' ' + '«»—')}
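
To make the two options in the question concrete, here is a minimal sketch of both label sets and of the transcript normalization each one implies. The names `RUS_LETTERS`, `EXTRA_PUNCT`, `build_labels`, and `normalize` are hypothetical and are not part of open_stt or deepspeech.pytorch; the sketch also assumes `rus_letters` means the 33 lowercase Russian letters, and it leaves out any blank or special tokens the training code may reserve.

```python
from string import punctuation
from typing import List

# Assumption: the 33 lowercase Russian letters, including 'ё'.
RUS_LETTERS = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
# Extra punctuation taken from the snippet in the question.
EXTRA_PUNCT = "«»—"


def build_labels(keep_punctuation: bool) -> List[str]:
    """Return the character labels for one of the two options."""
    chars = RUS_LETTERS + " "
    if keep_punctuation:
        chars = punctuation + chars + EXTRA_PUNCT
    return list(chars)


def normalize(text: str, labels: List[str]) -> str:
    """Lowercase a transcript and replace characters outside the labels with spaces."""
    allowed = set(labels)
    kept = "".join(ch if ch in allowed else " " for ch in text.lower())
    # Collapse runs of spaces left behind by removed characters.
    return " ".join(kept.split())


minimal = build_labels(keep_punctuation=False)  # letters + space
full = build_labels(keep_punctuation=True)      # punctuation + letters + space + «»—

print(len(minimal), len(full))
print(normalize("Привет, мир!", minimal))
print(normalize("Привет, мир!", full))
```

Whichever set is chosen, the transcripts have to be normalized to the same characters, since any symbol missing from the labels cannot be used as a training target.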