nipponjo / wav2letter-ctc-pytorch

wav2letter-ctc-pytorch

Wav2Letter (paper) with raw waveform input, trained with a CTC loss.

The model was trained on LibriSpeech-960. BatchNorm and Dropout were used during training. Dropout is inactive at inference, and the BatchNorm layers can be fused into the convolution weights, which makes the weights compatible with Wav2Letter from torchaudio.models.
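The fusion step can be sketched as follows. This is a minimal NumPy illustration of the standard Conv-then-BatchNorm folding formula, not the script used in this repository. It assumes each BatchNorm directly follows a Conv1d; the function names `fuse_conv_bn` and `conv1d` are hypothetical, and the arrays stand in for the corresponding state-dict tensors (`weight`, `bias`, `running_mean`, `running_var`).

```python
import numpy as np

def fuse_conv_bn(conv_w, conv_b, bn_gamma, bn_beta, bn_mean, bn_var, eps=1e-5):
    """Fold eval-mode BatchNorm statistics into the preceding convolution.

    conv_w: (out_channels, in_channels, kernel_size)
    conv_b: (out_channels,) or None
    bn_*:   per-channel arrays of shape (out_channels,)
    """
    if conv_b is None:
        conv_b = np.zeros(conv_w.shape[0], dtype=conv_w.dtype)
    # BatchNorm in eval mode is y = (x - mean) / sqrt(var + eps) * gamma + beta
    scale = bn_gamma / np.sqrt(bn_var + eps)
    fused_w = conv_w * scale[:, None, None]
    fused_b = (conv_b - bn_mean) * scale + bn_beta
    return fused_w, fused_b

def conv1d(x, w, b):
    """Naive 'valid' 1-D cross-correlation for checking; x: (in_channels, time)."""
    out_ch, _, k = w.shape
    t_out = x.shape[1] - k + 1
    y = np.empty((out_ch, t_out))
    for t in range(t_out):
        y[:, t] = np.tensordot(w, x[:, t:t + k], axes=([1, 2], [0, 1])) + b
    return y

# Random stand-ins for one Conv1d + BatchNorm1d pair (assumed shapes)
rng = np.random.default_rng(0)
w, b = rng.normal(size=(4, 2, 3)), rng.normal(size=4)
gamma, beta = rng.normal(size=4), rng.normal(size=4)
mean, var = rng.normal(size=4), rng.uniform(0.5, 2.0, size=4)
x = rng.normal(size=(2, 16))

# Unfused path: convolution followed by BatchNorm in eval mode
reference = (conv1d(x, w, b) - mean[:, None]) / np.sqrt(var[:, None] + 1e-5)
reference = reference * gamma[:, None] + beta[:, None]

# Fused path: a single convolution with rescaled weights and shifted bias
fused_w, fused_b = fuse_conv_bn(w, b, gamma, beta, mean, var)
fused = conv1d(x, fused_w, fused_b)
```

Both paths should agree up to floating-point error, which is why the fused weights can be loaded into a model that has no BatchNorm layers.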

Pretrained weights

for model.Wav2Letter (link)

for torchaudio.models.Wav2Letter (link)

Greedy decoding

| dataset    | CER   | WER   |
|------------|-------|-------|
| dev-clean  | 0.111 | 0.331 |
| test-clean | 0.105 | 0.318 |
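CER and WER are conventionally computed as the Levenshtein distance between hypothesis and reference, divided by the reference length in characters or words. The sketch below shows that convention in plain Python. It is an assumption that the figures above were computed this way, and `edit_distance` and `error_rate` are hypothetical helper names, not functions from this repository.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (two-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (or match when equal)
            ))
        prev = curr
    return prev[-1]

def error_rate(refs, hyps, unit="char"):
    """Total edit distance over total reference length, per character or per word."""
    errors = total = 0
    for ref, hyp in zip(refs, hyps):
        r = list(ref) if unit == "char" else ref.split()
        h = list(hyp) if unit == "char" else hyp.split()
        errors += edit_distance(r, h)
        total += len(r)
    return errors / total

# Toy example: one substituted character, which also makes one wrong word
refs = ["the cat sat"]
hyps = ["the cat sit"]
cer = error_rate(refs, hyps, unit="char")
wer = error_rate(refs, hyps, unit="word")
```

Because a single wrong character invalidates a whole word, WER is typically several times larger than CER, which matches the pattern in the table.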

Example

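import torch
from torchaudio.models import Wav2Letter
# `labels` is the output vocabulary used in training (not defined in this snippet)
model = Wav2Letter(num_classes=len(labels)).cuda()
model.load_state_dict(torch.load('./pretrained/states_fused.pth'))
model.eval()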
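To turn the model output into text, greedy CTC decoding takes the most likely class per frame, collapses consecutive repeats, and drops the blank symbol. The sketch below shows only that collapsing step in plain Python. The function name `ctc_greedy_decode`, the toy vocabulary, and the blank index 0 are assumptions for illustration; in practice the per-frame ids would come from an argmax over the class dimension of the model output, and the blank index must match the one used in training.

```python
def ctc_greedy_decode(frame_ids, labels, blank=0):
    """Collapse consecutive repeats, then remove blanks (greedy CTC decoding)."""
    chars = []
    prev = None
    for idx in frame_ids:
        # Emit a symbol only when it differs from the previous frame and is not blank
        if idx != prev and idx != blank:
            chars.append(labels[idx])
        prev = idx
    return "".join(chars)

# Hypothetical vocabulary: index 0 is the CTC blank
toy_labels = ["_", "a", "b", "c"]

# Per-frame argmax ids; the blank between the two runs of 1 keeps both "a"s
frame_ids = [0, 3, 3, 0, 1, 1, 0, 1, 2, 2]
text = ctc_greedy_decode(frame_ids, toy_labels)
```

A blank between two identical symbols is what lets CTC represent doubled letters; without it the two runs would collapse into one.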

Some filter kernels from the first Conv1d layer
