# wav2letter-ctc-pytorch

A PyTorch implementation of Wav2Letter (paper) with raw waveform input.
The model was trained on LibriSpeech-960. BatchNorm and Dropout were used during training; at inference, Dropout is a no-op and the BatchNorm statistics can be fused into the convolution weights, making the checkpoint compatible with `Wav2Letter` from `torchaudio.models`.
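Folding BatchNorm into a preceding convolution is a standard transformation; a minimal sketch for the `Conv1d` + `BatchNorm1d` case (the helper name `fuse_conv_bn` is illustrative, not part of this repo or torchaudio):

```python
import torch
import torch.nn as nn

def fuse_conv_bn(conv: nn.Conv1d, bn: nn.BatchNorm1d) -> nn.Conv1d:
    """Fold BatchNorm1d running statistics into the preceding Conv1d."""
    fused = nn.Conv1d(
        conv.in_channels, conv.out_channels, conv.kernel_size[0],
        stride=conv.stride[0], padding=conv.padding[0], bias=True,
    )
    # Per-output-channel scale: gamma / sqrt(running_var + eps)
    scale = bn.weight.data / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.data = conv.weight.data * scale[:, None, None]
    conv_bias = conv.bias.data if conv.bias is not None else torch.zeros(conv.out_channels)
    # New bias: (b - running_mean) * scale + beta
    fused.bias.data = (conv_bias - bn.running_mean) * scale + bn.bias.data
    return fused
```

The fused layer reproduces `bn(conv(x))` in eval mode exactly, so the BatchNorm modules can be dropped from the checkpoint.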
## Pretrained weights

- for `model.Wav2Letter` (link)
- for `torchaudio.models.Wav2Letter` (link)
## Greedy decoding

| dataset | CER | WER |
|---|---|---|
| dev-clean | 0.111 | 0.331 |
| test-clean | 0.105 | 0.318 |
## Example

```python
import torch
from torchaudio.models import Wav2Letter

# `labels` is the output alphabet (blank, characters, ...) used in training
model = Wav2Letter(num_classes=len(labels)).cuda()
model.load_state_dict(torch.load('./pretrained/states_fused.pth'))
```
## Some filter kernels from the first Conv1d layer