How to set up an ASR model trained from scratch when the tokenizer is 4257_unigram.model
ChasingStar95 opened this issue · comments
Part of my config is shown below. I don't know how to find tokenizer.ckpt, because my tokenizer is in *.model format:
pretrained_path: speechbrain/asr-crdnn-rnnlm-librispeech
pretrainer: !new:speechbrain.utils.parameter_transfer.Pretrainer
    collect_in: !ref <save_folder>
    loadables:
        lm: !ref <lm_model>
        tokenizer: !ref
        model: !ref
    paths:
        lm: !ref <pretrained_path>/lm.ckpt
        tokenizer: !ref <pretrained_path>/tokenizer.ckpt
        model: !ref <pretrained_path>/asr.ckpt
If you use the paths argument, you can provide any filepath, so you can just change
tokenizer: !ref <pretrained_path>/tokenizer.ckpt
to
tokenizer: your/path/to/4257_unigram.model
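For illustration, the paths block from the question might look like this after the change. This is only a sketch: your/path/to is a placeholder for wherever the SentencePiece file actually lives, and the lm and model entries are assumed to stay exactly as in the original config.

```yaml
    paths:
        lm: !ref <pretrained_path>/lm.ckpt
        # Point straight at the SentencePiece file instead of tokenizer.ckpt.
        # "your/path/to" is a placeholder, not a real location.
        tokenizer: your/path/to/4257_unigram.model
        model: !ref <pretrained_path>/asr.ckpt
```

The tokenizer entry no longer needs !ref, since it does not interpolate <pretrained_path> or any other key.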
If the LM test loss is 1.5 or close to that, what can I do to fix this problem?
I don't understand, could you elaborate? Why do you mention the LM test loss here? Or do you mean that you have a separate issue with your LM training? If so, please open a separate issue for that and provide some more context.
ok, thank you