Usage with kaldi-gstreamer-server and Estonian model
# go to the sherpa root directory
python ./sherpa/bin/streaming_server.py \
--port=6006 \
--encoder-model="path/to/icefall_pruned_transducer_stateless7_streaming_et/encoder_jit_trace.pt" \
--decoder-model="path/to/icefall_pruned_transducer_stateless7_streaming_et/decoder_jit_trace.pt" \
--joiner-model="path/to/icefall_pruned_transducer_stateless7_streaming_et/joiner_jit_trace.pt" \
--tokens="path/to/icefall_pruned_transducer_stateless7_streaming_et/data/lang_bpe_1000/tokens.txt"
# go to the kaldi-gstreamer-server root directory
python kaldigstserver/client.py -r=32000 test/data/lause2.raw
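The server streams recognition results over a WebSocket, and the client sends raw 16-bit PCM audio (the `-r=32000` byte rate above corresponds to 16 kHz, 16-bit mono). As a rough sketch of the client-side audio preparation, the helper below converts int16 PCM bytes into float samples in [-1, 1] and splits them into fixed-size chunks suitable for streaming. The function name `pcm16_to_float_chunks` and the chunk size are illustrative assumptions, not part of sherpa's API.

```python
# Hedged sketch: prepare 16-bit PCM audio for chunked streaming.
# Assumes little-endian int16 samples, as in a typical .raw test file.
import array

def pcm16_to_float_chunks(pcm_bytes: bytes, chunk_samples: int = 3200):
    """Convert int16 PCM bytes to float samples in [-1, 1],
    split into chunks of `chunk_samples` samples each.

    3200 samples = 0.2 s at 16 kHz (an arbitrary example chunk size).
    """
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(pcm_bytes)
    floats = [s / 32768.0 for s in samples]
    return [
        floats[i : i + chunk_samples]
        for i in range(0, len(floats), chunk_samples)
    ]
```

Each chunk could then be sent over the WebSocket connection in sequence, with the final (possibly shorter) chunk followed by whatever end-of-stream marker the server expects.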
sherpa
is an open-source speech-to-text inference framework based on
PyTorch, focusing exclusively on end-to-end (E2E) models,
namely transducer- and CTC-based models. It provides both C++ and Python APIs.
This project focuses on deployment, i.e., using pre-trained models to transcribe speech. If you are interested in how to train or fine-tune your own models, please refer to icefall.
We also have similar projects that don't depend on PyTorch:
sherpa-onnx
and sherpa-ncnn.
They also support iOS, Android, and embedded systems.
Please refer to the documentation at https://k2-fsa.github.io/sherpa/
Try sherpa from within your browser without installing anything:
https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition