Automatic Speech Recognition

Implementation of models for the Automatic Speech Recognition problem.

  1. QuartzNet with the (BxS)xR architecture (a minimal sub-block sketch is given after this list)

  2. Deepspeech
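
For reference, the core building element of QuartzNet is the time-channel separable 1D convolution: in the (BxS)xR scheme, R such sub-blocks form a block with a residual connection, and each block is repeated S times. The snippet below is a minimal PyTorch sketch of one sub-block, not the module used in this repository; the class name TCSConv and the shapes are illustrative assumptions.

import torch
import torch.nn as nn

class TCSConv(nn.Module):
    """Time-channel separable 1D convolution, the basic QuartzNet sub-block (sketch)."""
    def __init__(self, in_channels, out_channels, kernel_size):
        super().__init__()
        # Depthwise convolution over time: one filter per input channel
        self.depthwise = nn.Conv1d(in_channels, in_channels, kernel_size,
                                   padding=kernel_size // 2, groups=in_channels)
        # Pointwise 1x1 convolution that mixes channels
        self.pointwise = nn.Conv1d(in_channels, out_channels, kernel_size=1)
        self.bn = nn.BatchNorm1d(out_channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        # x: (batch, n_mels, time), e.g. a mel-spectrogram
        return self.relu(self.bn(self.pointwise(self.depthwise(x))))

features = torch.randn(8, 64, 100)                         # batch of 8, 64 mel bins, 100 frames
print(TCSConv(64, 256, kernel_size=33)(features).shape)    # torch.Size([8, 256, 100])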

Notebook

Open In Colab

Getting Started

Clone the repository and step into it:

git clone https://github.com/khaykingleb/ASR.git
cd ASR

Install the requirements and the package:

pip install -r requirements.txt
python setup.py install

To train a model:

python train.py -c configs/config_name.json

To test a model:

python test.py \
      -c default_test_model/config.json \
      -r default_test_model/checkpoint.pth \
      -o result.json

Please note that to test the model you need to specify the dataset in test.py, for instance LibrispeechDataset:

config.config["data"] = {
    "test": {
        "batch_size": args.batch_size,
        "num_workers": args.jobs,
        "datasets": [
            {
                "type": "LibrispeechDataset",
                "args": {
                    "part": "test-clean"
                }
            }
        ]
    }
}
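
The same override works for other Librispeech splits; for example, to evaluate on the noisier split (assuming LibrispeechDataset accepts the standard Librispeech part names):

config.config["data"]["test"]["datasets"][0]["args"]["part"] = "test-other"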

Data Used


License

MIT License

