Wake Word Detector

This repository contains two wake word detector models: one trained to recognize the English word "Arise" and one trained to recognize the English word "Awaken".

File Organization

Here is the general organization of the most important files in this repo.

Wake_Word_Detector/
├── README.md
├── requirements.txt
├── Data/
│   ├── test/
│   │   ├── neg/
│   │   ├── pos-arise/
│   │   └── pos-awaken/
│   └── train/
│       ├── neg/
│       ├── pos-arise/
│       └── pos-awaken/
├── models/
│   ├── ARISE.pth
│   └── AWAKEN.pth
└── src/
    ├── record.py
    ├── process.py
    ├── train.py
    ├── interval_listen.py
    ├── continuous_listen.py
    └── constants.py

The Data folder contains all the training and test data for the PyTorch models.
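
Before processing or training, it can help to check how many clips each split and class folder holds. The following is a minimal sketch, not part of the repository: the helper name count_clips is hypothetical, and it assumes recordings are stored as .wav files directly inside the class folders shown in the tree above.

```python
from pathlib import Path

# Class folder names taken from the directory tree above.
CLASS_DIRS = ("neg", "pos-arise", "pos-awaken")


def count_clips(data_root):
    """Return {(split, class_dir): number of .wav files} for the Data/ layout."""
    root = Path(data_root)
    counts = {}
    for split in ("train", "test"):
        for class_dir in CLASS_DIRS:
            folder = root / split / class_dir
            # A missing folder counts as zero clips rather than raising.
            counts[(split, class_dir)] = (
                len(list(folder.glob("*.wav"))) if folder.is_dir() else 0
            )
    return counts


if __name__ == "__main__":
    for (split, class_dir), n in count_clips("./Data").items():
        print(f"{split}/{class_dir}: {n} clips")
```

Run from the repository root, this prints one line per split and class, which makes an unbalanced or empty folder easy to spot.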

train.py trains and tests the PyTorch model using the data (stored in csv files) generated by process.py.

Usage

Dependencies

To use this codebase, install all the dependencies listed in the requirements.txt file (for example, with pip install -r requirements.txt).

Running Existing Model

You can try out the existing model by running the following command.

python ./src/continuous_listen.py

This starts an infinite loop, which you can terminate with CTRL+C, in which the model reports whenever it hears "Arise".

Recording New Data

If you would like to record your own data to add to the dataset, you can do so by running the following command.

python ./src/record.py [auto/manual] [saved-dir] [start-point]

You can select between auto and manual mode. In auto mode, the program continuously records a set number of 2-second audio clips. In manual mode, you are asked to press Enter to start the recording of each 2-second clip.

Next, [saved-dir] specifies where to save the recordings. To make the data processing step easier, save them into the ./Data/train or ./Data/test directory, inside the appropriately labelled negative or positive directory.

Finally, start-point tells the program how to start numbering the names of these audio files. For example, if it's 5, then your recordings will be saved as 5.wav, 6.wav, 7.wav and so on.
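
The numbering scheme can be sketched as follows. This is an illustration only, not the code in record.py: the helper names clip_paths and next_start_point are hypothetical, and starting from 0 in an empty directory is an assumption.

```python
from pathlib import Path


def clip_paths(saved_dir, start_point, num_clips):
    """Return paths for num_clips consecutive recordings: 5.wav, 6.wav, ..."""
    folder = Path(saved_dir)
    return [folder / f"{i}.wav" for i in range(start_point, start_point + num_clips)]


def next_start_point(saved_dir):
    """Suggest a start-point one past the highest numbered .wav in saved_dir."""
    numbers = [
        int(p.stem) for p in Path(saved_dir).glob("*.wav") if p.stem.isdecimal()
    ]
    # Assumption: numbering starts at 0 when no numbered clips exist yet.
    return max(numbers) + 1 if numbers else 0


if __name__ == "__main__":
    print([p.name for p in clip_paths("./Data/train/neg", 5, 3)])
    print(next_start_point("./Data/train/neg"))
```

Picking a start-point one past the highest existing number avoids overwriting earlier recordings when you add to a directory that already holds clips.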

Data Processing

Run the following command to process the data in the ./Data/train/ and ./Data/test/ directories.

python ./src/process.py

All the data will be labelled and saved as .csv files, one in the train directory and one in the test directory. Note that these files are not human-readable, because they have been pickled so that they can be processed faster. However, a human-readable example csv file called train_mfccs2.csv already exists in ./Data/train/, if you would like to see how the data is saved.
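
To illustrate why a pickled file with a .csv extension does not open as text, and how one could be read back, here is a small round-trip sketch. It assumes the files are written with Python's standard pickle module, which this README does not confirm; the file name example.csv and the row layout are stand-ins, not the real output of process.py.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickled(path):
    """Load a pickled object; the file extension does not matter to pickle."""
    with open(path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    # Stand-in rows of (label, feature values); the real layout is not shown here.
    rows = [(1, [0.1, 0.2]), (0, [0.3, 0.4])]
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "example.csv"  # hypothetical file name
        with open(path, "wb") as f:
            pickle.dump(rows, f)
        print(path.read_bytes()[:2])       # binary pickle header, not CSV text
        print(load_pickled(path) == rows)  # the data survives the round trip
```

If the files were instead written through pandas, pandas.read_pickle would be the analogous call. In either case, only unpickle files you trust, since loading a pickle can execute arbitrary code.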

Training

In order to train the model using the data in ./Data/, simply run the following command.

python ./src/train.py ["save"] [model-name]

The two optional arguments are used when you want to save the model into the ./models/ directory. An example use would look like: python ./src/train.py save ARISE.
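
One way the two optional arguments could map to a save path is sketched below. This is not the logic in train.py: the function name parse_train_args is hypothetical, and the .pth extension is an assumption based on the file names in the models directory.

```python
from pathlib import Path


def parse_train_args(argv, models_dir="./models"):
    """Return the path to save the model to, or None if saving was not requested."""
    if len(argv) == 0:
        return None  # train and test only, nothing is saved
    if len(argv) == 2 and argv[0] == "save":
        # Assumption: models are stored as <model-name>.pth inside models_dir.
        return Path(models_dir) / f"{argv[1]}.pth"
    raise SystemExit('usage: train.py ["save"] [model-name]')
```

For example, parse_train_args(["save", "ARISE"]) yields a path ending in models/ARISE.pth, while an empty argument list yields None.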

Running Saved Model

Finally, you can run your model with the following command.

python ./src/continuous_listen.py [saved-model-dir]

The last argument specifies where your saved model is.

Further customization

For further customization, editing the file paths within record.py, process.py and train.py covers most needs.

abijitj / Wake_Word_Detector