Real-time Speaker Recognition

This repository contains algorithms for real-time speaker recognition applications. Recognition is implemented with either a Gaussian Mixture Model (GMM) or a Convolutional Neural Network (CNN). For the GMM variant, a dynamic threshold can be used to improve recognition efficiency, but it sharply increases training time.

Usage (GMM)

Enroll your WAV files into a model file (model.out), then launch the Python script RTSP.py:

cd ./GMM
python3 speaker_recognition.py -t enroll -i ./path/to/wav_files_folder/* -m ./your-output-models/model.out
python3 RTSP.py

Once the model is loaded, a prediction is made every three seconds, for 15 seconds in total (five predictions). To change the total duration, edit the condition of the while loop at line 103 (tmp < 5).
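The relation between the loop bound and the total duration can be sketched as follows. This is only an illustration under stated assumptions, not the code in RTSP.py: `run_predictions`, `predict`, `interval_s` and `iterations` are hypothetical names, and the `sleep` call stands in for recording one chunk of audio.

```python
import time


def total_duration(interval_s=3.0, iterations=5):
    """Total run time in seconds: one prediction per interval."""
    return interval_s * iterations


def run_predictions(predict, interval_s=3.0, iterations=5, sleep=time.sleep):
    """Call predict() once per interval, `iterations` times in total.

    `predict` is a hypothetical callable returning a speaker label;
    `sleep` is injectable so the loop can be exercised without waiting.
    """
    results = []
    tmp = 0  # counter named after the `tmp < 5` condition quoted above
    while tmp < iterations:
        sleep(interval_s)  # placeholder for capturing one audio chunk
        results.append(predict())
        tmp += 1
    return results


if __name__ == "__main__":
    # Dummy predictor and a no-op sleep, so the demo finishes instantly.
    labels = run_predictions(lambda: "unknown", sleep=lambda s: None)
    print(len(labels), "predictions over", total_duration(), "seconds")
```

Under these assumptions, raising the bound from 5 to 10 would double the run to 30 seconds, while the three-second interval between predictions stays the same.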

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

GNU General Public License v3.0
