audio-processing python speech-and-audio-processing speech-recognition

Speech and Audio Processing: Word Detection System

University of Piraeus | Department of Informatics

BSc course: Speech and Audio Processing

Semester: 8

Project Completion Year: 2024

Description

This project implements a word detection system that segments spoken sentences into individual words using various classifiers. Given an audio recording of speech, it returns the time intervals of the detected words. The goal is an efficient, speaker-independent solution for speech recognition tasks, using classifiers such as Least Squares, SVM, RNN, and MLP.
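To illustrate what "returning the time intervals of the detected words" can look like, here is a minimal sketch that groups per-frame classifier decisions into (start, end) intervals in seconds. It is not taken from this repository: the function name `frames_to_intervals`, the parameters `hop_seconds` and `min_duration`, and the assumption that the classifier emits one 0/1 label per fixed-length frame are all hypothetical.

```python
def frames_to_intervals(labels, hop_seconds, min_duration=0.0):
    """Group runs of speech frames (label 1) into (start, end) intervals in seconds.

    Assumption: `labels` holds one 0/1 decision per frame and frames are
    spaced `hop_seconds` apart. Both names are illustrative only.
    """
    intervals = []
    start = None  # index of the first frame of the current run, if any
    for i, label in enumerate(labels):
        if label == 1 and start is None:
            start = i  # a run of speech frames begins
        elif label != 1 and start is not None:
            intervals.append((start * hop_seconds, i * hop_seconds))
            start = None  # the run ended at the previous frame
    if start is not None:
        # the recording ended while a run was still open
        intervals.append((start * hop_seconds, len(labels) * hop_seconds))
    # discard runs shorter than the minimum duration
    return [(s, e) for s, e in intervals if e - s >= min_duration]


# Hypothetical frame labels with a 0.25 s hop; runs shorter than 0.5 s are dropped.
example = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1]
print(frames_to_intervals(example, hop_seconds=0.25, min_duration=0.5))
# prints [(0.25, 1.0), (1.5, 2.0)]
```

Filtering by a minimum duration is one simple way to suppress isolated misclassified frames; the actual post-processing used by the project may differ.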

How to Run

Clone the repository:

git clone https://github.com/dimitrisstyl7/speech-and-audio-processing-project.git

Navigate to the project directory:

cd speech-and-audio-processing-project

Create and activate a virtual environment:

On Linux/Mac

python3 -m venv venv
source venv/bin/activate

On Windows

python -m venv venv
venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Train the classifiers:

On Linux/Mac

python3 train_classifiers.py

On Windows

python train_classifiers.py

Run word detection:

On Linux/Mac

python3 word_detector.py

On Windows

python word_detector.py

Notes

Audio playback in word_detector.py requires VLC media player to be installed on your computer.
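As a quick pre-flight check, the sketch below looks for a `vlc` executable on the system PATH using the standard library. This is only a rough heuristic and an assumption about your setup: the helper name `vlc_on_path` is hypothetical, and on Windows or macOS VLC may be installed without being on PATH, so a negative result does not necessarily mean playback will fail.

```python
import shutil


def vlc_on_path(executable="vlc"):
    """Return True if an executable with the given name is found on PATH.

    Assumption: VLC is exposed as `vlc` on PATH, which is common on Linux
    but not guaranteed on other platforms.
    """
    return shutil.which(executable) is not None


if not vlc_on_path():
    print("VLC was not found on PATH; audio playback may not work.")
```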

Acknowledgments

This project was developed as part of the "Speech and Audio Processing" BSc course at the University of Piraeus. Contributions and feedback are always welcome!

License

This project is licensed under the MIT License - see the LICENSE file for details.
