BSc course: Speech and Audio Processing
Semester: 8
Project Completion Year: 2024
This project implements a word detection system that segments spoken sentences into individual words. It analyzes audio recordings of speech and returns the time intervals of the detected words. The goal is an efficient, speaker-independent solution for speech recognition tasks, built on classifiers such as Least Squares, SVM, RNN, and MLP.
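To illustrate what "returning the time intervals of the detected words" can look like, here is a minimal sketch. It assumes a classifier has already labelled each audio frame as speech (1) or background (0). The function name `frames_to_intervals` and the parameters `hop_seconds` and `min_duration` are hypothetical and are not taken from this repository, whose actual post-processing may differ.

```python
def frames_to_intervals(labels, hop_seconds, min_duration=0.0):
    """Convert per-frame 0/1 labels into (start, end) intervals in seconds.

    labels       -- sequence of 0/1 frame predictions (assumed classifier output)
    hop_seconds  -- time step between consecutive frames (assumed constant)
    min_duration -- intervals shorter than this are discarded as noise
    """
    intervals = []
    start = None  # index of the first frame of the current speech run
    for i, label in enumerate(labels):
        if label and start is None:
            start = i  # a speech run begins
        elif not label and start is not None:
            intervals.append((start * hop_seconds, i * hop_seconds))
            start = None  # the speech run ended at frame i
    if start is not None:
        # The recording ended while still inside a speech run.
        intervals.append((start * hop_seconds, len(labels) * hop_seconds))
    return [(s, e) for s, e in intervals if e - s >= min_duration]


if __name__ == "__main__":
    # Hypothetical frame labels with a 10 ms hop between frames.
    predictions = [0, 1, 1, 1, 0, 0, 1, 1, 0]
    for begin, end in frames_to_intervals(predictions, hop_seconds=0.01):
        print(f"word from {begin:.2f}s to {end:.2f}s")
```

Dropping very short runs through `min_duration` is one simple way to suppress isolated misclassified frames; a median filter over the labels would be an alternative.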
- Clone the repository:
git clone https://github.com/dimitrisstyl7/speech-and-audio-processing-project.git
- Navigate to the project directory:
cd speech-and-audio-processing-project
- Create and activate a virtual environment:
On Linux/Mac
python3 -m venv venv
source venv/bin/activate
On Windows
python -m venv venv
venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Run the program - Classifier Training:
On Linux/Mac
python3 train_classifiers.py
On Windows
python train_classifiers.py
- Run the program - Word Detection:
On Linux/Mac
python3 word_detector.py
On Windows
python word_detector.py
- To play audio through the word_detector.py program, you must have VLC media player installed on your computer.
This project was developed as part of the "Speech and Audio Processing" BSc course at the University of Piraeus. Contributions and feedback are always welcome!
This project is licensed under the MIT License - see the LICENSE file for details.