dimitrisstyl7 / speech-and-audio-processing-project

Word detection system that segments spoken sentences into individual words using various classifiers

Repository from GitHub: https://github.com/dimitrisstyl7/speech-and-audio-processing-project

Speech and Audio Processing: Word Detection System

BSc course: Speech and Audio Processing

Semester: 8

Project Completion Year: 2024

Description

This project implements a word detection system that segments spoken sentences into individual words using various classifiers. The system analyzes audio recordings of speech and returns the time intervals of the detected words. It aims to provide an efficient, speaker-independent solution for speech recognition tasks, using classifiers such as Least Squares, SVM (support vector machine), RNN (recurrent neural network), and MLP (multilayer perceptron).
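To illustrate how per-frame classifier decisions can become word time intervals, here is a minimal sketch. It assumes the classifier emits one binary label per audio frame (1 for speech, 0 for background) at a fixed hop between frames. The function name `frames_to_intervals` and the parameters `hop_seconds` and `min_duration` are hypothetical and are not taken from this repository, whose actual post-processing may differ.

```python
from typing import List, Sequence, Tuple


def frames_to_intervals(
    labels: Sequence[int],
    hop_seconds: float,
    min_duration: float = 0.0,
) -> List[Tuple[float, float]]:
    """Merge runs of consecutive speech frames (label 1) into (start, end) times.

    Assumption: frame i covers the time span [i * hop_seconds, (i + 1) * hop_seconds).
    """
    intervals: List[Tuple[float, float]] = []
    start = None  # index of the first frame of the current run, if any
    # Append a trailing 0 so a run that reaches the last frame is closed too.
    for i, label in enumerate(list(labels) + [0]):
        if label == 1 and start is None:
            start = i  # a new run of speech frames begins
        elif label != 1 and start is not None:
            begin, end = start * hop_seconds, i * hop_seconds
            # Drop runs shorter than min_duration (likely clicks or noise).
            if end - begin >= min_duration:
                intervals.append((begin, end))
            start = None
    return intervals


if __name__ == "__main__":
    # Hypothetical per-frame predictions: 1 = speech, 0 = background.
    predicted = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1]
    print(frames_to_intervals(predicted, hop_seconds=0.5))
    # prints [(0.5, 2.0), (3.0, 4.0), (4.5, 5.0)]
```

Raising `min_duration` is one simple way to suppress very short detections; whether that suits a given recording depends on the frame size and the speaker's pace.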

How to Run

  1. Clone the repository:
git clone https://github.com/dimitrisstyl7/speech-and-audio-processing-project.git
  2. Navigate to the project directory:
cd speech-and-audio-processing-project
  3. Create and activate a virtual environment:

On Linux/Mac

python3 -m venv venv
source venv/bin/activate

On Windows

python -m venv venv
venv\Scripts\activate
  4. Install dependencies:
pip install -r requirements.txt
  5. Run the program - Classifiers Training:

On Linux/Mac

python3 train_classifiers.py

On Windows

python train_classifiers.py
  6. Run the program - Word Detection:

On Linux/Mac

python3 word_detector.py

On Windows

python word_detector.py

Notes

  • To play audio from the word_detector.py program, VLC media player must be installed on your computer.
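As a quick sanity check before running the detector, the sketch below looks for a `vlc` executable on the system PATH. This is only a heuristic: the helper name `find_vlc_executable` is hypothetical, VLC installers (notably on Windows) do not always add the executable to PATH, and how word_detector.py actually locates VLC is not stated here, so a negative result does not prove that playback will fail.

```python
import shutil
from typing import Optional


def find_vlc_executable() -> Optional[str]:
    """Return the path of a `vlc` executable found on PATH, or None if none is found."""
    return shutil.which("vlc")


if __name__ == "__main__":
    path = find_vlc_executable()
    if path is None:
        # Not conclusive: VLC may be installed in a directory that is not on PATH.
        print("No vlc executable found on PATH; VLC may still be installed elsewhere.")
    else:
        print(f"Found VLC at {path}")
```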

Acknowledgments

This project was developed as part of the "Speech and Audio Processing" BSc course at the University of Piraeus. Contributions and feedback are always welcome!

License

This project is licensed under the MIT License - see the LICENSE file for details.
