Karthick47v2 / mock-buddy-audio-server

audio processing service for mock-buddy

Welcome to mock-buddy-audio-server 👋

License: MIT

This repo contains the audio processing service (a microservice) for the Mock-Buddy application. It uses Flask to expose WebSocket and RESTful APIs; you can see the project description here. The service is deployed on Heroku.

The system analyzes a recorded speech and generates a report based on how good the user's voice is throughout the speech and on their speech rate. Both strongly affect audience engagement: if the speaker is not confident about their speech, audience engagement decreases over time. Speaking without fear and delivering an engaging rather than monotonous speech are key points for increasing audience engagement. The speaker also needs to be aware of their speech rate while giving a presentation or speech, because even after careful practice they may speak faster out of joy or slower out of fear.

The workflow is:

  1. Detect speech rate
  2. Detect speech confidence

  • Speech rate

    Speech rate is calculated by dividing the number of words spoken by the time taken. The recorded speech is transcribed using the Google Speech-to-Text API, and the spoken time is calculated accurately with the help of VAD (voice activity detection).

  • Speech confidence

    The speech confidence score is calculated from the output of a speech emotion classifier. A CNN architecture was used to build the speech emotion classifier (aka speech emotion recognition). The model was trained on the RAVDESS, SAVEE, and TESS datasets. Training details are in this repo.
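
The speech rate calculation described above can be sketched roughly as follows. This is a minimal illustration, not the code used in this service: the function names, the 30 ms frame length, and the per-frame boolean VAD output are all assumptions made for the example. The transcript stands in for text returned by a speech-to-text service.

```python
from typing import Sequence


def spoken_seconds(voiced_flags: Sequence[bool], frame_ms: int = 30) -> float:
    """Total duration, in seconds, of the frames a VAD marked as speech."""
    voiced_frames = sum(1 for flag in voiced_flags if flag)
    return voiced_frames * frame_ms / 1000.0


def speech_rate_wpm(transcript: str, spoken_sec: float) -> float:
    """Words per minute: number of words divided by spoken time."""
    if spoken_sec <= 0:
        return 0.0  # avoid dividing by zero when no speech was detected
    word_count = len(transcript.split())
    return word_count / spoken_sec * 60.0


# Hypothetical inputs: a transcript and per-frame VAD decisions (30 ms frames).
transcript = "thank you all for coming today"
voiced_flags = [True] * 80 + [False] * 20  # 80 voiced frames, 20 silent frames
rate = speech_rate_wpm(transcript, spoken_seconds(voiced_flags))
print(f"{rate:.1f} words per minute")
```

Counting only voiced frames keeps pauses and leading or trailing silence from lowering the measured rate, which is the reason for using VAD instead of the raw recording length.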

Prerequisites

  • FFmpeg, portaudio19-dev (for audio processing)
  • Google Cloud account
  • Environment variables: GOOGLE_APPLICATION_CREDENTIALS (path to google_credentials.json) and GOOGLE_CREDENTIALS (content of that file)
  • Python 3.7 or newer
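
On hosts where you cannot upload a credentials file directly, one common pattern is to write the content of GOOGLE_CREDENTIALS to the path given by GOOGLE_APPLICATION_CREDENTIALS at startup. The sketch below assumes that pattern; the helper name `ensure_credentials_file` is hypothetical and this is not necessarily how app.py handles the two variables.

```python
import os
from pathlib import Path
from typing import Mapping


def ensure_credentials_file(env: Mapping[str, str] = os.environ) -> Path:
    """Create the credentials file from GOOGLE_CREDENTIALS if it does not exist.

    Hypothetical helper: reads the target path from
    GOOGLE_APPLICATION_CREDENTIALS and the JSON content from GOOGLE_CREDENTIALS.
    """
    path = Path(env["GOOGLE_APPLICATION_CREDENTIALS"])
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(env["GOOGLE_CREDENTIALS"])
    return path
```

Calling `ensure_credentials_file()` once before the first Google Cloud client is created would be enough under this assumption; an existing file is left untouched.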

Install

pip install -r requirements.txt

Usage

python3 app.py

Author

👤 Karthick T. Sharma

🤝 Contributing

Contributions, issues and feature requests are welcome!
Feel free to check the issues page.

Show your support

Give a ⭐️ if this project helped you!

Languages

PureBasic 78.0%, Python 21.8%, Procfile 0.1%, Shell 0.1%