iamjanvijay / Background-Sound-Classification-in-Speech-Audio-Segments

A CNN based approach for sound classification.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Sound Classification

Contains code for CNNs and log-mel-spectrograms based sound classification.

What does this code support?

  1. Feature Extraction: Extracts log-mel-spectrograms from audios in UrbanSound8K and stores them in the hdf5 file.

Usage: python utils/features.py logmel --dataset_dir=$URBANSOUND8K_DATASET_DIR --workspace=$WORKSPACE

  1. Training: Trains the model for 2000 iterations at max (can be changed), computes the training and validation accuracy after every 100 iterations. Saves the corresponding logs and models (after every 1000 steps) in $WORKSPACE.

Usage: python pytorch/main_pytorch.py train --workspace='workspace' --validation_fold='10' --validate --cuda

  1. Plotting: There is support to plot confusion matrix, and log-mel-specs as well, look out for file: utils/plot_figures.py and utils/utilities.py

How to set up virtual environment?

Execute the following commands in sequence:

  • pip install virtualenv
  • virtualenv -p python3 venv_noise
  • source venv_noise/bin/activate
  • pip install -r requirements.txt

About

A CNN based approach for sound classification.


Languages

Language:Python 98.1%Language:Shell 1.9%