ruohoruotsi / LSTM-Music-Genre-Classification

Music genre classification with LSTM Recurrent Neural Nets in Keras & PyTorch

audio-features-extracted classification genre gtzan-dataset keras lstm music music-genre-classification python3 pytorch rnn

Music Genre Classification with LSTMs

Classify music files based on genre from the GTZAN music corpus
The GTZAN corpus is included for ease of use
Use multiple layers of LSTM Recurrent Neural Nets
Implementations in PyTorch, PyTorch-Lightning, Keras
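
As a rough illustration of what "multiple layers of LSTM" can look like in PyTorch, here is a minimal sketch. It is not the repository's actual model: the class name `GenreLSTM`, the feature dimension (20), the sequence length (128), the hidden size, the layer count, and the number of genres (10, the size of the full GTZAN label set) are all assumptions chosen for the example.

```python
import torch
import torch.nn as nn


class GenreLSTM(nn.Module):
    """Hypothetical stacked LSTM with a linear classifier on the last time step."""

    def __init__(self, n_features=20, hidden_size=128, num_layers=2, n_genres=10):
        super().__init__()
        # num_layers > 1 stacks LSTM layers: each layer consumes the
        # output sequence of the layer below it.
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=hidden_size,
            num_layers=num_layers,
            batch_first=True,
        )
        self.classifier = nn.Linear(hidden_size, n_genres)

    def forward(self, x):
        # x has shape (batch, time, n_features)
        output, _ = self.lstm(x)
        # Keep only the top layer's output at the final time step.
        last_step = output[:, -1, :]
        return self.classifier(last_step)


model = GenreLSTM()
batch = torch.randn(4, 128, 20)   # 4 clips, 128 frames, 20 features per frame
logits = model(batch)             # one unnormalized score per genre
predicted = logits.argmax(dim=1)  # index of the highest-scoring genre
```

Classifying from the last time step is only one option; averaging the outputs over time is a common alternative.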

Test trained LSTM model

In ./weights/ you can find the trained model weights and the model architecture.

To test the model on your own audio file, run:

 python3 predict_example.py path/to/custom/file.mp3

Or, to test the model on one of the example files included in the repository, run:

 python3 predict_example.py audio/classical_music.mp3

Audio features extracted

Dependencies

Python3
numpy
librosa → for audio feature extraction
Keras
- pip install keras
PyTorch
- pip install torch torchvision
- brew install libomp

Ideas for improving accuracy:

The GTZAN dataset has known problems; how do we account for them when using it?
Normalize MFCCs & other input features (Recurrent BatchNorm?)
Decay learning rate
How are we initializing the weights?
Better optimization hyperparameters (too little dropout)
Do you have avoidable bias? How's your variance?
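
Two of the ideas above, input normalization and learning-rate decay, can be sketched in a few lines. This is a minimal example under stated assumptions, not code from the repository: the helper names `standardize` and `step_decay`, the array shapes, and the decay settings (halve every 100 epochs) are hypothetical, and plain per-feature standardization stands in here for the recurrent batch normalization mentioned in the list.

```python
import numpy as np


def standardize(train, other):
    """Scale each feature to zero mean and unit variance.

    Both arrays have shape (examples, time, features). The statistics
    come from the training set only, so nothing leaks from the
    validation or test data.
    """
    mean = train.mean(axis=(0, 1), keepdims=True)
    std = train.std(axis=(0, 1), keepdims=True) + 1e-8  # avoid division by zero
    return (train - mean) / std, (other - mean) / std


def step_decay(initial_lr, epoch, drop=0.5, every=100):
    """Multiply the learning rate by `drop` once per `every` epochs."""
    return initial_lr * drop ** (epoch // every)


# Synthetic stand-in for MFCC-like features (assumed shapes).
rng = np.random.default_rng(0)
train = rng.normal(loc=5.0, scale=3.0, size=(8, 16, 4))
valid = rng.normal(loc=5.0, scale=3.0, size=(2, 16, 4))
train_n, valid_n = standardize(train, valid)

# Learning rate at a few epochs of a 400-epoch run.
lrs = [step_decay(0.001, epoch) for epoch in (0, 99, 100, 399)]
```

With these settings the learning rate stays at 0.001 through epoch 99, drops to 0.0005 at epoch 100, and reaches 0.000125 by epoch 399.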

Accuracy

At Epoch 400, training on a TITAN X GPU (October 2017):

| Split | Loss | Accuracy |
| --- | --- | --- |
| Training | `0.5801` | `0.7810` |
| Validation | `0.734523485104` | `0.766666688025` |
| Testing | `0.900845060746` | `0.683333342274` |

At Epoch 400, training on a 2018 MacBook Pro CPU (May 2019):

| Split | Loss | Accuracy |
| --- | --- | --- |
| Training | `0.3486` | `0.8738` |
| Validation | `1.028421084086` | `0.700000017881` |
| Testing | `1.209656755129` | `0.683333347241` |
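
To relate these numbers to the bias/variance question raised earlier, it can help to look at the gap between training accuracy and held-out accuracy. The sketch below only does arithmetic on the accuracies reported in the two tables above; the helper name `generalization_gaps` is made up for this example.

```python
# Accuracies copied from the two tables above.
runs = {
    "TITAN X GPU (October 2017)": {
        "train": 0.7810, "val": 0.766666688025, "test": 0.683333342274,
    },
    "MacBook Pro CPU (May 2019)": {
        "train": 0.8738, "val": 0.700000017881, "test": 0.683333347241,
    },
}


def generalization_gaps(acc):
    """Return how far training accuracy sits above validation and test accuracy."""
    return {
        "train_minus_val": acc["train"] - acc["val"],
        "train_minus_test": acc["train"] - acc["test"],
    }


for name, acc in runs.items():
    gaps = generalization_gaps(acc)
    print(
        f"{name}: train-val gap {gaps['train_minus_val']:.3f}, "
        f"train-test gap {gaps['train_minus_test']:.3f}"
    )
```

The train-test gap comes out at roughly 0.098 for the 2017 run and 0.190 for the 2019 run, while test accuracy is essentially the same in both. One plausible reading is that the later run fits the training set more closely without generalizing better, which would point toward variance rather than bias.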

MIT License
