Bird Species Classification

These are the project files for a master's thesis carried out at Chalmers University of Technology. The project aims to improve upon a state-of-the-art bird species classifier, trained on bird song data with corresponding species labels, by using deep residual neural networks, multiple-width frequency-delta data augmentation, and metadata fusion.

Setup

$ git clone https://github.com/johnmartinsson/bird-species-classification
$ virtualenv -p /usr/bin/python3.5 venv
$ source venv/bin/activate
(venv)$ pip install -r requirements.txt

# Ubuntu/Linux 64-bit, CPU only, Python 3.5
(venv)$ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.11.0-cp35-cp35m-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled, Python 3.5
# Requires CUDA toolkit 8.0 and CuDNN v5. For other versions, see "Install from sources" in the TensorFlow installation guide.
(venv)$ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.11.0-cp35-cp35m-linux_x86_64.whl

# Install tensorflow
(venv)$ pip3 install --upgrade $TF_BINARY_URL

Usage Instructions

The instructions below describe how to structure the data set and how to run the training, prediction, and evaluation scripts.

Preprocess

First, the recordings need to be downsampled.

$ # Resample to 22050 Hz (run from inside the wav directory)
$ for i in *; do sox "$i" -r 22050 tmp.wav && mv tmp.wav "$i"; done

Second, the signal and noise parts of each recording are extracted and split into three-second segments. The signal segments are placed in separate directories according to the class given in the XML data, and all noise segments are placed in a single noise directory.

$ python preprocess_birdclef.py --xml_dir=<path-to-xml-dir> \
                                --wav_dir=<path-to-wav-dir> \
                                --output_dir=<path-to-output-dir>
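To illustrate the segmentation step, here is a minimal sketch of cutting a sequence of audio samples into three-second segments at 22050 Hz. It is not taken from preprocess_birdclef.py: the function name split_into_segments and its parameters are hypothetical, and zero-padding the final short segment is an assumption about how leftover samples might be handled.

```python
def split_into_segments(samples, sample_rate=22050, segment_seconds=3.0):
    """Split a 1-D sequence of audio samples into fixed-length segments.

    Hypothetical helper for illustration only; the project's own
    preprocessing script may handle this differently.
    """
    segment_length = int(sample_rate * segment_seconds)
    if segment_length <= 0:
        raise ValueError("segment length must be positive")

    segments = []
    for start in range(0, len(samples), segment_length):
        segment = list(samples[start:start + segment_length])
        # Assumption: a short final segment is zero-padded to full length.
        segment.extend([0] * (segment_length - len(segment)))
        segments.append(segment)
    return segments


if __name__ == "__main__":
    # Seven seconds of dummy audio at 22050 Hz gives three segments.
    dummy = [1] * (22050 * 7)
    parts = split_into_segments(dummy)
    print(len(parts), len(parts[0]))
```

Every returned segment has the same length, which is convenient when each segment is later turned into a fixed-size input for a network.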

Lastly, the data is split into a training set and a validation set:

$ python create_dataset.py --src_dir=<path-to-signal-dir> \
                           --dst_dir=<path-to-destination-dir> \
                           --subset_size=<subset-size> \
                           --valid_percentage=<validation-percentage>

where --src_dir points to the signal segments, --dst_dir is the destination directory, --subset_size is an optional argument that restricts the training and validation data to a randomly chosen subset of the whole data set, and --valid_percentage is the percentage of the data used for validation.
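The effect of the subset size and validation percentage can be sketched as follows. This is an illustration under assumptions, not the logic of create_dataset.py: the function split_dataset and its seed parameter are hypothetical, and the sketch splits a flat list of file names without balancing classes, which the real script may or may not do.

```python
import random


def split_dataset(files, valid_percentage, subset_size=None, seed=None):
    """Return (train_files, valid_files) from a list of file names.

    Hypothetical helper for illustration only.
    """
    if not 0 <= valid_percentage <= 100:
        raise ValueError("valid_percentage must be between 0 and 100")

    rng = random.Random(seed)
    files = list(files)
    if subset_size is not None:
        # Draw a random subset (in random order) of the whole data set.
        files = rng.sample(files, min(subset_size, len(files)))
    else:
        rng.shuffle(files)

    # Assumption: the validation count is rounded down.
    n_valid = int(len(files) * valid_percentage / 100.0)
    return files[n_valid:], files[:n_valid]


if __name__ == "__main__":
    names = ["segment_{}.wav".format(i) for i in range(100)]
    train, valid = split_dataset(names, valid_percentage=10, seed=0)
    print(len(train), len(valid))
```

Passing a fixed seed makes the split reproducible, which helps when comparing experiments on the same data.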

Train

$ python train.py --config_file=conf.ini

Run Predictions

$ python run_predictions.py --experiment_path=<path-to-experiment>

Evaluation

$ python evaluate.py --experiment_path=<path-to-results>

Models

Two models are used in this project: a reimplementation of Elias Sprengel's winning solution for the BirdCLEF 2016 challenge, and a Keras implementation of a deep residual neural network.

Libraries

The following libraries are used in this method:

Evaluation Methods

Challenges

This is a collection of bird species classification challenges that have been, or are being, carried out around the world.

BirdCLEF: an audio record-based bird identification task

Solutions and Source Code

Rank 1 BirdCLEF 2016 solution description

Bird Audio Detection Challenge

Bird Audio Detection Challenge,
Survey Paper and Discussion,
Blog Article: Generalization in Bird Audio Detection.

MLSP 2013 Bird Classification Challenge

MLSP 2013 Bird Classification Challenge.

Solutions and Source Code

Original compilation source: xuewei4d

Rank 1 solution code and description by beluga,
Rank 2 solution description by Herbal Candy,
Rank 3 solution description by Anil Thomas,
Rank 4 solution description by Maxim Milakov,
Solution thread.

Applications

This is a collection of applications which use this technology.

Warbler

Warbler.

About

Using convolutional neural networks to build and train a bird species classifier on bird song data with corresponding species labels.

MIT License
