ghunkins / Binaural-Source-Localization-CNN

A Deep Convolutional Neural Network (DCNN) designed for the task of localizing human speech to 168 location classes using binaural microphone inputs.

Binaural-Source-Localization-CNN

Basic Information


Author: Gregory Hunkins

Organization: University of Rochester

License: MIT

Abstract: A Convolutional Neural Network (CNN) classification system was designed for source localization of human voices in 3-D space. A new dataset, VoiceBin100K, is introduced to support this task and future work in the field. The CNN takes variable-length binaural short-time Fourier transform (STFT) magnitude and phase features as input and predicts the location of the speaker’s voice as one of 168 location classes.
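
To make the input representation in the abstract concrete, here is a minimal sketch of how binaural STFT magnitude and phase features could be computed with NumPy. It is not taken from this repository: the function names, the 256-sample frame, the 128-sample hop, the Hann window, the 16 kHz sample rate, and the channel ordering are all assumptions chosen for illustration.

```python
import numpy as np


def stft(signal, frame_len=256, hop=128):
    """Return a complex STFT of shape (n_frames, frame_len // 2 + 1).

    Assumes len(signal) >= frame_len. Frame and hop sizes are illustrative.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([
        signal[i * hop:i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.fft.rfft(frames, axis=1)


def binaural_features(left, right, frame_len=256, hop=128):
    """Stack magnitude and phase of both ears into (4, n_frames, n_bins).

    Assumed channel order: left magnitude, left phase,
    right magnitude, right phase.
    """
    channels = []
    for ear in (left, right):
        spec = stft(ear, frame_len, hop)
        channels.append(np.abs(spec))    # magnitude
        channels.append(np.angle(spec))  # phase in radians, within [-pi, pi]
    return np.stack(channels)


# Synthetic stand-in for a recording: a 440 Hz tone that is shifted
# by 8 samples in the right channel.
sr = 16000
t = np.arange(sr) / sr
left = np.sin(2 * np.pi * 440 * t)
right = np.roll(left, 8)

features = binaural_features(left, right)
print(features.shape)  # (4, 124, 129)
```

The number of frames grows with the length of the recording, which is what makes the features variable-length; only the frequency axis (129 bins here) is fixed by the frame size.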

Running The Code


Reference: https://cs.rochester.edu/~cxu22/t/577F17/bluehive_tutorial.html

Data


Please contact ghunkins@u.rochester.edu for access to the data. A public link will be available shortly.


Languages

Python 98.1%, Shell 1.9%