Project 1: Active Listening & Learning of Orca Sounds
valentina-s opened this issue
This project will combine the power of humans and machines to achieve superior performance in detecting orca calls.
Labeling audio data is a time-consuming and expensive process. With the vast amounts of unlabeled data coming in real-time streams from ocean observing systems (including Orcasound, NSF’s Ocean Observing Initiative, and Canadian equivalents), we have a growing opportunity to engage citizen scientists in evaluating the performance of existing machine learning models, correcting model predictions, and annotating novel sounds which are not only difficult for the machine to predict, but can also be interesting for the listener to discover and study.
This project will build an active learning tool which can visualize the predictions of a machine learning algorithm on new data and allow input from the user to correct and annotate more data, which can then supplement the existing machine learning training dataset. The tool will be modular so that it can ingest data from different sources (live audio data streams or archived data) and predictions from various machine learning algorithms. One dataset and model ready to be used as a case study in processing archived data comes from OrcaCNN, a GSoC project led last summer by Jesse in collaboration with NGOS (Dan, and now Hannah).
Required skills: Python, deep learning, interactive visualization
Bonus skills: interest in design, active learning, web app development
Possible mentors: Valentina, Shima, Scott, Val, Jesse, Hanna, Dan
References and open-source building blocks:
- Pod.Cast (2019 Microsoft effort; data archive)
  - built on the audio-annotator GitHub repository
- Orcasound Trello card about APLOSE (including screengrabs)
  - NOTE: using this building block would require JavaScript knowledge
- Annotations in Python:
  - http://build.holoviews.org/user_guide/Annotators.html
  - Altair library, a Python library for visualizations
- Annotations & audio data visualization in Python: https://github.com/Parisson/TimeSide
- Active Learning for Audio Data: https://ieeexplore.ieee.org/document/8683063
- Existing classifiers:
  - OrcaCNN: https://github.com/axiom-data-science/OrcaCNN/tree/master/Detection/Labelling (Alaska orcas, preliminary)
  - Spectral Peaks: https://github.com/orcasound/orcadata/tree/master/WAV_FLACprocessor_Val/ProcessLocalFilesForCalls
- Other active learning repositories:
  - https://github.com/google/active-learning (thanks to Durgesh Kumar Singh for discovering this repo!)
  - https://github.com/janfreyberg/superintendent (thanks to Nikhil Rana for discovering this repo!)
  - Libact (Python) (thanks to Durgesh Kumar Singh for discovering this resource!)
  - modAL (active learning framework for Python 3) (thanks to Durgesh Kumar Singh for discovering this resource!)
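The core idea behind these active learning frameworks is uncertainty sampling: ask the human to label the clips the model is least sure about. A minimal sketch in plain Python (the pool of detector scores below is made up for illustration; modAL and the other libraries provide richer query strategies):

```python
def least_confident(predictions, k=3):
    """Return indices of the k pool items whose top-class probability
    is lowest, i.e. where the model is least certain."""
    confidences = [(max(p), i) for i, p in enumerate(predictions)]
    confidences.sort()
    return [i for _, i in confidences[:k]]

# Toy pool of (P(call), P(no-call)) scores from a hypothetical detector.
pool = [(0.95, 0.05), (0.55, 0.45), (0.80, 0.20), (0.51, 0.49)]
print(least_confident(pool, k=2))  # → [3, 1], the scores closest to 0.5
```

After a human labels the queried clips, they are added to the training set and the model is retrained, closing the active learning loop.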
Points to consider in the proposal:
Design of the system:
- how will the raw data be stored?
- how will the outputs of the classifier be stored, how will the labels be stored?
- how will you keep track of the different runs of the algorithm?
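As one hedged sketch of an answer to the storage questions above (all field names and paths are illustrative, not a prescribed schema): annotations could be kept as append-only JSON Lines records that point back at the raw audio and at the classifier run that produced the prediction, so different runs remain distinguishable:

```python
import json
import time

# One hypothetical record per reviewed clip: a pointer to the raw audio
# (which stays in object storage), the classifier run that produced the
# prediction, and the human label that corrects it.
record = {
    "audio_uri": "s3://bucket/orcasound/2020/03/clip_0042.wav",
    "run_id": "orcacnn-v0.1-2020-03-15",   # identifies the classifier run
    "model_score": 0.62,                   # classifier output for this clip
    "human_label": "call",                 # annotator's correction
    "labeled_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
}

# Append-only JSON Lines file: one line per annotation, easy to diff,
# merge across annotators, and replay into a training set.
with open("annotations.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")
```

An append-only log like this also keeps a full audit trail: re-labeling a clip adds a new record rather than overwriting the old one.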
Visualizations:
- visualize algorithm performance: what kind of mistakes does the algorithm make?
- calculate summary statistics from detections/labels over a period of time (# observations per period of time)
- how can a user quickly scan through the results of the algorithm?
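For the summary-statistics point, counting detections per hour needs only the standard library; the detection timestamps below are made up for illustration:

```python
from collections import Counter
from datetime import datetime

# Hypothetical detection timestamps emitted by the classifier.
detections = [
    "2020-03-15T02:10:00", "2020-03-15T02:45:00",
    "2020-03-15T03:05:00", "2020-03-16T02:30:00",
]

# Bucket each detection into its hour and count.
per_hour = Counter(
    datetime.fromisoformat(t).strftime("%Y-%m-%d %H:00") for t in detections
)
for hour, n in sorted(per_hour.items()):
    print(hour, n)
# 2020-03-15 02:00 2
# 2020-03-15 03:00 1
# 2020-03-16 02:00 1
```

The same tallies, plotted over days or weeks, would let a user spot periods of high call activity at a glance.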
Getting started (these are suggestions; there is no one right way):
- Create a Colab notebook that reads an audio file and lets the user annotate sounds using the holoviews annotator (for Python fans)
- Use the model from this notebook to obtain some detections with low confidence and display them to the user for manual review (using the holoviews tools). After annotation, store the annotated snapshot in the same format as the training set.
- Take the example model from the OrcaCNN repo and test whether prediction can be run in the browser using TensorFlow.js. You can start with a .png image of the spectrogram, but here are also some audio examples for inspiration: audio demo, audio tutorial (for JavaScript fans)
- Look at browser annotator tools such as Audio Annotator and APLOSE, and study which components will be useful for the active learning system and how they can be integrated if needed (for JavaScript fans)
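Whichever annotator is used, the image the user labels is a spectrogram of the audio. A minimal NumPy sketch of a short-time Fourier transform on a synthetic tone (in the real tool the signal would come from an Orcasound clip, and a library such as holoviews would render the array):

```python
import numpy as np

def spectrogram(signal, fs, n_fft=256, hop=128):
    """Magnitude spectrogram via a short-time Fourier transform:
    the 2-D image an annotator tool would display for labeling."""
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    mags = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    return mags.T, freqs  # shape: (n_freq_bins, n_frames)

# One second of a synthetic 500 Hz tone standing in for a hydrophone clip.
fs = 8000
t = np.arange(fs) / fs
spec, freqs = spectrogram(np.sin(2 * np.pi * 500 * t), fs)
print(spec.shape, freqs[spec[:, 0].argmax()])  # peak bin lands at 500.0 Hz
```

For real hydrophone data the magnitudes would typically be shown on a log (dB) scale, since call energy spans several orders of magnitude.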
Hi Mentors & Community Members,
Please find the link to my GSoC proposal below.
https://docs.google.com/document/d/1oSDan6jUgGKNFbOScbWqvO3iD_bV0vrxv-exvJ78Bj0/edit#heading=h.4aqsbyic49kz
I'm sharing my draft proposal hoping to receive feedback from the mentors.
https://docs.google.com/document/d/1GN0PSj9fH3UZWBBf7qlegApGGlJGV8NJMyitUAXDt3Q/edit?usp=sharing
Hey mentors and community members, I have shared my draft proposal. I would be excited to hear your feedback: https://docs.google.com/document/d/1kNVWR6K9uB5MviNr1cASZ3_bkW3xBWWCm-uTOwncyEI/edit?usp=sharing
Thank you
Hi mentors, here's the link to my proposal draft. Any comments and feedback will be appreciated.
https://docs.google.com/document/d/153Y5l4UtQL3co95tWWizOiUd_JqtS98g92ClNuKPnq0/edit?usp=sharing
Thank you.
Hello and namaste, mentors! Here's the link to my proposal.
https://docs.google.com/document/d/1rIJskrrCy44e9zYIZM-l5mcpAtrRLt0cuQlmqrNUwVk/edit?usp=sharing
I know I am late, but I would be highly obliged if you could spare some time to review it and provide quick feedback.
Thank you