Project 1: Active Listening & Learning of Orca Sounds
valentina-s opened this issue
This project will combine the power of humans and machines to achieve superior performance in detecting orca calls.
Labeling audio data is a time-consuming and expensive process. With the vast amounts of unlabeled data coming in real-time streams from ocean observing systems (including Orcasound, NSF’s Ocean Observing Initiative, and Canadian equivalents), we have a growing opportunity to engage citizen scientists in evaluating the performance of existing machine learning models, correcting model predictions, and annotating novel sounds which are not only difficult for the machine to predict, but can also be interesting for the listener to discover and study.
This project will build an active learning tool which can visualize the predictions of a machine learning algorithm on new data and allow input from the user to correct and annotate more data, which can then supplement the existing machine learning training dataset. The tool will be modular so that it can ingest data from different sources (live audio data streams or archived data) and predictions from various machine learning algorithms. One dataset and model ready to be used as a case study in processing archived data comes from OrcaCNN, a GSoC project led last summer by Jesse in collaboration with NGOS (Dan, and now Hannah).
Required skills: Python, deep learning, interactive visualization
Bonus skills: interest in design, active learning, web app development
Possible mentors: Valentina, Shima, Scott, Val, Jesse, Hanna, Dan
References and open-source building blocks:
- Pod.Cast (2019 Microsoft effort; data archive)
  - built on the audio-annotator GitHub repository
- Orcasound Trello card about APLOSE (including screengrabs)
  - NOTE: using this building block would require JavaScript knowledge
- Annotations in Python:
  - http://build.holoviews.org/user_guide/Annotators.html
  - Altair library, a Python library for visualizations
- Annotations & audio data visualization in Python: https://github.com/Parisson/TimeSide
- Active Learning for Audio Data: https://ieeexplore.ieee.org/document/8683063
- Existing classifiers:
  - OrcaCNN: https://github.com/axiom-data-science/OrcaCNN/tree/master/Detection/Labelling (Alaska orcas, preliminary)
  - Spectral Peaks: https://github.com/orcasound/orcadata/tree/master/WAV_FLACprocessor_Val/ProcessLocalFilesForCalls
- Other active learning repositories:
  - https://github.com/google/active-learning (thanks to Durgesh Kumar Singh for discovering this repo!)
  - https://github.com/janfreyberg/superintendent (thanks to Nikhil Rana for discovering this repo!)
  - Libact (Python) (thanks to Durgesh Kumar Singh for discovering this resource!)
  - modAL (active learning framework for Python 3) (thanks to Durgesh Kumar Singh for discovering this resource!)
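The core idea behind these active learning frameworks is uncertainty sampling: ask the human to label the clips the model is least sure about. A minimal sketch in plain Python (the pool of detector scores below is made up for illustration; modAL and the other libraries provide richer query strategies):

```python
def least_confident(predictions, k=3):
    """Return indices of the k pool items whose top-class probability
    is lowest, i.e. where the model is least certain."""
    confidences = [(max(p), i) for i, p in enumerate(predictions)]
    confidences.sort()
    return [i for _, i in confidences[:k]]

# Toy pool of (P(call), P(no-call)) scores from a hypothetical detector.
pool = [(0.95, 0.05), (0.55, 0.45), (0.80, 0.20), (0.51, 0.49)]
print(least_confident(pool, k=2))  # → [3, 1], the scores closest to 0.5
```

After a human labels the queried clips, they are added to the training set and the model is retrained, closing the active learning loop.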
Points to consider in the proposal:
Design of the system:
- how will the raw data be stored?
- how will the outputs of the classifier be stored, how will the labels be stored?
- how will you keep track of the different runs of the algorithm?
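As one hedged sketch of an answer to the storage questions above (all field names and paths are illustrative, not a prescribed schema): annotations could be kept as append-only JSON Lines records that point back at the raw audio and at the classifier run that produced the prediction, so different runs remain distinguishable:

```python
import json
import time

# One hypothetical record per reviewed clip: a pointer to the raw audio
# (which stays in object storage), the classifier run that produced the
# prediction, and the human label that corrects it.
record = {
    "audio_uri": "s3://bucket/orcasound/2020/03/clip_0042.wav",
    "run_id": "orcacnn-v0.1-2020-03-15",   # identifies the classifier run
    "model_score": 0.62,                   # classifier output for this clip
    "human_label": "call",                 # annotator's correction
    "labeled_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
}

# Append-only JSON Lines file: one line per annotation, easy to diff,
# merge across annotators, and replay into a training set.
with open("annotations.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")
```

An append-only log like this also keeps a full audit trail: re-labeling a clip adds a new record rather than overwriting the old one.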
Visualizations:
- visualize algorithm performance: what kind of mistakes does the algorithm make?
- calculate summary statistics from detections/labels over a period of time (# observations per period of time)
- how can a user quickly scan through the results of the algorithm?
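For the summary-statistics point, counting detections per hour needs only the standard library; the detection timestamps below are made up for illustration:

```python
from collections import Counter
from datetime import datetime

# Hypothetical detection timestamps emitted by the classifier.
detections = [
    "2020-03-15T02:10:00", "2020-03-15T02:45:00",
    "2020-03-15T03:05:00", "2020-03-16T02:30:00",
]

# Bucket each detection into its hour and count.
per_hour = Counter(
    datetime.fromisoformat(t).strftime("%Y-%m-%d %H:00") for t in detections
)
for hour, n in sorted(per_hour.items()):
    print(hour, n)
# 2020-03-15 02:00 2
# 2020-03-15 03:00 1
# 2020-03-16 02:00 1
```

The same tallies, plotted over days or weeks, would let a user spot periods of high call activity at a glance.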
Getting started (these are suggestions; there is no one right way):
- Create a Colab notebook that reads an audio file and lets the user annotate sounds using the holoviews annotator (for Python fans)
- Use the model from this notebook to obtain some detections with low confidence and display them to the user for manual review (using the holoviews tools). After annotation, store the annotated snapshot in the same format as the training set.
- Take the example model from the OrcaCNN repo and test whether prediction can be run in the browser using TensorFlow.js. You can start with a .png image of the spectrogram, but here are also some audio examples for inspiration: audio demo, audio tutorial (for JavaScript fans)
- Look at browser annotator tools such as Audio Annotator and APLOSE, and study which components will be useful for the active learning system and how they can be integrated if needed (for JavaScript fans)
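Whichever annotator is used, the image the user labels is a spectrogram of the audio. A minimal NumPy sketch of a short-time Fourier transform on a synthetic tone (in the real tool the signal would come from an Orcasound clip, and a library such as holoviews would render the array):

```python
import numpy as np

def spectrogram(signal, fs, n_fft=256, hop=128):
    """Magnitude spectrogram via a short-time Fourier transform:
    the 2-D image an annotator tool would display for labeling."""
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    mags = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    return mags.T, freqs  # shape: (n_freq_bins, n_frames)

# One second of a synthetic 500 Hz tone standing in for a hydrophone clip.
fs = 8000
t = np.arange(fs) / fs
spec, freqs = spectrogram(np.sin(2 * np.pi * 500 * t), fs)
print(spec.shape, freqs[spec[:, 0].argmax()])  # peak bin lands at 500.0 Hz
```

For real hydrophone data the magnitudes would typically be shown on a log (dB) scale, since call energy spans several orders of magnitude.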
Hi Mentors & Community Members,
Please find the link to my GSoC proposal below.
https://docs.google.com/document/d/1oSDan6jUgGKNFbOScbWqvO3iD_bV0vrxv-exvJ78Bj0/edit#heading=h.4aqsbyic49kz
I'm sharing my draft proposal hoping to receive feedback from the mentors.
https://docs.google.com/document/d/1GN0PSj9fH3UZWBBf7qlegApGGlJGV8NJMyitUAXDt3Q/edit?usp=sharing
Hey mentors and community members, I have shared my draft proposal. I would be excited to hear your feedback: https://docs.google.com/document/d/1kNVWR6K9uB5MviNr1cASZ3_bkW3xBWWCm-uTOwncyEI/edit?usp=sharing
Thank you
Hi mentors, here's the link to my proposal draft. Any comments and feedback will be appreciated.
https://docs.google.com/document/d/153Y5l4UtQL3co95tWWizOiUd_JqtS98g92ClNuKPnq0/edit?usp=sharing
Thank you.
Hello and namaste, mentors! Here's the link to my proposal.
https://docs.google.com/document/d/1rIJskrrCy44e9zYIZM-l5mcpAtrRLt0cuQlmqrNUwVk/edit?usp=sharing
I know I am late, but I would be highly obliged if you could spare some time to review it and provide quick feedback.
Thank you