orcasound / orcagsoc

Google Summer of Code projects and products related to Orcasound & orca sounds

Home Page: http://orcasound.net

Project 1: Active Listening & Learning of Orca Sounds

valentina-s opened this issue

Active Listening & Learning of Orca Sounds

This project will combine the power of humans and machines to achieve superior performance in detecting orca calls.

Labeling audio data is a time-consuming and expensive process. With the vast amounts of unlabeled data coming in as real-time streams from ocean observing systems (including Orcasound, NSF’s Ocean Observatories Initiative, and Canadian equivalents), we have a growing opportunity to engage citizen scientists in evaluating the performance of existing machine learning models, correcting model predictions, and annotating novel sounds. Such sounds are not only difficult for the machine to predict, but can also be interesting for the listener to discover and study.

This project will build an active learning tool that visualizes the predictions of a machine learning algorithm on new data and lets the user correct those predictions and annotate more data, which can then supplement the existing machine learning training dataset. The tool will be modular so that it can ingest data from different sources (live audio streams or archived data) and predictions from various machine learning algorithms. A data set and model that are ready to be used as a case study in processing archived data are the products of OrcaCNN, a GSoC project led last summer by Jesse in collaboration with NGOS (Dan, and now Hannah).

Required skills: Python, deep learning, interactive visualization

Bonus skills: interest in design, active learning, web app development

Possible mentors: Valentina, Shima, Scott, Val, Jesse, Hanna, Dan

References and open-source building blocks:

Points to consider in the proposal:

Design of the system:

  • how will the raw data be stored?
  • how will the outputs of the classifier be stored, how will the labels be stored?
  • how will you keep track of the different runs of the algorithm?
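One possible way to approach these storage questions, offered only as a sketch: leave the raw audio untouched and refer to it by URI, and keep model outputs and human labels as separate append-only records tied together by a run identifier. Every name below (`Run`, `Detection`, `Label`, `run_id`, `clip_uri`, and the JSON Lines layout) is a hypothetical choice for illustration, not something taken from Orcasound or OrcaCNN.

```python
import json
import uuid
from dataclasses import dataclass, asdict, field
from typing import Optional


@dataclass
class Run:
    """One execution of a model over a batch of audio (hypothetical schema)."""
    model_name: str
    model_version: str
    run_id: str = field(default_factory=lambda: uuid.uuid4().hex)


@dataclass
class Detection:
    """A single model prediction for a time window of a raw audio clip."""
    run_id: str
    clip_uri: str      # pointer to the raw audio; the audio itself is not copied
    start_s: float
    end_s: float
    confidence: float  # model score, assumed to lie in [0, 1]


@dataclass
class Label:
    """A human annotation, stored separately so model output is never overwritten."""
    clip_uri: str
    start_s: float
    end_s: float
    label: str         # e.g. "call" or "no_call" (assumed label names)
    annotator: str
    run_id: Optional[str] = None  # set when the label reviews a specific run


def to_jsonl(records) -> str:
    """Serialize dataclass records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in records)


def from_jsonl(text: str, cls):
    """Rebuild dataclass instances of type cls from JSON Lines text."""
    return [cls(**json.loads(line)) for line in text.splitlines() if line.strip()]
```

Keeping labels apart from detections means a clip can be re-scored by a newer model without losing earlier human corrections, and the `run_id` field makes it possible to compare runs side by side.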

Visualizations:

  • visualize algorithm performance: what kind of mistakes does the algorithm make?
  • calculate summary statistics from detections/labels over a period of time (number of observations per period)
  • how can a user quickly scan through the results of the algorithm?
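As a starting point for the first two items, here is a minimal sketch that counts detections per day and tallies the kinds of mistakes a binary detector makes against reviewed labels. The function names, the ISO 8601 timestamp format, and the `"call"` label value are assumptions made for this example; the counts it returns could feed whatever plotting library the tool ends up using.

```python
from collections import Counter
from datetime import datetime


def detections_per_day(timestamps):
    """Count detections per calendar day from ISO 8601 timestamp strings."""
    counts = Counter(
        datetime.fromisoformat(ts).date().isoformat() for ts in timestamps
    )
    return dict(sorted(counts.items()))


def confusion_counts(predicted, actual, positive="call"):
    """Tally true/false positives and negatives for a binary detector.

    `predicted` holds model outputs and `actual` holds human-reviewed labels
    for the same clips, in the same order.
    """
    tally = Counter()
    for p, a in zip(predicted, actual):
        if p == positive:
            tally["tp" if a == positive else "fp"] += 1
        else:
            tally["fn" if a == positive else "tn"] += 1
    return {key: tally[key] for key in ("tp", "fp", "fn", "tn")}
```

False positives and false negatives are kept apart because they point to different problems: the former suggest the model is triggered by other sounds, the latter that it misses real calls.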

Getting started (these are suggestions, there is no one way to go):

  • Create a Colab notebook which reads an audio file and allows the user to annotate sounds using the HoloViews annotator (for Python fans)
  • Use the model from this notebook to obtain some detections with low confidence and display them to the user for manual review (using the HoloViews tools). After annotation, store the annotated snapshot in the same format as the training set.
  • Take the example model from the OrcaCNN repo and test whether prediction can be run in the browser using TensorFlow.js. You can start with a .png image of the spectrogram, but here are also some audio examples for inspiration: audio demo, audio tutorial (for JavaScript fans)
  • Look at browser annotator tools such as Audio Annotator and APLOSE, and study which components will be useful for the active learning system and how they can be integrated if needed (for JavaScript fans)
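For the suggestion about surfacing low-confidence detections, one common active learning heuristic is uncertainty sampling: rank detections by how close their score is to the decision threshold and send the closest ones to a human first. The sketch below assumes each detection is a dict with a `confidence` key in [0, 1] and a threshold of 0.5; both the function name and that data layout are hypothetical and would need to match the actual model output.

```python
def select_for_review(detections, k=10, threshold=0.5):
    """Return the k detections whose confidence is closest to the threshold.

    Scores near the threshold are the ones the model is least sure about,
    so a human label on them is likely to be the most informative.
    """
    ranked = sorted(detections, key=lambda d: abs(d["confidence"] - threshold))
    return ranked[:k]


if __name__ == "__main__":
    # Made-up scores, used only to show the ordering.
    sample = [
        {"clip": "a", "confidence": 0.95},
        {"clip": "b", "confidence": 0.52},
        {"clip": "c", "confidence": 0.10},
        {"clip": "d", "confidence": 0.45},
    ]
    for det in select_for_review(sample, k=2):
        print(det["clip"], det["confidence"])
```

Other selection rules (for example, sampling across the full score range to catch confident mistakes) are equally valid; this is only the simplest place to start.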

I'm sharing my draft proposal hoping to receive feedback from the mentors.
https://docs.google.com/document/d/1GN0PSj9fH3UZWBBf7qlegApGGlJGV8NJMyitUAXDt3Q/edit?usp=sharing

Hey mentors and community members, I have shared my draft proposal and would be excited to hear your feedback on it: https://docs.google.com/document/d/1kNVWR6K9uB5MviNr1cASZ3_bkW3xBWWCm-uTOwncyEI/edit?usp=sharing
Thank you.

Hi mentors, here's the link to my proposal draft. Any comments and feedback will be appreciated.
https://docs.google.com/document/d/153Y5l4UtQL3co95tWWizOiUd_JqtS98g92ClNuKPnq0/edit?usp=sharing
Thank you.

Hello and namaste, mentors. Here's the link to my proposal:
https://docs.google.com/document/d/1rIJskrrCy44e9zYIZM-l5mcpAtrRLt0cuQlmqrNUwVk/edit?usp=sharing
I know I am late, but I would greatly appreciate it if you could take some time to review it and provide quick feedback.
Thank you.