naruepanart / subject-assistant

A Machine Learning-assisted web app that helps wildlife researchers process Zooniverse Subjects before they're viewed by human volunteers. Also includes a Proxy Server that allows the front end app to bypass CORS issues and maintain secrets.

Home Page: https://subject-assistant.zooniverse.org/

Zooniverse ML Subject Assistant

In short: Machine Learning-assisted web app for processing Zooniverse Subjects.

In long: the Subject Assistant aims to give owners of wildlife camera trap projects an optional Machine Learning-assisted (ML) step in the Subject upload pipeline. Using external ML services, project owners can, for example, identify wildlife in photos before passing the difficult ones to volunteers.

https://subject-assistant.zooniverse.org/

  • The Subject Assistant is just the front-end that easily allows Zooniverse project owners to submit their Zooniverse Subjects to certain ML services, and pull the results for further processing.
  • The ML services are external to this project.
  • This repo also contains the Proxy Server, which allows the Subject Assistant (on a *.zooniverse.org domain) to download data from non-Zooniverse domains (i.e. the external ML services), without running into CORS errors.
  • This repo is closely related to Hamlet, which is what actually uploads Subjects to external ML services. (It's a multi-step process that can probably be optimised, but for now it works.)

2021/22 Local Development Notes

The current code is optimised for deployment, so some workarounds are required to get the Subject Assistant (and the Proxy Server) working on localhost.

  • Since https://hamlet-staging.zooniverse.org/ points to production and doesn't have a staging equivalent (despite its name!), local development also points to production (!!!)
  • npm start now sets ENV=production
  • The Zooniverse OAuth app now allows localhost as a return URL. (This should be enabled/disabled as necessary!)
  • On local, the Subject Assistant runs on HTTPS (for auth security) but the Proxy Server runs on HTTP (because there's no easy self-hosted SSL solution for Node.js scripts, AFAIK). To allow mixed content, local testing must be done on localhost:3000 (and localhost:3666), not the usual alias of local.zooniverse:3000. This is because Chrome & Firefox are much more forgiving of mixed content on localhost than on other domains.

Usage

Intended Users:

  • Zooniverse Project Owners.

Intended Purpose:

  • This web app is an experiment to see if Machine Learning systems can improve the quality of Subjects uploaded by science teams to the Zooniverse platform.

Intended Usage:

  • The web app should allow Zooniverse project owners to process their Zooniverse Subjects (of wildlife camera trap images) through a Machine Learning (ML) service.
  • These Subjects are then tagged with ML-derived metadata.
  • The Subjects (or a user-selected subset) + their ML-data can then be sent to various endpoints: for example, to a "fast retirement" Zooniverse workflow, or exported as a CSV for further external processing.
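
The "select a subset, then export as CSV" step above can be pictured with a minimal sketch. Everything here is an assumption for illustration: the record shape, the field names (id, mlLabel, mlConfidence), the column names, and the helper functions are hypothetical and are not taken from the web app's code.

```javascript
// Hypothetical Subject records; the field names are illustrative only.
const subjects = [
  { id: '1001', mlLabel: 'animal', mlConfidence: 0.97 },
  { id: '1002', mlLabel: 'empty', mlConfidence: 0.42 },
  { id: '1003', mlLabel: 'animal, "blurry"', mlConfidence: 0.88 },
];

// Quote a value for CSV: wrap it in double quotes and double any embedded quotes.
function csvCell(value) {
  return '"' + String(value).replace(/"/g, '""') + '"';
}

// Keep only Subjects at or above a confidence threshold, then emit CSV text.
function subjectsToCsv(list, minConfidence) {
  const header = ['subject_id', 'ml_label', 'ml_confidence'];
  const rows = list
    .filter((s) => s.mlConfidence >= minConfidence)
    .map((s) => [s.id, s.mlLabel, s.mlConfidence]);
  return [header, ...rows].map((row) => row.map(csvCell).join(',')).join('\n');
}

console.log(subjectsToCsv(subjects, 0.8));
```

Quoting every cell keeps labels that contain commas or quotes from breaking the column layout when the file is opened in other tools.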

Requires:

  • a modern web browser (e.g. Chrome 75+, Firefox 67+) and an Internet connection
  • a familiarity with the Zooniverse crowdsourced research platform
  • preferably, a Zooniverse project that has already been set up with Subjects featuring images of animals from camera traps.

How to Use:

  • Instructions are on the web app.

Dev Notes

Project Type:

  • HTML/JavaScript website/web app
  • plus simple node proxy server

Intended Developers:

  • Web developers (HTML/JS) who are sorta familiar with the Zooniverse dev environment.

Requires:

  • npm - the Node Package Manager, usually installed together with Node

Project Overview:

  • The Front End app...
    • is the main user-facing app.
    • requires the Proxy Server in a live production environment to overcome CORS security issues.
    • has its code stored in /src
    • is hosted on GitHub Pages
    • has a custom domain of http://subject-assistant.zooniverse.org/
    • can be run locally by running npm run start
    • is built by running npm run build and auto-deployed on GitHub Pages as soon as changes are merged to master.
  • The Proxy Server...
    • exists to pass information between the front end and the ML servers, to bypass the CORS security issues that prevent data fetches between different domains.
    • is purely server-side, and is used to hide secrets from the user-facing front end.
    • has its code stored in /server
    • is auto-deployed to our Kubernetes systems (via Jenkins, presumably) as soon as changes are merged to master
    • has a lot of implicit auto-deploy code set up in the /kubernetes directory
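
To illustrate the general idea behind the Proxy Server (not the actual code in /server), here is a minimal sketch of a CORS-aware proxy in Node. The allow-lists, the ?url= query parameter, and the function names are all hypothetical; the real server reads its accepted origins and targets from the ORIGINS and TARGETS ENV values. The sketch assumes Node 18+ for the global fetch().

```javascript
const http = require('http');

// Hypothetical allow-lists; the real project would take these from ORIGINS and TARGETS.
const ALLOWED_ORIGINS = ['https://subject-assistant.zooniverse.org'];
const ALLOWED_TARGETS = ['http://example.com/'];

// Build the CORS response headers for a request origin, or return null if it is not accepted.
function corsHeadersFor(origin) {
  if (!ALLOWED_ORIGINS.includes(origin)) return null;
  return {
    'Access-Control-Allow-Origin': origin,
    'Vary': 'Origin',
  };
}

// Create (but do not start) a server that fetches ?url=... on the browser's behalf.
// The ?url= parameter is an assumption made for this sketch.
function createProxyServer() {
  return http.createServer(async (req, res) => {
    const headers = corsHeadersFor(req.headers.origin);
    const target = new URL(req.url, 'http://localhost').searchParams.get('url') || '';
    const targetOk = ALLOWED_TARGETS.some((prefix) => target.startsWith(prefix));
    if (!headers || !targetOk) {
      res.writeHead(403);
      res.end('Forbidden');
      return;
    }
    try {
      // The server-to-server request is not subject to the browser's CORS checks.
      const upstream = await fetch(target);
      const body = await upstream.text();
      res.writeHead(upstream.status, headers);
      res.end(body);
    } catch (err) {
      res.writeHead(502, headers);
      res.end('Upstream request failed');
    }
  });
}
```

Calling createProxyServer().listen(3666) would start the sketch on the same port the local setup uses. Because the browser only ever talks to the proxy, any secrets needed for the upstream request can stay on the server side.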

NOTE:

  • By default, the Front End looks for the Proxy Server at https://subject-assistant-proxy.zooniverse.org/
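
How the Front End might pick its proxy URL can be sketched as below. The function name, the proxyHost config key, and the precedence order (in-app config first, then the PROXY_HOST ENV value, then the default) are assumptions for illustration, not the app's actual code.

```javascript
// Default taken from the note above.
const DEFAULT_PROXY_HOST = 'https://subject-assistant-proxy.zooniverse.org/';

// Hypothetical resolver. Assumed precedence: in-app config, then PROXY_HOST, then the default.
function resolveProxyHost(appConfig = {}, env = {}) {
  return appConfig.proxyHost || env.PROXY_HOST || DEFAULT_PROXY_HOST;
}

console.log(resolveProxyHost());
console.log(resolveProxyHost({}, { PROXY_HOST: 'http://localhost:3666/' }));
console.log(resolveProxyHost({ proxyHost: 'http://localhost:3666/' }, {}));
```

For local development, this is the kind of override that pointing the config page at http://localhost:3666 would achieve.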

How to Setup:

  • Clone this GitHub repo onto your computer.
  • Open your favourite command line interface (CLI) such as bash.
  • Navigate to this project's directory.
  • Run npm install to install all the dependencies for this project.
  • Run npm start to start the front end web app
  • Run npm run proxy-server to start the proxy server on http://localhost:3666
  • Visit the web app http://localhost:3000
  • Configure the web app (via http://localhost:3000/#/config) to find the proxy server

How to Deploy:

  • Create a branch and open a PR in this repo
  • Make your changes, then run npm run build to update the production code for the Front End
  • Update the PR and merge
  • Changes will be auto-deployed

External dependencies:

  • GitHub Pages for hosting Front End app
  • Zooniverse Kubernetes system for hosting Proxy Server
  • Both set up to use *.zooniverse.org domain names.

Environmental (ENV) Config Values:

  • ORIGINS: acceptable Zooniverse domains, which the Proxy Server accepts requests from. e.g. ORIGINS=https://subject-assistant.zooniverse.org/
  • TARGETS: acceptable external domains/URLs, which the Proxy Server will send requests to. e.g. TARGETS=http://example.com/;http://www.example.com/
  • URL_FOR_MSML: URL for the Microsoft Megadetector ML service. Used by the Proxy Server.
    • Note: as of 2022, the Megadetector ML service is now being hosted on the Zooniverse. The following vars supersede URL_FOR_MSML:
      • CAMERA_TRAPS_API_SERVICE_HOST: hostname for the Zooniverse-hosted Megadetector ML service.
      • CAMERA_TRAPS_API_SERVICE_PATH: path of the Zooniverse-hosted Megadetector ML service.
  • PROXY_HOST: URL of the Proxy Server. Used by the Subject Assistant to find the proxy. Can be overwritten via the Subject Assistant's in-app config.
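
As a minimal sketch of how a list-style value such as TARGETS could be interpreted: the semicolon separator is inferred from the TARGETS example above, while the helper names and the prefix-matching rule are assumptions, not the Proxy Server's actual logic.

```javascript
// Hypothetical helpers; not taken from /server.
// Split a semicolon-separated ENV value into a list of trimmed, non-empty entries.
function parseList(value) {
  return (value || '')
    .split(';')
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

// Assumption: a URL is acceptable when it starts with one of the listed prefixes.
function isAllowed(url, allowList) {
  return allowList.some((prefix) => url.startsWith(prefix));
}

const targets = parseList('http://example.com/;http://www.example.com/');
console.log(targets);
console.log(isAllowed('http://example.com/results.json', targets));
console.log(isAllowed('http://evil.example.org/', targets));
```

In a real deployment the input would come from process.env.TARGETS rather than a literal string. Treating an unset variable as an empty list means nothing is allowed by default.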


Languages

JavaScript 88.7%, SCSS 8.0%, HTML 2.9%, Dockerfile 0.4%