rafelafrance / phenobase

Classifiers for the Phenobase project

phenobase

Classifiers for identifying phenology traits on images of herbarium sheets.

There is a lot of effort to digitize and annotate photographs of plants and herbarium specimens. However, until now this effort has been mostly manual, error-prone, and labor-intensive, so only a fraction of these images are fully annotated. This project uses neural networks to automate the annotation of some biologically significant traits related to phenology: flowering, fruiting, leaf-out, etc.

The basic steps are:

  1. Obtain a database of plant images with corresponding annotations.
    1. I'm using data from the iDigBio project to get the URLs of images to download.
      1. Clean the database so that it contains only records with a single angiosperm herbarium sheet that also have phenology annotations.
    2. We can either use the records from above that are pre-identified or have experts annotate the images. The latter is preferable.
  2. Train one or more neural networks to recognize the traits. We are using the PyTorch library to build the neural networks. I am also using models and scripts from Hugging Face.
    1. Because it can be difficult to get a significant number of quality annotations, I'm using masked autoencoders for a pretraining step.
    2. Use the encoding part of the masked autoencoder as a backbone for the actual phenology trait classifier.
  3. Use the trained neural networks to annotate images en masse.
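
The masked-autoencoder pretraining in step 2 relies on hiding a random subset of image patches and training the network to reconstruct them from the visible ones. The sketch below illustrates only that patch-selection step in plain Python, without PyTorch. It is not the project's code: the function name `split_patch_indices`, the 224-pixel image size, the 16-pixel patch size, and the 0.75 mask ratio are all assumptions chosen for illustration.

```python
import random


def split_patch_indices(num_patches, mask_ratio=0.75, seed=None):
    """Randomly split patch indices into (visible, masked) lists.

    Hypothetical helper for illustration only. In masked-autoencoder
    pretraining the encoder sees the visible patches and the decoder
    is trained to reconstruct the masked ones.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked


# Assumed sizes: a 224x224 image cut into 16x16 patches gives a
# 14x14 grid, i.e. 196 patches in total.
patches_per_side = 224 // 16
num_patches = patches_per_side ** 2

visible, masked = split_patch_indices(num_patches, mask_ratio=0.75, seed=0)
print(len(visible), len(masked))  # prints "49 147"
```

Because the labels are not needed for this reconstruction task, pretraining can use every downloaded image, and only the later classifier training depends on the scarcer annotated records.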

Stay tuned

Coming soon!

  • More thrills
  • More spills
  • More explanations of what I'm actually doing here.

Setup

  1. git clone https://github.com/rafelafrance/phenobase.git
  2. cd phenobase
  3. make install

About

License: MIT License

Languages

Python 92.5%, Shell 6.9%, Makefile 0.6%