yanirs / ichthywhat

Experimenting with deep learning for fish ID

Ichthy-what? Fishy photo ID

Experimenting with deep learning for fish identification with Reef Life Survey data.

Setup

Prerequisites: Set up Python 3.10 (e.g., with pyenv) and Poetry.

Set up the Poetry environment:

$ poetry install

Install pre-commit hooks:

$ poetry run pre-commit install

Alternatively, install Vagrant and run everything in a virtual machine:

$ vagrant up

As a shortcut for running various development servers on the Vagrant machine, you can run:

$ vagrant provision --provision-with run-servers

Fish ID Streamlit app

Run via streamlit in local development mode, which reruns on save, uses local species images, and exposes beta features. The ichthywhat.localhost address is needed for the Mapbox API to work, so a mapping for it should exist in /etc/hosts:

$ poetry run streamlit run --browser.serverAddress ichthywhat.localhost --server.runOnSave true ichthywhat/app.py -- dev /path/to/img/root
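Before launching in development mode, it may help to confirm that the hosts mapping is in place. The following is a minimal sketch and not part of the project: the helper name hosts_file_maps is hypothetical, and the suggested 127.0.0.1 address is an assumption (the README only says that a mapping should exist).

```python
from pathlib import Path


def hosts_file_maps(hosts_text: str, hostname: str) -> bool:
    """Return True if any non-comment line of hosts_text lists hostname."""
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into the address and its names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and hostname in fields[1:]:
            return True
    return False


if __name__ == "__main__":
    hosts_path = Path("/etc/hosts")
    text = hosts_path.read_text() if hosts_path.exists() else ""
    if hosts_file_maps(text, "ichthywhat.localhost"):
        print("ichthywhat.localhost is mapped")
    else:
        # Assumption: the app runs on this machine, so loopback is the target.
        print("Missing mapping; consider adding: 127.0.0.1 ichthywhat.localhost")
```

The check only inspects the file contents, so it does not verify that the name actually resolves through the system resolver.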

Run via streamlit in production mode:

$ poetry run streamlit run ichthywhat/app.py

Build a new model by running the code in notebooks/03-app.ipynb.

Fish ID classification API

The Vagrant machine exposes a simple classification API. With the machine running, call:

$ curl -X POST -F "img_file=@image-file-path.jpg" http://localhost:9300/predict | jq

Alternatively, visit http://localhost:9300/demo for a simple demo page.
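The same request can be made from Python. The sketch below uses only the standard library and mirrors the curl call above: it posts the image as a multipart form field named img_file to /predict. The function names build_multipart and predict are hypothetical, and the response is assumed to be JSON only because the curl example pipes it to jq; its fields are not documented here, so the code prints it as-is.

```python
import json
import mimetypes
import sys
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field: str, filename: str, payload: bytes) -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; return (body, content type)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def predict(image_path: str, url: str = "http://localhost:9300/predict"):
    """POST an image to the classification API and return the decoded JSON."""
    path = Path(image_path)
    body, content_type = build_multipart("img_file", path.name, path.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Usage: python predict_client.py image-file-path.jpg
    if len(sys.argv) == 2:
        print(json.dumps(predict(sys.argv[1]), indent=2))
```

Building the body by hand avoids a third-party dependency; with an HTTP client library already installed, its file-upload support would be the shorter route.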

The API is also packaged as a container image, defined in the Dockerfile, which can be built on the Vagrant machine:

$ podman build -t ichthywhat .

...and run with the default port exposed to the host:

$ podman run --env UVICORN_HOST=0.0.0.0 -p 8000:8000 localhost/ichthywhat:latest

...and exported elsewhere:

$ podman save localhost/ichthywhat:latest | gzip > ichthywhat-img.tar.gz

...then on another machine that has Docker, perhaps with a local proxy:

$ docker load --input ichthywhat-img.tar.gz
$ docker run --env UVICORN_HOST=0.0.0.0 -p 127.0.0.1:8000:8000 localhost/ichthywhat:latest

See the ARG and ENV calls in the Dockerfile for customisation options.
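After starting the container, the server may take a moment to accept connections. The following is a minimal sketch, not part of the project, that polls the published port until it is reachable. The helper name wait_for_port and the timeout values are hypothetical; the host and port match the docker run example above.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)
    return False


if __name__ == "__main__":
    reachable = wait_for_port("127.0.0.1", 8000, timeout=3.0)
    print("API reachable" if reachable else "API not reachable")
```

A successful TCP connection only shows that the port is open; it does not confirm that the model has loaded or that /predict responds.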

Jupyter notebooks used for experimentation and model building

$ poetry run jupyter notebook

Experiment monitoring

Use MLflow:

$ poetry run mlflow ui --backend-store-uri sqlite:///mlruns.db

Command line interface

Create an RLS species dataset (legacy – from local files):

$ poetry run ichthywhat create-rls-species-dataset-from-local \
    --m1-csv-path ~/projects/fish-id/data/dump-20210717/m1.csv \
    --image-dir ~/projects/yanirs.github.io/tools/rls/img \
    --output-dir data/rls-species-25-min-images-3/ \
    --num-species 25 \
    --min-images-per-species 3

Create an RLS species dataset (new – from API):

$ poetry run ichthywhat create-rls-species-dataset-from-api \
    --output-dir data/rls-species-m1/

Create an RLS genus dataset:

$ poetry run ichthywhat create-rls-genus-dataset \
    --image-dir ~/projects/yanirs.github.io/tools/rls/img \
    --output-dir data/rls-top-5-genera \
    --num-top-genera 5

Create a test dataset from a trip directory:

$ poetry run ichthywhat create-test-dataset \
    --trip-dir ~/Pictures/202010\ Eviota\ GBR \
    --output-dir data/eviota-202010
