Neloy-Barman / Interpretable-Bengali-Fish-Recognizer

Collecting fish image data, training classifiers, and applying Grad-CAM to interpret predictions

Home Page: https://neloy-barman.github.io/Interpretable-Bengali-Fish-Recognizer/


An Interpretable Bengali Fish Recognizer

Project Development Journal

Problem Statement

A Bengali fish image recognizer that can classify among the following categories:

Ayre Catla Chital Ilish
Kachki Kajoli Koi Magur
Mola Dhela Mrigal Pabda Pangash
Poa Puti Rui Shing
Silver Carp Taki Telapia Tengra

Objective

There are many kinds of fish sold in Bangladeshi fish markets, each with its own characteristics. A few of them sometimes look alike and can only be differentiated by tiny details. For example, a Catla's head is larger than a Rui's, while their other features are almost the same. Ayre and Pangash have the same body structure, but an Ayre has barbels whereas a Pangash does not fall into the catfish category. There are many fishes like that, and the main goal of this recognizer project is to distinguish between them.

Data Collection

A bit of exploration showed me that searching for a fish by its scientific name gives better results and more accurate images. So I mapped each fish's Bengali and scientific names in a dictionary. Then, using fastAI's DuckDuckGo search and looping over the dictionary, I collected images for each category and kept them in their respective folders. You will find the whole procedure in the "data_collection" notebook.
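The mapping and collection loop can be sketched as follows. The dictionary entries shown and the commented `search_images_ddg` / `download_images` calls are illustrative (those helpers come from the fastbook/fastai ecosystem); the actual notebook may differ:

```python
from pathlib import Path

# Map Bengali market names to scientific names; searching by the scientific
# name returned more accurate images. Three entries shown as examples; the
# remaining categories follow the same pattern.
fish_names = {
    "Rui": "Labeo rohita",
    "Catla": "Catla catla",
    "Ilish": "Tenualosa ilisha",
}

def build_queries(names):
    """Return (destination folder, search query) pairs, one per category."""
    return [(bengali, f"{scientific} fish") for bengali, scientific in names.items()]

# In the notebook, each query would be fed to an image-search helper and the
# results downloaded into that category's folder, roughly:
#   for folder, query in build_queries(fish_names):
#       urls = search_images_ddg(query, max_images=200)
#       download_images(Path("data") / folder, urls=urls)
```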

Data Cleaning

There were many redundant images within many categories, and in some cases images were mixed up between categories. So, before training, I not only had to delete the redundant images but also needed to move images to their respective folders. Still, as humans we sometimes fail to do everything perfectly.

After model training, when the results were not satisfactory, I found the classes with the highest losses and repeated the cleaning process. Therefore, a noticeable change can be seen in the image counts for some categories. For example, the class Rui previously had fewer images, whereas after cleaning it contains the most. Some classes, like Shing, Silver Carp, and Taki, saw their image counts decrease. In the end, if you look at the image distribution table, you will find that it turned out to be an imbalanced dataset. Check out the final dataset here.

Images Distribution
Before Cleaning After Cleaning

Dataloader Preparation

I split the whole dataset into 80% for training and 20% for validation, and prepared the dataloader with a batch size of 32. After training, I viewed the top losses and cleaned the data multiple times to get better results, creating multiple versions of the dataloader along the way.
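In fastai this corresponds to something like `ImageDataLoaders.from_folder(path, valid_pct=0.2, seed=42, bs=32)`; the underlying random 80/20 split and batching can be sketched in plain Python (the function names and seed here are illustrative, not the notebook's actual code):

```python
import random

def split_dataset(items, valid_pct=0.2, seed=42):
    """Shuffle reproducibly, then split into (train, valid) lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_valid = int(len(items) * valid_pct)
    return items[n_valid:], items[:n_valid]

def batches(items, bs=32):
    """Yield mini-batches of size bs (the last batch may be smaller)."""
    for i in range(0, len(items), bs):
        yield items[i:i + bs]
```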

Model Experiments

To create the classifier, I chose some well-performing pre-trained computer vision models whose feature extractors are available in fastAI and trained them. I selected these models based on my previous research experience. The chosen ones are:
  • VGG-19
  • DenseNet-121
  • ResNet-50
Training process for each model:
  1. Firstly, I froze the pre-trained layers of each model.
  2. Secondly, I found a suitable learning rate range using fastai's lr_find.
  3. I trained the models for 30 epochs, once with fit_one_cycle using that learning rate range and once with the fine_tune method using automatic learning rate tuning.
  4. Lastly, I unfroze the models and repeated steps 2-3.
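The steps above can be sketched against the fastai Learner interface. The learner object is duck-typed here (`freeze`, `unfreeze`, `lr_find`, `fit_one_cycle`, `fine_tune` follow fastai's method names), and the exact call order is my reading of the description, not the notebook's verbatim code:

```python
def train_schedule(learn, epochs=30):
    """Two-stage transfer-learning schedule: frozen backbone first, then unfrozen.

    `learn` is expected to expose the fastai Learner interface."""
    for stage in ("frozen", "unfrozen"):
        if stage == "frozen":
            learn.freeze()                # 1. train only the head at first
        else:
            learn.unfreeze()              # 4. unfreeze and repeat steps 2-3
        lrs = learn.lr_find()             # 2. pick a suitable learning-rate range
        learn.fit_one_cycle(epochs, lrs)  # 3a. one run with the explicit LR range
        learn.fine_tune(epochs)           # 3b. a separate run with auto LR tuning
```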

Performance Evaluation

Among all the experiments, ResNet-50 performed best with the fine_tune method, whereas the other models gave better results with fit_one_cycle. The following table contains the accuracies from the better-performing runs.
Model         Accuracy (%)
ResNet-50     81.379
VGG-19        77.471
DenseNet-121  80.229

Explainability

To interpret the models' predictions, I applied Grad-CAM, a gradient-based method that highlights which regions of an image the model found important for the predicted class.
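The core of Grad-CAM is compact: global-average-pool the gradients of the class score over each activation map to get per-channel weights, take the weighted sum of the maps, and clip negatives. A minimal NumPy sketch (the framework hooks that capture `activations` and `gradients` from a conv layer are omitted):

```python
import numpy as np

def grad_cam(activations, gradients):
    """Build a Grad-CAM heatmap from a conv layer's forward activations
    (shape K x H x W) and the gradients of the predicted-class score with
    respect to those activations (same shape)."""
    weights = gradients.mean(axis=(1, 2))             # per-channel importance
    cam = np.tensordot(weights, activations, axes=1)  # weighted sum over channels
    cam = np.maximum(cam, 0)                          # ReLU: keep positive evidence
    if cam.max() > 0:
        cam = cam / cam.max()                         # normalize to [0, 1] for overlay
    return cam
```

The normalized map is then upsampled to the input resolution and overlaid on the image as the masks shown below.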
Correctly classified visualizations
Actual Image: Rui | ResNet-50: Rui | DenseNet-121: Rui | VGG-19: Rui

Mis-classified visualizations
Actual Image: Koi | ResNet-50: Telapia | DenseNet-121: Taki | VGG-19: Telapia

Looking at ResNet-50's Grad-CAM mask, we can say it tries to find the right characteristics: in the correctly classified image it locates regions near the tail and head, and even in the misclassification it marks features within the body area. For DenseNet-121, however, the masked areas are scattered further outside the fish in both cases. VGG-19, when classifying correctly, locates the middle of the body but masks a larger area than required; in the misclassification it stays within the image yet still misses the correct label.

Deployment

As ResNet-50 showed better performance than the others, I deployed the recognizer as a Gradio app on Hugging Face. Check out the deployment & required files for the deployment.
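A Gradio app needs only a prediction function mapping an image to class probabilities. A hedged sketch of that wrapper (the learner is duck-typed after fastai's convention of `predict` returning `(label, index, probs)`; the label list and commented app wiring are illustrative, not the deployed code):

```python
def make_predict_fn(learn, labels):
    """Wrap a fastai-style learner into a Gradio-compatible function that
    returns a {label: probability} mapping, as expected by a gr.Label output."""
    def predict(img):
        _, _, probs = learn.predict(img)
        return {lbl: float(p) for lbl, p in zip(labels, probs)}
    return predict

# App wiring, roughly (requires gradio):
#   import gradio as gr
#   demo = gr.Interface(fn=predict, inputs=gr.Image(),
#                       outputs=gr.Label(num_top_classes=3))
#   demo.launch()
```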

Integration

The recognizer model is integrated into my website using GitHub Pages and a Jekyll remote theme.
Check out my website integration.

Recognizer Webpage View

Short Video Demonstration

I prepared a short video demonstration and shared it as a LinkedIn post. Check it out here.

About


License: MIT License


Languages

Jupyter Notebook 100.0%, Python 0.0%