cnn computer-vision fastai grad-cam-visualization huggingface pytorch xai

An Interpretable Bengali Fish Recognizer

Project Development Journal

`Problem Statement`

A Bengali fish image recognizer that can classify among the following 20 categories:

Ayre	Catla	Chital	Ilish
Kachki	Kajoli	Koi	Magur
Mola Dhela	Mrigal	Pabda	Pangash
Poa	Puti	Rui	Shing
Silver Carp	Taki	Telapia	Tengra

`Objective`

Many kinds of fish are sold in Bangladeshi fish markets, and each has its own characteristics. A few look alike and can be told apart only by small details. For example, a Catla's head is larger than a Rui's, but their other features are almost the same. Ayre and Pangash have a similar body structure, but an Ayre has long barbels (whiskers), whereas a Pangash's are much shorter. There are many such pairs. The main goal of this recognizer project is to distinguish between them.

`Data Collection`

A bit of exploration showed that searching for a fish by its scientific name gives better results and more accurate images. So I mapped each fish's Bengali name to its scientific name in a dictionary. Then, looping over the dictionary and using fastai's DuckDuckGo search, I collected images for each category and stored them in their respective folders. You will find the whole procedure in the "data_collection" notebook.
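A minimal sketch of such a collection loop is shown below. It is not the notebook's code: the search helper is passed in as `search_fn` (an assumed interface that takes a query and a limit and returns image URLs), the function name `collect_image_urls` is hypothetical, and the three dictionary entries are only an illustrative subset of the mapping.

```python
from pathlib import Path

# Illustrative subset: Bengali name -> scientific name used as the search query.
# The complete mapping lives in the "data_collection" notebook and may differ.
FISH_NAMES = {
    "Rui": "Labeo rohita",
    "Catla": "Labeo catla",
    "Ilish": "Tenualosa ilisha",
}


def collect_image_urls(fish_names, search_fn, root="data", max_images=100):
    """Create one folder per category and return {category: [urls]}.

    search_fn(query, max_images) -> iterable of URLs is an assumed interface;
    any image-search helper with that shape can be passed in.
    """
    root = Path(root)
    urls_by_class = {}
    for bengali_name, scientific_name in fish_names.items():
        # Folder is named after the Bengali name, which becomes the class label.
        (root / bengali_name).mkdir(parents=True, exist_ok=True)
        # The query uses the scientific name, which gave more accurate results.
        urls_by_class[bengali_name] = list(search_fn(scientific_name, max_images))
    return urls_by_class
```

The returned URLs would then be downloaded into the matching folders, for example with fastai's `download_images`, and unreadable files filtered out afterwards.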

`Data Cleaning`

Many categories contained redundant images, and in some cases images were mixed up between categories. So before training I had to delete the redundant images and move the misplaced ones to their correct folders. Still, as humans we sometimes fail to do everything perfectly.
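Part of this cleaning can be automated. The sketch below is one possible helper, not the procedure used in this project: it groups files by SHA-256 digest to flag byte-identical copies. The function name `find_exact_duplicates` is hypothetical, and resized or re-compressed copies, as well as mislabeled images, would still need a manual pass.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(root):
    """Group files under root by SHA-256 digest.

    Returns a list of groups (lists of paths); each group holds files whose
    bytes are identical, so all but one file per group are redundant.
    """
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

A group that spans two category folders also hints at an image that was filed under the wrong class.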

After model training, when the results were not satisfactory, I found the classes with the highest losses and repeated the cleaning process. As a result, the number of images changed noticeably for some categories. For example, the Rui class previously had fewer images, whereas after cleaning it contains the most. Some classes, such as Shing, Silver Carp, and Taki, lost images. In the end, as the images distribution table shows, the dataset turned out to be imbalanced. Check out the final dataset here.

Images Distribution
Before Cleaning	After Cleaning
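The per-class counts behind a distribution like this can be recomputed from the folder layout. The sketch below assumes one sub-folder per class under a dataset root; the helper names and the set of image suffixes are assumptions of this example.

```python
from pathlib import Path

# Assumed set of image file extensions; adjust to what the dataset contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def count_images_per_class(root):
    """Return {class_name: number_of_image_files} for each sub-folder of root."""
    counts = {}
    for class_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1 for f in class_dir.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


def imbalance_ratio(counts):
    """Largest class size divided by smallest; 1.0 means perfectly balanced."""
    smallest = min(counts.values())
    if smallest == 0:
        return float("inf")  # at least one class has no images at all
    return max(counts.values()) / smallest
```

Running it once before and once after cleaning gives the two columns of the table, and the ratio gives a single number for how imbalanced the final dataset is.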

`Dataloader Preparation`

I split the whole dataset into 80% for training and 20% for validation, and prepared the dataloader with a batch size of 32. After training, I viewed the top losses and cleaned the data several times to get better results, creating multiple versions of the dataloader along the way.
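The sketch below illustrates that setup under a few assumptions. `random_split` is a plain-Python stand-in that shows what an 80/20 random split does to the indices, and `build_dataloaders` shows how the same split and batch size could be requested from fastai's `ImageDataLoaders.from_folder`. The seed, the 224-pixel resize, and both function names are assumptions of this example, not values taken from the project.

```python
import random


def random_split(n_items, valid_pct=0.2, seed=42):
    """Return (train_idx, valid_idx) from a seeded shuffle of range(n_items)."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    cut = int(valid_pct * n_items)  # number of validation items
    return idx[cut:], idx[:cut]


def build_dataloaders(data_dir, valid_pct=0.2, bs=32, seed=42, img_size=224):
    """Build fastai dataloaders from a folder with one sub-folder per class."""
    # Imported lazily so random_split above runs without fastai installed.
    from fastai.vision.all import ImageDataLoaders, Resize

    return ImageDataLoaders.from_folder(
        data_dir,
        valid_pct=valid_pct,          # 20% held out for validation
        seed=seed,                    # assumed; fixes the split across runs
        item_tfms=Resize(img_size),   # assumed input size
        bs=bs,                        # batch size of 32, as in the text
    )
```

Fixing the seed keeps the validation set stable across runs on the same files; after each cleaning pass the file list changes, so the dataloader has to be rebuilt.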

`Models Experimentations`

To create the classifier, I chose some well-performing pre-trained computer vision models whose feature extractors are available in fastai and trained them. I selected these models based on my previous research experience. The chosen ones are:

VGG-19

DenseNet-121

ResNet-50

Training process for each model:

1. I froze the pre-trained layers of the model.
2. I found a suitable learning rate range using fastai's lr_find.
3. I trained the model for 30 epochs with both fit_one_cycle, using the learning rate range, and fine_tune, using automatic learning rate tuning.
4. I unfroze the model and repeated steps 2-3.
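The steps above can be sketched for the fit_one_cycle path as follows. This is an outline under assumptions, not the project's training code: the function names are hypothetical, the learning rate range (one decade below the suggestion up to the suggestion itself) is an assumed choice, and the `.valley` field is what recent fastai versions return from `lr_find` by default.

```python
def train_two_stage(learn, epochs=30):
    """Run a frozen stage and then an unfrozen stage on a fastai Learner.

    Each stage repeats steps 2-3: find a learning rate, then fit_one_cycle.
    Returns [(stage_name, lr_range), ...] for later inspection.
    """
    history = []
    stages = (("frozen", learn.freeze), ("unfrozen", learn.unfreeze))
    for stage_name, set_trainable in stages:
        set_trainable()                       # step 1 (freeze) or step 4 (unfreeze)
        suggestion = learn.lr_find()          # step 2
        lr_high = suggestion.valley           # field name in recent fastai versions
        lr_range = slice(lr_high / 10, lr_high)  # assumed range, not from the journal
        learn.fit_one_cycle(epochs, lr_max=lr_range)  # step 3
        history.append((stage_name, lr_range))
    return history


def make_learner(dls, arch):
    """Create a learner with a pre-trained backbone; its body starts frozen."""
    # Imported lazily so train_two_stage can be read and tested without fastai.
    from fastai.vision.all import accuracy, vision_learner

    return vision_learner(dls, arch, metrics=accuracy)
```

The fine_tune path needs less code, since `learn.fine_tune(epochs)` already trains a frozen phase and then unfreezes the model on its own.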

`Performance Evaluation`

Across all the experiments, ResNet-50 performed better with the fine_tune method, whereas the other models performed better with fit_one_cycle. The following table lists the accuracy from the best-performing experiment for each model.

Model	Accuracy (%)
ResNet-50	81.379
VGG-19	77.471
DenseNet-121	80.229

`Explainability`

To interpret the models' predictions, I applied Grad-CAM, a gradient-based method. It highlights which regions of an image a model found important for the predicted class.
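As a rough illustration of the computation behind Grad-CAM: each feature map of a chosen convolutional layer is weighted by the spatial average of the class score's gradient with respect to that map, the weighted maps are summed, and a ReLU keeps only the positive evidence. The sketch below does this in plain Python on nested lists. The function name `grad_cam_map` and the final max-normalisation for display are assumptions of this sketch, and in a real run the activations and gradients would be captured from the model (for example with PyTorch hooks) rather than typed in by hand.

```python
def grad_cam_map(activations, gradients):
    """Combine K feature maps into one Grad-CAM heatmap.

    activations, gradients: lists of K maps, each an H x W nested list.
    gradients[k] holds d(class score) / d(activations[k]).
    Returns an H x W nested list scaled to the range [0, 1].
    """
    if not activations:
        raise ValueError("need at least one feature map")
    height, width = len(activations[0]), len(activations[0][0])
    heat = [[0.0] * width for _ in range(height)]
    for fmap, grad in zip(activations, gradients):
        # Channel weight: global average pooling of the gradient map.
        weight = sum(sum(row) for row in grad) / (height * width)
        for i in range(height):
            for j in range(width):
                heat[i][j] += weight * fmap[i][j]
    # ReLU: keep only regions that increase the class score.
    heat = [[max(v, 0.0) for v in row] for row in heat]
    # Scale to [0, 1] for display (a presentation step, assumed here).
    peak = max(max(row) for row in heat)
    if peak > 0:
        heat = [[v / peak for v in row] for row in heat]
    return heat


# Two 2x2 feature maps: the first supports the class, the second opposes it.
acts = [[[1, 2], [3, 4]], [[1, 1], [1, 1]]]
grads = [[[1, 1], [1, 1]], [[-2, -2], [-2, -2]]]
print(grad_cam_map(acts, grads))  # [[0.0, 0.0], [0.5, 1.0]]
```

The resulting map is then upsampled to the input size and overlaid on the image, which produces masks like the ones compared below.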

Correctly classified visualizations
Actual Image	ResNet-50	DenseNet-121	VGG-19
Rui	Rui	Rui	Rui

Misclassified visualizations
Actual Image	ResNet-50	DenseNet-121	VGG-19
Koi	Telapia	Taki	Telapia

Looking at ResNet-50's XAI mask, we can say that it tries to locate the characteristic features properly. For example, for the correctly classified image it highlights regions near the tail and head, and even for the misclassified image it marks more features within the body area. For DenseNet-121, in contrast, the masked areas are scattered more toward the outside in both cases. VGG-19, when classifying correctly, locates the middle of the body but masks more area than the required region. For the misclassified image, it locates a region within the image but still misses the correct label.

`Deployment`

As ResNet-50 performed better than the others, I deployed the recognizer as a Gradio app on Hugging Face. Check out the deployment and the files required for it.
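A Gradio app for a fastai classifier can be quite small. The sketch below is an assumption-laden outline rather than the deployed app: the model file name `export.pkl`, the function names, and the choice to show the top three classes are all hypothetical.

```python
def to_label_scores(vocab, probs):
    """Map class names to float probabilities.

    This {label: probability} dict is the shape Gradio's Label output accepts.
    """
    return {str(label): float(p) for label, p in zip(vocab, probs)}


def build_app(model_path="export.pkl"):  # hypothetical file name
    """Wrap an exported fastai learner in a Gradio interface."""
    # Imported lazily so to_label_scores works without gradio or fastai.
    import gradio as gr
    from fastai.vision.all import PILImage, load_learner

    learn = load_learner(model_path)

    def classify(img):
        # learn.predict returns (decoded label, class index, probabilities).
        _, _, probs = learn.predict(PILImage.create(img))
        return to_label_scores(learn.dls.vocab, probs)

    return gr.Interface(
        fn=classify,
        inputs=gr.Image(type="numpy"),
        outputs=gr.Label(num_top_classes=3),  # assumed: show the top 3 classes
    )
```

Calling `build_app().launch()` starts the interface.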

`Integration`

The recognizer model is integrated using GitHub Pages and the Jekyll remote theme.
Check out the integration on my website.

Recognizer Webpage View

`Short Video Demonstration`

I prepared a short video demonstration and shared it as a LinkedIn post. Check it out here.

About

Fish image data is collected, classifiers are trained on it, and Grad-CAM is applied to interpret the predictions.

https://neloy-barman.github.io/Interpretable-Bengali-Fish-Recognizer/

MIT License