A Bengali fish image recognizer that can classify images of the following fish:
Ayre
Catla
Chital
Ilish
Kachki
Kajoli
Koi
Magur
Mola Dhela
Mrigal
Pabda
Pangash
Poa
Puti
Rui
Shing
Silver Carp
Taki
Telapia
Tengra
Objective
Many kinds of fish are sold in Bangladeshi fish markets, and each has its own characteristics. A few of them look alike and can only be told apart by small details. For example, a Catla's head is larger than a Rui's, but their other features are almost the same. Ayre and Pangash share the same body structure, but an Ayre has hackles, whereas a Pangash does not fall into the catfish category. There are many similar pairs. The main goal of this recognizer project is to distinguish between them.
Data Collection
A bit of exploration showed me that searching for a fish by its scientific name gives better results and more accurate images. So, I mapped each fish's Bengali name to its scientific name in a dictionary. Then, looping over the dictionary and using fastai's DuckDuckGo search, I collected images for each category and kept them in their respective folders. You will find the whole procedure in the "data_collection" notebook.
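The loop described above might look roughly like the sketch below. It is not the notebook's code: the three dictionary entries are only examples, `collect_image_urls` is a hypothetical helper, and the search function is passed in as a parameter (assumed to take a query string and a maximum count and return image URLs) so the sketch does not depend on any particular search API.

```python
from pathlib import Path

# Example entries only; the real dictionary covers every category.
FISH_NAMES = {
    "Rui": "Labeo rohita",
    "Catla": "Labeo catla",
    "Ilish": "Tenualosa ilisha",
}


def collect_image_urls(name_map, search_fn, dest_root, max_images=100):
    """Create one folder per fish and look up image URLs by scientific name.

    search_fn is an assumed callable: search_fn(query, max_images) -> iterable of URLs.
    Returns a dict mapping each Bengali name to the list of URLs found.
    """
    found = {}
    for bengali_name, scientific_name in name_map.items():
        # Each category gets its own folder, named after the Bengali name.
        folder = Path(dest_root) / bengali_name
        folder.mkdir(parents=True, exist_ok=True)
        # Search with the scientific name, which gave cleaner results.
        found[bengali_name] = list(search_fn(scientific_name, max_images))
    return found
```

The returned URLs can then be downloaded into the matching folder, for example with fastai's `download_images(dest, urls=...)`.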
Data Cleaning
Many categories contained redundant images, and in some cases images were mixed up between categories. So, before training, I not only had to delete the redundant images but also had to move misplaced images to their correct folders. Still, as humans, we sometimes fail to do everything perfectly.
After model training, when the results were not satisfactory, I found the classes with the highest losses and repeated the cleaning process. As a result, the number of images changed noticeably for some categories. For example, the Rui class previously had fewer images, whereas after cleaning it contains the most. Some classes, such as Shing, Silver Carp, and Taki, ended up with fewer images. In the end, if you look at the images distribution table, you will see that the dataset turned out to be imbalanced. Check out the final dataset here.
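A per-class image count like the one in the distribution table can be produced with a few lines of standard-library Python. This is a minimal sketch, assuming one sub-folder per class; `count_images_per_class` and the set of file suffixes are illustrative choices, not taken from the project.

```python
from collections import Counter
from pathlib import Path

# Assumed set of image file extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Count image files in each class sub-folder of root."""
    counts = Counter()
    for class_dir in Path(root).iterdir():
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts
```

Running it before and after cleaning, and comparing `counts.most_common()`, makes the imbalance easy to spot.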
Images Distribution
Before Cleaning
After Cleaning
Dataloader Preparation
I split the whole dataset into 80% for training and 20% for validation. I prepared the dataloader with a batch size of 32. After training, I viewed the top losses and cleaned the data several times to get better results, which produced multiple versions of the dataloader.
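The 80/20 split can be illustrated with a small, library-free sketch. The function name and the fixed seed are assumptions made for reproducibility. In fastai itself, the same effect is usually obtained by passing `valid_pct=0.2` and `bs=32` to `ImageDataLoaders.from_folder`.

```python
import random


def split_train_valid(items, valid_pct=0.2, seed=42):
    """Shuffle items reproducibly and return (train, valid) lists."""
    shuffled = list(items)
    # A dedicated Random instance keeps the split independent of global state.
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]
```

Keeping the seed fixed means the same images stay in the validation set across runs, so accuracies from different experiments remain comparable as long as the dataset itself is unchanged.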
Model Experiments
To create the classifier, I chose some well-performing pre-trained computer vision models whose feature extractors are available in fastai, and trained them. I selected these models based on my previous research experience. The chosen ones are:
VGG-19
DenseNet-121
ResNet-50
Training process for each model:
Firstly, I froze the pre-trained layers of each model.
Secondly, I found a suitable learning rate range using fastai's lr_find.
Thirdly, I trained each model for 30 epochs in two ways: with fit_one_cycle using the learning rate range found above, and with fine_tune using its automatic learning rate tuning.
Lastly, I unfroze the models and repeated steps 2 and 3.
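The steps above can be sketched with fastai as follows. This is a hedged outline of the fit_one_cycle variant only, not the project's notebook code: `train_two_stage` is a hypothetical helper, the `valley` suggestion from `lr_find` is one possible way to pick a rate, and the `slice(lr / 10, lr)` range is an assumed choice for the unfrozen stage. fastai is imported inside the function, so it is only needed once training is actually run.

```python
def train_two_stage(dls, arch, epochs=30):
    """Train with frozen pre-trained layers first, then unfreeze and train again.

    dls  : a fastai DataLoaders object (assumed to be prepared already)
    arch : a fastai/torchvision architecture such as resnet50
    """
    from fastai.vision.all import vision_learner, accuracy

    learn = vision_learner(dls, arch, metrics=accuracy)

    # Step 1: keep the pre-trained layers frozen (vision_learner already
    # does this by default; the explicit call documents the intent).
    learn.freeze()

    # Step 2: look for a suitable learning rate.
    lr = learn.lr_find().valley

    # Step 3: train the new head with the one-cycle policy.
    learn.fit_one_cycle(epochs, lr_max=lr)

    # Step 4: unfreeze everything and repeat steps 2 and 3,
    # now with a range of rates across the layer groups.
    learn.unfreeze()
    lr = learn.lr_find().valley
    learn.fit_one_cycle(epochs, lr_max=slice(lr / 10, lr))
    return learn
```

The fine_tune variant replaces the manual freeze, fit, unfreeze, fit sequence with a single `learn.fine_tune(epochs)` call, which performs both stages internally.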
Performance Evaluation
Across all the experiments, ResNet-50 performed better with the fine_tune method, whereas the other models gave better results with fit_one_cycle. The following table lists the accuracy of the best-performing experiment for each model.
| Model | Accuracy (%) |
| --- | --- |
| ResNet-50 | 81.379 |
| VGG-19 | 77.471 |
| DenseNet-121 | 80.229 |
Explainability
To interpret the models' predictions, I applied Grad-CAM, a gradient-based method. It shows which regions of an image a model found important for the predicted class.
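The core Grad-CAM computation is small enough to show in plain Python. The sketch below assumes the activations of the last convolutional layer and the gradients of the predicted class score with respect to them have already been captured (in practice via hooks on the model). It uses nested lists instead of tensors, and `grad_cam` is an illustrative name rather than the function used in this project.

```python
def grad_cam(activations, gradients):
    """Compute a normalized Grad-CAM heatmap.

    activations, gradients: lists of C feature maps, each an H x W nested list.
    Returns an H x W nested list with values in [0, 1].
    """
    # Channel weight = spatial mean of that channel's gradient.
    weights = []
    for grad_map in gradients:
        total = sum(sum(row) for row in grad_map)
        weights.append(total / (len(grad_map) * len(grad_map[0])))

    # Weighted sum of the activation maps.
    height, width = len(activations[0]), len(activations[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for weight, act_map in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * act_map[i][j]

    # ReLU keeps only regions that push the class score up.
    cam = [[max(value, 0.0) for value in row] for row in cam]

    # Scale to [0, 1] so the map can be overlaid on the image.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam
```

The resulting map is then upsampled to the input size and drawn over the image, which gives masks like the ones shown in the visualizations.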
Correctly classified visualizations
| Actual Image | ResNet-50 | DenseNet-121 | VGG-19 |
| --- | --- | --- | --- |
| Rui | Rui | Rui | Rui |
Misclassified visualizations
| Actual Image | ResNet-50 | DenseNet-121 | VGG-19 |
| --- | --- | --- | --- |
| Koi | Telapia | Taki | Telapia |
Looking at ResNet-50's Grad-CAM mask, we can say that it tries to find the characteristic features properly. For example, for the correctly classified image it locates regions near the tail and head, and even for the misclassified image it marks most features within the body area. For DenseNet-121, by contrast, the masked areas are scattered more towards the outside in both cases. VGG-19 locates the middle of the body when classifying correctly, but it masks more area than the required region. For the misclassified image, it located regions within the image but still missed the correct label.
Deployment
Since ResNet-50 performed better than the others, I deployed the recognizer as a Gradio app on Hugging Face. Check out the deployment & required files for the deployment.
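A Gradio app around an exported fastai learner could look roughly like this. It is a sketch under assumptions, not the deployed app: `export.pkl` is a hypothetical file name, `make_classifier` and `launch_app` are illustrative helpers, and fastai and Gradio are imported only when the app is launched.

```python
def make_classifier(learner):
    """Wrap a fastai-style learner into a function that returns class probabilities."""
    labels = list(learner.dls.vocab)

    def classify(image):
        # predict returns (decoded label, label index, per-class probabilities).
        _, _, probs = learner.predict(image)
        return {label: float(p) for label, p in zip(labels, probs)}

    return classify


def launch_app(model_path="export.pkl"):
    """Load the exported learner and serve it (model_path is an assumed name)."""
    from fastai.vision.all import load_learner
    import gradio as gr

    learner = load_learner(model_path)
    gr.Interface(
        fn=make_classifier(learner),
        inputs=gr.Image(),
        outputs=gr.Label(num_top_classes=3),
    ).launch()
```

Returning a dictionary of label to probability lets Gradio's `Label` output show the top predictions with their confidence.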
Integration
The recognizer model is integrated into a website using GitHub Pages and the Jekyll remote theme.
Check out my website integration of the recognizer.
Recognizer Webpage View
Short Video Demonstration
I prepared a short video demonstration and shared it as a LinkedIn post. Check it out here.
About
Collects fish image data, trains classifiers, and applies Grad-CAM to interpret the predictions.