knwng / DogvsCat

CNN-based Dog&Cat classification model as a course project for Introduction to Artificial Intelligence

Dog&Cat image classifier for course: Introduction to Artificial Intelligence

Requirements


  • Python 2.7
  • TensorFlow 1.4.1

Install the requirements with:

pip install -r requirements.txt

It is better to manage the Python virtual environment with virtualenv, for example by running virtualenv venv.

Dataset


We use the dataset from here. You may also want to use external data such as The Oxford-IIIT Pet Dataset.

If you want to use your own dataset, make sure you have a data list that contains each image path and its label, one image per line, as follows:

dataset/train/cat.1.jpg 0
dataset/train/dog.2.jpg 1
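As an illustration of the list format above, here is a minimal sketch that builds and parses such lines. It assumes files are named like cat.1.jpg and that cat maps to 0 and dog to 1, as in the two example lines. The function names and the LABELS dictionary are hypothetical and are not taken from create_dataset.py.

```python
import os

# Hypothetical label mapping, inferred from the example lines above.
LABELS = {"cat": 0, "dog": 1}


def make_list_lines(image_paths, labels=LABELS):
    """Return 'path label' lines for files named like 'cat.1.jpg'."""
    lines = []
    for path in sorted(image_paths):
        # The class name is the part of the file name before the first dot.
        class_name = os.path.basename(path).split(".")[0]
        if class_name in labels:  # skip files that match no known class
            lines.append("{} {}".format(path, labels[class_name]))
    return lines


def parse_list_line(line):
    """Split one 'path label' line into (path, integer label)."""
    path, label = line.strip().rsplit(" ", 1)
    return path, int(label)
```

Splitting on the last space in parse_list_line keeps paths that contain spaces intact.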

The script ./dataset/create_dataset.py can help you create such a list and split the data into train and val sets, either by random sampling or by K-fold:

./create_dataset.py \
--data_split_type k-fold \
--fold_num 10 \
--labelmap ./label_map.txt \
--data_dir ./dataset/train
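For readers unfamiliar with K-fold splitting, the idea can be sketched as follows. This is a generic illustration and may differ from what create_dataset.py actually does. The function name k_fold_splits and the seed parameter are assumptions made for this example.

```python
import random


def k_fold_splits(items, fold_num, seed=0):
    """Yield (train, val) lists; each item appears in val exactly once."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed for repeatable splits
    # Fold i takes every fold_num-th item, starting at offset i.
    folds = [shuffled[i::fold_num] for i in range(fold_num)]
    for i in range(fold_num):
        val = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, val
```

With fold_num set to 10, each split holds out roughly one tenth of the list for validation and trains on the rest.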

Pretrained Model


We use the ImageNet-pretrained Inception-ResNet-v2 from the official TensorFlow repository. You need to download the checkpoint file as well as the code, and put the checkpoint file under ./pretrained_model like this:

./pretrained_model/inception_resnet_v2.ckpt

If you want to use another pretrained backbone model, put the network definition code under ./nets and prepare the corresponding checkpoint.

Train


After the dataset and pretrained model are prepared, you can train the model by running:

./train.py \
--train_dataset dataset/train.txt \
--val_dataset dataset/val.txt \
--train_dir experiments/expr1 \
--learning_rate 1e-4 \
--epoch 100 \
--batch_size 32 \
--image_size 224 \
--pretrained_model ./pretrained_model/inception_resnet_v2.ckpt

To resume training, add --resume to the command above.

Evaluation


Several tools are provided to evaluate the model.

Evaluation using accuracy, precision and recall

You can use eval.py to evaluate your model on the val data and calculate the accuracy, precision and recall. The command is:

./eval.py \
--val_dataset dataset/val.txt \
--train_dir experiments/expr1 \
--checkpoint model-10000 \
--batch_size 128
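For reference, the three metrics can be computed from true and predicted labels as in the sketch below. It treats label 1 (dog) as the positive class, which is an assumption; eval.py may define the positive class or handle empty denominators differently. The function name binary_metrics is hypothetical.

```python
def binary_metrics(labels, predictions, positive=1):
    """Return (accuracy, precision, recall) for binary labels."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == positive and p == positive)
    fp = sum(1 for y, p in pairs if y != positive and p == positive)
    fn = sum(1 for y, p in pairs if y == positive and p != positive)
    correct = sum(1 for y, p in pairs if y == p)

    # Return 0.0 when a denominator is zero instead of raising an error.
    accuracy = correct / float(len(pairs)) if pairs else 0.0
    precision = tp / float(tp + fp) if tp + fp else 0.0
    recall = tp / float(tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall
```

Precision is the fraction of images predicted as positive that really are positive, and recall is the fraction of positive images that the model found.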

Prediction


To generate a submission for the Dogs vs Cats Competition, you can use test.py:

./test.py \
--test_dataset dataset/test.txt \
--train_dir experiments/expr1 \
--checkpoint model-10000 \
--batch_size 128

The items in dataset/test.txt should be arranged as follows:

dataset/1.jpg
dataset/2.jpg

If you have several models, you can use a simple ensembling technique to improve performance:

# Assume your submissions are placed as: 
# experiments/expr1-1/submission.csv
# experiments/expr1-2/submission.csv
# experiments/expr1-3/submission.csv
# ...

./ensemble.py \
--fold_num 10 \
--ensembled_root experiments/expr1 \
--submission_name submission.csv
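One common form of simple ensembling is to average the predicted values of several submissions. The sketch below shows that idea only; it is not necessarily what ensemble.py does. The column names id and label are assumptions about the submission format, and average_submissions is a hypothetical name.

```python
import csv
from collections import defaultdict


def average_submissions(paths, id_col="id", value_col="label"):
    """Average the value column across CSV files, keyed by the id column."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for path in paths:
        with open(path) as f:
            for row in csv.DictReader(f):
                sums[row[id_col]] += float(row[value_col])
                counts[row[id_col]] += 1
    # Divide by the number of files that actually contained each id.
    return {key: sums[key] / counts[key] for key in sums}
```

For the layout shown above, paths would be a list such as experiments/expr1-1/submission.csv, experiments/expr1-2/submission.csv and so on.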

Tools


There are also some tools for K-fold cross validation and ensembling, which will be completed soon.
