Demonstrate image classification with the PaddlePaddle framework
PaddlePaddle's dynamic graph mode is easy to use and quite similar to PyTorch. Meanwhile, AI Studio supplies free GPU hours (Tesla V100, Tesla V100 x4, and Tesla V100 x8) for developers who are interested in learning AI for fun or who need powerful GPUs for AI competitions. I am learning it in my spare time, partly to prepare for some other tasks. This is an example workflow for an image classification task with PaddlePaddle, together with the code I used to predict the class of a cat from an image.
- paddle 1.8
- numpy
- matplotlib
- pandas
- cats dataset
- install packages
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
or
pip install -r requirements.txt
- download cats dataset
download and unzip the file into the code folder. Link: https://pan.baidu.com/s/1cE3AbX1UzDsbeD2pTbFSPw Extraction code: i29d
python train_predict.py
- organize the dataset in the following format
├── dataset
    └── YOUR_DATASET_NAME
        ├── train
            ├── xxx.jpg (name, format doesn't matter)
            ├── yyy.png
            └── ...
        ├── test
            ├── zzz.jpg
            ├── www.png
            └── ...
        └── train_list.txt
- format train_list.txt in the following format
- configure dataset
update variable 'DATASET' in dataset.py
DATASET = YOUR_DATASET_NAME
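Before training, the folder layout described above can be sanity-checked with a few lines of Python. This is only a minimal sketch: the helper name `check_dataset_layout` and the set of accepted image extensions are my own assumptions, not part of this repo, and it does not inspect the contents of train_list.txt.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your dataset contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def check_dataset_layout(root, dataset_name):
    """Count images in train/ and test/, raising if an expected path is missing.

    Hypothetical helper, not part of the repo.
    """
    base = Path(root) / dataset_name

    # train_list.txt must sit next to the train/ and test/ folders.
    list_file = base / "train_list.txt"
    if not list_file.is_file():
        raise FileNotFoundError(f"missing {list_file}")

    counts = {}
    for split in ("train", "test"):
        split_dir = base / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing directory {split_dir}")
        # Count only regular files with a known image extension.
        counts[split] = sum(
            1
            for p in split_dir.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

For example, `check_dataset_layout("dataset", "YOUR_DATASET_NAME")` returns a dict with the image counts for `train` and `test`, or raises `FileNotFoundError` naming the first missing path.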
- Compared with TensorFlow, it is quite convenient to integrate third-party Python packages with the Paddle framework.
- Sometimes cv2.imread() may not work; try PIL.Image instead.
- paddlex.cls.transforms can be used for image augmentation, but albumentations is better.
- pandas does a better job than numpy when saving data to a file.
- The model, optimizer, and batch_size should be tweaked when the loss fails to decrease from the beginning.
- Excessive augmentation can make the loss explode at the beginning; a bigger batch_size can help with this problem.
- Normalize images to [0, 1] when ReLU activation is used.
- The purpose of augmentation is to create more images with realistic variations while keeping the invariant features.
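Two of the notes above (falling back from cv2.imread() to PIL.Image, and scaling pixels to [0, 1]) can be sketched roughly as follows. The helper names `load_image_rgb` and `normalize_to_unit_range` are hypothetical and not taken from this repo, and the sketch assumes 8-bit images. cv2.imread() returns None instead of raising when it cannot read a file, which is why the result is checked before use.

```python
import numpy as np


def load_image_rgb(path):
    """Load an image as an RGB uint8 array, trying OpenCV first, then Pillow.

    Hypothetical helper, not part of the repo.
    """
    try:
        import cv2  # optional dependency

        img = cv2.imread(path)  # returns None on failure, does not raise
        if img is not None:
            # OpenCV loads images in BGR order; convert to RGB.
            return cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    except ImportError:
        pass

    # Fallback: Pillow
    from PIL import Image

    with Image.open(path) as im:
        return np.array(im.convert("RGB"))


def normalize_to_unit_range(image):
    """Scale uint8 pixel values from [0, 255] to float32 values in [0, 1]."""
    return image.astype(np.float32) / 255.0
```

A typical call would be `x = normalize_to_unit_range(load_image_rgb("dataset/YOUR_DATASET_NAME/train/xxx.jpg"))`, after which `x` can be transposed and batched as the model expects.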