Out of 737 teams, our solution for the Carvana Image Masking Challenge on Kaggle ranked 9th (top 1.2%) on the Public Leaderboard and 31st (top 4.2%) on the Private Leaderboard. It was created by Chia-Hao Hsieh and Shao-Wen Lai.
In this competition, you’re challenged to develop an algorithm that automatically removes the photo studio background.
There are 100064 test images.
This competition is evaluated on the mean Dice coefficient. The Dice coefficient can be used to compare the pixel-wise agreement between a predicted segmentation and its corresponding ground truth. The formula is given by:
2 * |X ∩ Y| / (|X|+|Y|)
where X is the predicted set of pixels and Y is the ground truth. The Dice coefficient is defined to be 1 when both X and Y are empty. The leaderboard score is the mean of the Dice coefficients for each image in the test set.
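As a sanity check, the metric can be sketched in a few lines of NumPy (a hypothetical helper, not the competition's official evaluation code):

```python
import numpy as np

def dice_coefficient(pred, target):
    """2 * |X ∩ Y| / (|X| + |Y|) for binary masks; defined as 1 when both are empty."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0  # both masks empty: defined as 1
    return 2.0 * np.logical_and(pred, target).sum() / total
```

The leaderboard score is then simply the mean of this value over all test images.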
Our solution is an ensemble of 5 modified U-Net models using 1280x1280 image patches as input, along with test-time augmentation. We used a combined loss function of soft Dice loss and binary cross-entropy (BCE) loss. During training, we used data augmentations including flipping, shifting, scaling, HSV color augmentation, and fancy PCA.
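A minimal sketch of such a combined loss in PyTorch, assuming a simple sum of the BCE and soft Dice terms (the exact weighting we used may differ):

```python
import torch
import torch.nn as nn

class BCESoftDiceLoss(nn.Module):
    """Sketch of BCE + (1 - soft Dice) computed on sigmoid probabilities."""
    def __init__(self, smooth=1.0):
        super().__init__()
        self.bce = nn.BCEWithLogitsLoss()
        self.smooth = smooth  # avoids division by zero on empty masks

    def forward(self, logits, targets):
        probs = torch.sigmoid(logits)
        intersection = (probs * targets).sum()
        dice = (2.0 * intersection + self.smooth) / (probs.sum() + targets.sum() + self.smooth)
        return self.bce(logits, targets) + (1.0 - dice)
```

The soft Dice term optimizes the evaluation metric directly, while the BCE term keeps gradients well-behaved early in training.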
Training a single model takes about 60-80 hours on a machine with a single P5000 GPU. Testing takes about 6-8 hours.
Our best ensembled model scored 0.997191 mean Dice coefficient on Private Leaderboard and 0.996899 on Public Leaderboard.
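One common form of the test-time augmentation mentioned above is horizontal-flip averaging; a hypothetical sketch (the actual set of augmentations in this repo may differ):

```python
import torch

def predict_with_flip_tta(model, batch):
    """Average the model's prediction over the original and horizontally flipped input."""
    with torch.no_grad():
        p = torch.sigmoid(model(batch))
        # flip the width axis, predict, then flip the prediction back
        p_flipped = torch.sigmoid(model(torch.flip(batch, dims=[-1])))
        return 0.5 * (p + torch.flip(p_flipped, dims=[-1]))
```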
Here are some results from our best single model:
- python 3.6
- numpy
- pytorch
- pandas
- pyyaml
- crayon
- scikit-image
- pydensecrf
To install `pydensecrf`, run `pip install cython` and then `pip install pydensecrf`.
- Extract the data downloaded from Kaggle to `./data`:

  ```
  data
  ├── metadata.csv
  ├── sample_submission.csv
  ├── test_hq
  │   ├── 0004d4463b50_01.jpg
  │   ├── 0004d4463b50_02.jpg
  │   ...
  │   └── 846faa0eb79f_04.jpg
  ├── train_hq
  │   ├── 00087a6bd4dc_01.jpg
  │   ├── 00087a6bd4dc_02.jpg
  │   ...
  │   └── fff9b3a5373f_16.jpg
  ├── train_masks
  │   ├── 00087a6bd4dc_01_mask.gif
  │   ├── 00087a6bd4dc_02_mask.gif
  │   ...
  │   └── fff9b3a5373f_16_mask.gif
  └── train_masks.csv
  ```

  Images are all of size 1918 x 1280.
- Before training, start `crayon` by running `docker run -d -p 8888:8888 -p 8889:8889 --name crayon alband/crayon`.
- Run `python train.py`.
- Run `python test.py <experiment_name>`. For example, run `python test.py PeterUnet3_dropout`.

  ⚠️ Before you run `test.py` for the first time, make sure you have at least 250GB of free disk space to save prediction results.

- [Optional] Run `python run_ensemble.py --pred_dirs <exp_output_dir_1> <exp_output_dir_2> ... <exp_output_dir_n>`. For example, run `python run_ensemble.py --pred_dirs 0921-05:59:53 0921-06:00:00 0921-06:00:05` to ensemble three predictions.

- Run `python run_rle.py <exp_output_dir>` to generate a submission at `./output/<exp_output_dir>/submission.csv`.

- [Optional] Run `python run_rle_ensemble.py --pred_dirs <exp_output_dir_1> <exp_output_dir_2> ... <exp_output_dir_n>` to ensemble run-length-encoded submission.csv files. For example, run `python run_rle_ensemble.py --pred_dirs 0923-05:59:53 0921-06:00:00` to ensemble two predictions.

- To find numbers that are divisible by 2^n, run `python scripts/divisble.py <start_number> <end_number>`. For instance, `python scripts/divisble.py 900 1300`.
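For reference, the run-length encoding produced by `run_rle.py` can be sketched as below, assuming Kaggle's Carvana submission convention of 1-indexed pixel positions in column-major order (the repo's actual implementation may differ):

```python
import numpy as np

def rle_encode(mask):
    """Encode a binary mask as 'start length start length ...' pairs."""
    pixels = np.asarray(mask).flatten(order='F')  # column-major, per Kaggle's convention
    padded = np.concatenate([[0], pixels, [0]])
    # positions where the value changes mark run starts and ends (1-indexed)
    runs = np.where(padded[1:] != padded[:-1])[0] + 1
    runs[1::2] -= runs[::2]  # convert end positions to run lengths
    return ' '.join(str(x) for x in runs)
```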
- load data
- train/val split
- prepare small dataset
- implement run length encoding
- add experiment config loading
- try a simple UNet
- visualize ground truth and prediction
- add DICE validation
- add data augmentation: random horizontal flipping
- add data augmentation: padding
- try the original UNet
- train/predict-by-tile
- add DICE score during training
- add validation loss and DICE score during training
- add optimizer and loss to experiment setting in .yml
- try modified UNet with UpSampling layers
- improve tile.py: make it able to cut image into halves
- complete util/tile.py: stitch_predictions()
- complete util/submit.py
- add data augmentation: random shift
- add boundary weighted loss
- experimenting with UNet parameters and architectures/modules
- add CRF (it didn't help)