I have implemented semantic segmentation on the Kitti Road dataset.
I used the FCN architecture. I removed the dropout layer from the original FCN and added batchnorm to the encoder.
Make sure you have the following installed:
- python 3.5
- tensorflow 1.2.1
- Etc.
I recommend creating a dedicated Anaconda environment for this project so that its packages stay isolated from your other work. You can create one with the following steps, which have been verified on Windows 10 and Ubuntu 16.04.
$ conda create -n seg python=3.5 anaconda=4.4.0
$ source activate seg # in windows "activate seg"
(seg) $ pip install tensorflow==1.2.1
Download the Kitti Road dataset from here. Extract the dataset in the data folder. This will create the folder data_road with all the training and test images.
- Download the vgg16 checkpoint file at https://github.com/tensorflow/models/tree/master/research/slim#pre-trained-models.
- Extract the downloaded file.
- Move the vgg_16.ckpt file to the project/root/data/vgg directory.
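Before training, it can help to confirm that both downloads ended up where the steps above put them. The snippet below is a minimal sketch, not part of this repository: the helper name missing_inputs is hypothetical, and it assumes the data folder sits at data/ relative to where you run it, with the dataset in data/data_road and the checkpoint in data/vgg/vgg_16.ckpt.

```python
from pathlib import Path


def missing_inputs(data_dir="data"):
    """Return the expected input paths that do not exist under data_dir.

    Hypothetical helper: the layout below is assumed from the setup steps,
    not read from train.py.
    """
    root = Path(data_dir)
    expected = [
        root / "data_road",            # created by extracting the Kitti Road archive
        root / "vgg" / "vgg_16.ckpt",  # pretrained VGG16 checkpoint
    ]
    return [str(path) for path in expected if not path.exists()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing inputs:", ", ".join(missing))
    else:
        print("All inputs found.")
```

If your data folder lives elsewhere, pass its path as the data_dir argument.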
Train the fully convolutional network (FCN) model for segmentation using train.py. The training parameters and the path where the model is saved can be set through command-line arguments (argparse).
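For readers unfamiliar with argparse, the sketch below shows how training parameters and a save path are typically exposed on the command line. The flag names and default values here (--epochs, --batch-size, --learning-rate, --model-dir) are assumptions for illustration only; check train.py for the options it actually defines.

```python
import argparse


def build_parser():
    """Build an illustrative parser; every flag and default here is hypothetical."""
    parser = argparse.ArgumentParser(
        description="Train an FCN on the Kitti Road dataset (illustrative flags)"
    )
    parser.add_argument("--epochs", type=int, default=30,
                        help="number of passes over the training set")
    parser.add_argument("--batch-size", type=int, default=8,
                        help="images per training step")
    parser.add_argument("--learning-rate", type=float, default=1e-4,
                        help="optimizer step size")
    parser.add_argument("--model-dir", type=str, default="models",
                        help="directory where the trained model is saved")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--epochs", "50", "--model-dir", "models/fcn"])
print(args.epochs, args.batch_size, args.learning_rate, args.model_dir)
```

With flags like these, a run would look like "python train.py --epochs 50 --model-dir models/fcn"; unspecified options fall back to their defaults.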
You can use eval.py to evaluate the performance of the trained model. eval.py calculates the mean-iou for the road pixels and prints it.
A pretrained FCN model is saved in fcn.zip.
During training, the loss and the pixelwise classification accuracy were monitored using TensorBoard.
The figure below shows the ground truth road pixels and the road pixels predicted by the trained model.
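mean-iou (mean intersection over union) is the standard metric for segmentation. For each class, it computes the ratio between the intersection and the union of two sets, the pixels predicted as that class and the pixels labeled as that class, and then averages the ratio over the classes.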
Using tf.metrics.mean_iou, the mean_iou of the trained model is evaluated to be 0.944.
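To make the metric concrete, here is a plain-Python sketch of the computation on flattened label lists. It is not the code used in eval.py and not the TensorFlow implementation. It uses the standard per-class definition IoU = TP / (TP + FP + FN). Skipping classes that appear in neither the labels nor the predictions is an assumption of this sketch, and tf.metrics.mean_iou may treat that case differently.

```python
def mean_iou(labels, predictions, num_classes):
    """Mean intersection over union for two equal-length lists of class ids."""
    # confusion[t][p] counts pixels whose true class is t and predicted class is p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for true_class, pred_class in zip(labels, predictions):
        confusion[true_class][pred_class] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                                  # true c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp   # predicted c, true otherwise
        union = tp + fp + fn
        if union > 0:  # assumption: ignore classes absent from both lists
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with two classes: 0 = background, 1 = road.
labels = [0, 0, 1, 1, 1, 0]
predictions = [0, 1, 1, 1, 0, 0]
print(mean_iou(labels, predictions, num_classes=2))  # 0.5
```

In the toy example each class has 2 correctly labeled pixels and a union of 4, so both per-class ratios are 0.5 and so is their mean.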