itijyou / ademxapp

Code for https://arxiv.org/abs/1611.10080

Training semantic segmentation on greyscale images

ChristianEschen opened this issue · comments

I am using the following command to train the network:
python issegm/voc.py --gpus 1 --split train --data-root ${New_database} --output output --model ${New_database}_rna-a1_cls${Number_of_classes} --batch-images 4 --crop-size 500 --origin-size 512 --scale-rate-range 0.7,1.3 --weights models/ilsvrc-cls_rna-a1_cls1000_ep-0001.params --lr-type fixed --base-lr 0.0016 --to-epoch 140 --kvstore local --prefetch-threads 4 --prefetcher thread --backward-do-mirror

I have also edited the split files: prepare split files and save them into issegm/data/${New_database};

I have also edited voc.py:

elif dataset == 'New_database':
    num_classes = model_specs.get('classes', 2)
    valid_labels = range(num_classes)
    #
    max_shape = np.array((512, 512))

Unfortunately, I receive the following error:
issegm/voc.py:356: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 32 but corresponding boolean dimension is 512
pred_label = pred.argmax(1).ravel()[valid_flag]
Traceback (most recent call last):
File "issegm/voc.py", line 729, in
_train_impl(args, model_specs, logger)
File "issegm/voc.py", line 482, in _train_impl
num_epoch=args.stop_epoch,
File "/home/s123656/mxnet/python/mxnet/module/base_module.py", line 412, in fit
self.update_metric(eval_metric, data_batch.label)
File "/home/s123656/mxnet/python/mxnet/module/module.py", line 556, in update_metric
self._exec_group.update_metric(eval_metric, labels)
File "/home/s123656/mxnet/python/mxnet/module/executor_group.py", line 470, in update_metric
eval_metric.update(labels_slice, texec.outputs)
File "/home/s123656/mxnet/python/mxnet/metric.py", line 395, in update
reval = self._feval(label, pred)
File "issegm/voc.py", line 356, in _eval_func
pred_label = pred.argmax(1).ravel()[valid_flag]
IndexError: index 33 is out of bounds for axis 1 with size 32

Does this error have to do with the fact that I use grayscale images?
My label images only contain the binary labels {1, 2}.
How do you define the label images?

As a comment: I have tried replicating my input images to (512, 512, 3); however, this gives a dimension mismatch error: "ValueError: could not broadcast input array from shape (512,512,3) into shape (512,512)"

So in the label files there are only 1s and 2s? In that case, set valid_labels = [1, 2].
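For reference, a minimal NumPy sketch of how a valid_labels list is typically turned into a pixel mask during evaluation (make_valid_flag is a hypothetical helper for illustration; the actual code in issegm/voc.py may differ):

```python
import numpy as np

# Hypothetical helper: only pixels whose ground-truth value appears in
# valid_labels are kept for scoring; everything else is ignored.
def make_valid_flag(gt_label, valid_labels):
    gt = np.asarray(gt_label).ravel()
    flag = np.zeros(gt.shape, dtype=bool)
    for v in valid_labels:
        flag |= (gt == v)
    return flag

gt = np.array([[0, 1],
               [2, 255]])
print(make_valid_flag(gt, [1, 2]))  # → [False  True  True False]
```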

As for the error, it seems the ground-truth label is longer than the predicted label. Have you changed feat_stride/label_stride?
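To illustrate the mismatch (with an assumed stride of 16, which matches the 512-vs-32 sizes in the warning above; the real strides come from the repo's model specs): the network predicts on a grid that is feat_stride times coarser than the input, so the ground-truth label must be downsampled by the same factor before the two are compared:

```python
import numpy as np

# Assumed values for illustration only. A 512-wide label next to a
# 32-wide prediction (stride 16) reproduces the boolean-index mismatch.
feat_stride = 16
crop = 512
pred_width = crop // feat_stride        # 32 prediction columns

label_row = np.arange(crop)             # stand-in 1-D ground-truth row
label_ds = label_row[::feat_stride]     # downsample to the prediction grid
print(label_ds.shape[0] == pred_width)  # → True
```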

Still no success. The images are grayscale, 512x512, and uint16, both input images and label images. Is the code compatible with grayscale, or only RGB? I have not changed feat_stride or label_stride.

I found out that the label images were scaled to [0, 255] during preprocessing in MATLAB.
By changing num_classes from 2 to 3 I managed to train the network:

elif dataset == 'New_database':
    num_classes = model_specs.get('classes', 3)
    valid_labels = [0, 255]  # range(num_classes)

Furthermore, I changed --model ${New_database}_rna-a1_cls${Number_of_classes} to use 3 classes instead of 2.
Now the network reports a validation error of 0.999.
I suspect something is still wrong.

As far as I can tell, the code works fine with grayscale images as input data.

The ground-truth label images should contain integers in [0, 255].
But do not use 255 for any of the target categories being predicted, because 255 denotes ignored pixels.
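Following that convention, the raw {1, 2} labels in this thread could be remapped into class ids {0, 1}, reserving 255 for ignored pixels. A sketch (remap_labels is a hypothetical helper, not part of the repo):

```python
import numpy as np

# Remap raw label pixels {1, 2} to class ids {0, 1}; any other value
# becomes 255, the ignore label. Helper name is illustrative only.
def remap_labels(raw):
    raw = np.asarray(raw)
    out = np.full(raw.shape, 255, dtype=np.uint8)  # default: ignored
    out[raw == 1] = 0  # first class
    out[raw == 2] = 1  # second class
    return out

raw = np.array([[1, 2], [2, 7]], dtype=np.uint8)
remap_labels(raw)  # class ids 0/1; the stray value 7 maps to 255 (ignored)
```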

Besides, it doesn't make sense to let the network predict three classes given there are only two kinds of pixels.

"Besides, it doesn't make sense to let the network predict three classes given there are only two kinds of pixels."
I know. I just noticed that I did not get any error using 3 classes instead of 2. Maybe the code does not accept only 2 classes?
Still no success with only 2 input labels.

I got this error message:
pred_label = pred.argmax(1).ravel()[valid_flag]
Traceback (most recent call last):
File "issegm/voc.py", line 732, in
_train_impl(args, model_specs, logger)
File "issegm/voc.py", line 485, in _train_impl
num_epoch=args.stop_epoch,
File "/home/s123656/mxnet/python/mxnet/module/base_module.py", line 412, in fit
self.update_metric(eval_metric, data_batch.label)
File "/home/s123656/mxnet/python/mxnet/module/module.py", line 556, in update_metric
self._exec_group.update_metric(eval_metric, labels)
File "/home/s123656/mxnet/python/mxnet/module/executor_group.py", line 470, in update_metric
eval_metric.update(labels_slice, texec.outputs)
File "/home/s123656/mxnet/python/mxnet/metric.py", line 395, in update
reval = self._feval(label, pred)
File "issegm/voc.py", line 359, in _eval_func
pred_label = pred.argmax(1).ravel()[valid_flag]
IndexError: index 130 is out of bounds for axis 1 with size 130

@ChristianEschen Exactly, the current custom metric function does not work for two-way classification.

Thank you for the response.
I have now introduced a synthetic "ignore label" category with pixel intensity 255, which I know a priori can be ignored. Now I can train the network.
I train the segmentation network from scratch, with no initialized weights and a learning rate of 0.0016.
However, the network almost immediately reports: train-fcn_valid=0.999167
Is this the training accuracy?
If so, I guess the network overfits?

What should the custom metric function look like for two-way classification?
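For what it's worth, here is a hedged sketch of a per-pixel accuracy metric that handles any number of classes, including two. The conventions mirror the thread (labels as (N, H, W) integers, predictions as (N, C, H, W) scores, 255 as the ignore label), but the exact _eval_func signature in issegm/voc.py may differ:

```python
import numpy as np

# Pixels labelled ignore_label (or outside [0, num_classes)) are masked out,
# so ground-truth values can never index past the prediction channels.
def pixel_accuracy(label, pred, num_classes=2, ignore_label=255):
    gt = np.asarray(label).ravel().astype(np.int64)
    pred_label = np.asarray(pred).argmax(1).ravel()
    valid = (gt != ignore_label) & (gt < num_classes)
    correct = (pred_label[valid] == gt[valid]).sum()
    return correct / max(valid.sum(), 1)
```

The key point is that both arrays are flattened to the same length before masking; if the prediction grid is coarser than the label (the feat_stride/label_stride issue mentioned earlier), the label must be downsampled first.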