hq-jiang / instance-segmentation-with-discriminative-loss-tensorflow

Tensorflow implementation of "Semantic Instance Segmentation with a Discriminative Loss Function"


pretrained model for semantic segmentation

tmquan opened this issue · comments

commented

Hi @hq-jiang,

Nice work on the discriminative loss.
I am trying to use your code on other data such as Cityscapes or CVPPP (leaf segmentation), as discussed in the original paper. Could you release the code and model for semantic segmentation as well? I would like to train from scratch on those datasets, so a separate training step for the semantic mask is required.

Best,

commented

Hi @hq-jiang
I would like to revisit this issue. I have successfully loaded your pretrained semantic segmentation model and continued training with the discriminative loss. However, when I visualize the results on TensorBoard they are quite messy, and I am totally lost.

May I ask where the semantic prediction is placed in ENet: before or after the high-dimensional feature (prediction) in your implementation?
In other words, what do last_prelu and prediction stand for? Do they correspond to the semantic segmentation (12 classes) and the high-dimensional feature (3 in your case), respectively?

For the clustering, is it necessary to mask out the high-dimensional feature map with the semantic prediction before passing it to the mean-shift clustering algorithm, whose single bandwidth parameter can be tuned later?

In your opinion, is it a good idea to train the semantic segmentation simultaneously with the high-dimensional prediction under the discriminative loss?

Thanks,

Hi @tmquan,

I have uploaded my raw semantic segmentation code for reference. You might need to change things to make it work; I am a bit busy right now, and I would need to set up a new AWS instance to test it.

Regarding your questions:

  1. The last_prelu is the name defined by enet.py. It refers to the output of bottleneck5.1 (see image).
    prediction, which has the name scope Instance/transfer_layer/conv2d_transpose, replaces fullconv (see image). So you are right: last_prelu should have dimension 12 in your case, and prediction carries the 3 high-dimensional features. Attention: my implementation works for a binary-class problem (lane marking or no lane marking). For a multi-class, multi-instance problem you might need to adapt my implementation:

In contrast to the CVPPP dataset, Cityscapes is a multi-class instance segmentation challenge. Therefore, we run our loss function independently on every semantic class, so that instances belonging to the same class are far apart in feature space, whereas instances from different classes can occupy the same space. For example, the cluster centers of a pedestrian and a car that appear in the same image are not pushed away from each other.
(quoted from "Semantic Instance Segmentation with a Discriminative Loss Function")

[image: ENet architecture diagram]

  2. The bandwidth parameter can be tuned once you have finished training.

  3. The authors of Fast Scene Understanding for Autonomous Driving trained 3 networks simultaneously and found that it improves overall accuracy.
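To illustrate the per-class strategy quoted above, here is a rough NumPy sketch (not the repo's TensorFlow code; `discriminative_loss_per_class`, `delta_v`, and `delta_d` are illustrative names, and the pair normalization differs slightly from the paper's convention): the variance and distance terms are computed independently for each semantic class, so only instance means of the same class repel each other.

```python
import numpy as np

def discriminative_loss_per_class(embeddings, sem_labels, inst_labels,
                                  delta_v=0.5, delta_d=1.5):
    """Toy sketch of the discriminative loss run independently per
    semantic class. embeddings: (N, D); sem_labels, inst_labels: (N,)."""
    total = 0.0
    for c in np.unique(sem_labels):
        mask = sem_labels == c
        emb, inst = embeddings[mask], inst_labels[mask]
        inst_ids = np.unique(inst)
        means = [emb[inst == i].mean(axis=0) for i in inst_ids]
        # variance term: pull pixels toward their own instance mean
        l_var = 0.0
        for mu, i in zip(means, inst_ids):
            d = np.linalg.norm(emb[inst == i] - mu, axis=1)
            l_var += np.mean(np.maximum(d - delta_v, 0.0) ** 2)
        l_var /= len(means)
        # distance term: push instance means of the SAME class apart;
        # means of different classes never interact
        l_dist = 0.0
        if len(means) > 1:
            for a in range(len(means)):
                for b in range(a + 1, len(means)):
                    d = np.linalg.norm(means[a] - means[b])
                    l_dist += np.maximum(2 * delta_d - d, 0.0) ** 2
            l_dist /= len(means) * (len(means) - 1) / 2
        total += l_var + l_dist
    return total
```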

commented

Sorry to keep bothering you.

May I recap one last question: for the clustering, is it necessary to mask out the high-dimensional feature map with the semantic prediction before passing it to the mean-shift clustering algorithm?

Thank you very much.

Ah, sorry, I missed that part. Yes, you are right: you need to seed your instances with the semantic segmentation. In my case that was not necessary, because the biggest instance is always the background.
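For what it's worth, the masking step can be sketched in plain NumPy as follows (in practice a library routine such as sklearn.cluster.MeanShift would be used; `mean_shift` and `cluster_instances` here are made-up illustrative helpers): the semantic prediction selects the foreground pixels, and only their embeddings are passed to clustering.

```python
import numpy as np

def mean_shift(points, bandwidth, n_iter=30):
    """Minimal flat-kernel mean shift for illustration only.
    points: (N, D) embeddings of foreground pixels."""
    modes = points.copy()
    for _ in range(n_iter):
        for k in range(len(modes)):
            near = points[np.linalg.norm(points - modes[k], axis=1) < bandwidth]
            if len(near):
                modes[k] = near.mean(axis=0)
    # merge modes that converged to (almost) the same point
    labels = -np.ones(len(points), dtype=int)
    centers = []
    for k, m in enumerate(modes):
        for ci, c in enumerate(centers):
            if np.linalg.norm(m - c) < bandwidth / 2:
                labels[k] = ci
                break
        else:
            centers.append(m)
            labels[k] = len(centers) - 1
    return labels

def cluster_instances(features, sem_pred, fg_class, bandwidth):
    """Mask the high-dimensional features with the semantic prediction,
    then cluster only the foreground embeddings.
    features: (H, W, D); sem_pred: (H, W) argmax of the semantic head."""
    fg = sem_pred == fg_class
    labels = mean_shift(features[fg], bandwidth)
    inst = np.full(sem_pred.shape, -1)   # background stays -1
    inst[fg] = labels
    return inst
```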