[deeplab] What are the parameters of the MobileNetV3 pretrained model?
mmxuan18 opened this issue · comments
Thank you for your post. We noticed you have not filled out the following fields in the issue template. Could you update them if they are relevant in your case, or leave them as N/A? Thanks.
What is the top-level directory of the model you are using
Have I written custom code
OS Platform and Distribution
TensorFlow installed from
TensorFlow version
Bazel version
CUDA/cuDNN version
GPU model and memory
Exact command to reproduce
Sorry for the late response. The following command-line flags need to be set for v3 models, in addition to model_variant, which should be "mobilenet_v3_large_seg" for the large v3 model and "mobilenet_v3_small_seg" for the small one:
--image_pooling_crop_size=769,769
--image_pooling_stride=4,5
--aspp_convs_filters=128
--aspp_with_concat_projection=0
--aspp_with_squeeze_and_excitation=1
--decoder_use_sum_merge=1
--decoder_filters=19
--decoder_output_is_logits=1
--image_se_uses_qsigmoid=1
--image_pyramid=1
--decoder_output_stride=8
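Since the same flags have to be passed to every script, one way to avoid dropping one is to keep them in a single place and build each command from it. This is only a minimal sketch: V3_SEG_FLAGS and build_command are hypothetical names used for illustration, not part of DeepLab, and the flag values are simply the ones listed above.

```python
# Hypothetical helper, not part of DeepLab: keeps the v3 flags listed above
# in one dict so train.py and eval.py receive the same model settings.
V3_SEG_FLAGS = {
    "image_pooling_crop_size": "769,769",
    "image_pooling_stride": "4,5",
    "aspp_convs_filters": "128",
    "aspp_with_concat_projection": "0",
    "aspp_with_squeeze_and_excitation": "1",
    "decoder_use_sum_merge": "1",
    "decoder_filters": "19",
    "decoder_output_is_logits": "1",
    "image_se_uses_qsigmoid": "1",
    "image_pyramid": "1",
    "decoder_output_stride": "8",
}


def build_command(script, model_variant, extra_flags=None):
    """Return an argv list: python <script> --model_variant=... plus shared flags."""
    flags = {"model_variant": model_variant}
    flags.update(V3_SEG_FLAGS)
    flags.update(extra_flags or {})  # script-specific flags go last
    return ["python", script] + ["--{}={}".format(k, v) for k, v in flags.items()]


train_cmd = build_command(
    "train.py", "mobilenet_v3_small_seg", {"train_crop_size": "769,769"}
)
eval_cmd = build_command("eval.py", "mobilenet_v3_small_seg")

print(" ".join(train_cmd))
print(" ".join(eval_cmd))
```

The resulting lists can be passed to subprocess.run as-is. Dataset and directory flags are left out here and would go in extra_flags.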
I was able to train with these parameters:
python train.py --dataset cityscapes --dataset_dir ../../datasets/cityscapes/tfrecord --train_logdir checkpoints/2020-02-05-cityscapes-mobilenet_v3_small_seg --train_split train --model_variant mobilenet_v3_small_seg --train_crop_size=769,769 --training_number_of_steps=180000 --train_split=train --decoder_output_stride=8 --image_pyramid=1 --image_se_uses_qsigmoid=1 --decoder_output_is_logits=1 --decoder_filters=19 --decoder_use_sum_merge=1 --aspp_with_squeeze_and_excitation=1 --aspp_with_concat_projection=0 --aspp_convs_filters=128 --image_pooling_stride=4,5 --image_pooling_crop_size=769,769
but I cannot evaluate the trained model using
python eval.py --checkpoint_dir checkpoints/2020-02-05-cityscapes-mobilenet_v3_small_seg --dataset cityscapes --dataset_dir ../../datasets/cityscapes/tfrecord --model_variant mobilenet_v3_small_seg --decoder_output_stride=8 --image_pyramid=1 --image_se_uses_qsigmoid=1 --decoder_output_is_logits=1 --decoder_filters=19 --decoder_use_sum_merge=1 --aspp_with_squeeze_and_excitation=1 --aspp_with_concat_projection=0 --aspp_convs_filters=128 --image_pooling_stride=4,5 --batch_size 1 --output_stride=8 --eval_crop_size="769,769"
I get the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: padded_shape[0]=137 is not divisible by block_shape[0]=2
[[node MobilenetV3/expanded_conv_4/depthwise/depthwise/SpaceToBatchND (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
Edit:
This seems to be solved by using some of the dataset flags from train.py that are missing in eval.py: min_scale_factor, max_scale_factor, scale_factor_step_size.
Better yet, set eval_crop_size to 1025,2049 (height,width, large enough to cover a full 1024x2048 Cityscapes image) as per https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/faq.md
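As I read the linked FAQ, the evaluation crop should be of the form output_stride * k + 1 and at least as large as the largest image in the dataset. A minimal sketch of that arithmetic follows. The helper name smallest_valid_crop is hypothetical, and the 1024x2048 (height x width) image size is the standard Cityscapes resolution rather than something stated in this thread.

```python
def smallest_valid_crop(dim, output_stride):
    """Smallest value of the form output_stride * k + 1 that is >= dim."""
    # Integer ceiling of (dim - 1) / output_stride.
    k = (dim - 1 + output_stride - 1) // output_stride
    return output_stride * k + 1


# Cityscapes images are 1024 pixels high and 2048 pixels wide.
height_crop = smallest_valid_crop(1024, 8)
width_crop = smallest_valid_crop(2048, 8)
print(height_crop, width_crop)

# 769 already has the required form, so it is returned unchanged; it is
# simply smaller than a full Cityscapes image.
print(smallest_valid_crop(769, 8))
```

With output_stride=8 this gives a crop of 1025 for the 1024-pixel side and 2049 for the 2048-pixel side, which matches the values the FAQ recommends for Cityscapes.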
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.
Closing as stale. Please reopen if you'd like to work on this further.