CasiaFan / SSD_EfficientNet

SSD using TensorFlow object detection API with EfficientNet backbone

Got min_feature_level error when training the model

PythonImageDeveloper opened this issue

Hi @CasiaFan @wulungching,
My TensorFlow version is 1.14.0 (binary).
1 - I modified model_builder.py as follows:

SSD_FEATURE_EXTRACTOR_CLASS_MAP = {
    'ssd_efficientnet': SSDEfficientNetFeatureExtractor,
    'ssd_efficientnet_fpn': SSDEfficientNetFPNFeatureExtractor,
......

2 - I put efficientnet.py and efficient_feature_extractor.py under the object_detection/models directory.
3 - I replaced your ssd.proto with the original ssd.proto.
4.1 - When I ran protoc object_detection/protos/ssd.proto --python_out=. I got this output:

object_detection/protos/ssd.proto:164:3: Expected "required", "optional", or "repeated".
object_detection/protos/ssd.proto:164:12: Expected field name.

4.2 - When I ran ./bin/protoc object_detection/protos/*.proto --python_out=. it completed without any output.

5 - I modified the original ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync.config according to your modifications in https://github.com/CasiaFan/SSD_EfficientNet/issues/2
6 - When I ran python3 model_main.py --alsologtostderr, I got this error:

    self._MergeField(tokenizer, sub_message)
  File "/home/mm/.venv/lib/python3.5/site-packages/google/protobuf/text_format.py", line 730, in _MergeField
    (message_descriptor.full_name, name))
google.protobuf.text_format.ParseError: 12:7 : Message type "object_detection.protos.SsdFeatureExtractor" has no field named "network_version".

Then I commented out network_version in the config file and ran python3 model_main.py --alsologtostderr again, and got this error:

  File "/home/mm/.venv/lib/python3.5/site-packages/google/protobuf/text_format.py", line 730, in _MergeField
    (message_descriptor.full_name, name))
google.protobuf.text_format.ParseError: 13:7 : Message type "object_detection.protos.SsdFeatureExtractor" has no field named "min_feature_level".

When I ran python object_detection/builders/model_builder_test.py as a test, it passed without any errors.
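
Both ParseError messages above say that the compiled SsdFeatureExtractor message lacks a field that the config file uses. One way to see which fields the generated Python module actually knows about is to inspect the message descriptor. This is only a minimal sketch: the helper name missing_fields is hypothetical, and it assumes nothing beyond the standard protobuf Python API, in which every generated message class exposes DESCRIPTOR.fields_by_name.

```python
def missing_fields(message_cls, required_names):
    """Return the names in required_names that message_cls does not define.

    message_cls is expected to be a protobuf-generated message class; such
    classes expose DESCRIPTOR.fields_by_name, a mapping from field name to
    field descriptor.
    """
    known = message_cls.DESCRIPTOR.fields_by_name
    return [name for name in required_names if name not in known]


# Hypothetical usage, with tensorflow/models/research on PYTHONPATH:
#
#   from object_detection.protos import ssd_pb2
#   print(missing_fields(ssd_pb2.SsdFeatureExtractor,
#                        ["network_version", "min_feature_level"]))
#
# An empty list would mean the regenerated ssd_pb2.py already contains both
# fields; any name printed is still missing from the compiled proto.
```

If either name is reported as missing, the ssd_pb2.py being imported was most likely generated from an ssd.proto that does not declare it.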

I think your protoc version is 2.6. You could follow this.

@PythonImageDeveloper My protoc version is 3.5.1. Note protoc 2 and protoc 3 differ a lot.

@wulungching @CasiaFan
I followed the instructions below. In your opinion, should I change the protoc version to 3.5.1?

Manual protobuf-compiler installation and usage
If you are on Linux:

Download and install the 3.0 release of protoc, then unzip the file.

# From tensorflow/models/research/
wget -O protobuf.zip https://github.com/google/protobuf/releases/download/v3.0.0/protoc-3.0.0-linux-x86_64.zip
unzip protobuf.zip
Run the compilation process again, but use the downloaded version of protoc

# From tensorflow/models/research/
./bin/protoc object_detection/protos/*.proto --python_out=.

@PythonImageDeveloper Yes, protoc 3.5.1 could do it as well.
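
Since the thread turns on whether protoc 2.x or 3.x is in use, a quick check of the installed compiler can help. The sketch below parses the text that protoc --version prints (for example "libprotoc 3.5.1"). The function names are hypothetical, and the "major version at least 3" rule is just the requirement stated in this thread.

```python
import re


def parse_protoc_version(version_output):
    """Parse the text printed by `protoc --version` into a tuple of ints.

    Example input: "libprotoc 3.5.1" gives (3, 5, 1).
    """
    match = re.search(r"(\d+(?:\.\d+)*)", version_output)
    if match is None:
        raise ValueError("no version number found in %r" % version_output)
    return tuple(int(part) for part in match.group(1).split("."))


def is_protoc_3_or_newer(version_output):
    """Return True when the reported protoc major version is at least 3."""
    return parse_protoc_version(version_output)[0] >= 3


print(is_protoc_3_or_newer("libprotoc 3.5.1"))  # True
print(is_protoc_3_or_newer("libprotoc 2.6.1"))  # False
```

To check a real installation, pass in the output of the compiler you actually invoke, for example subprocess.check_output(["./bin/protoc", "--version"]).decode(), because a system-wide protoc and the downloaded ./bin/protoc may differ.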

@CasiaFan
Why do you use TensorFlow 1.4? Does it work with a newer version?
Should I git clone the 1.4 branch of the TensorFlow models repository, or the current master branch?
Is it possible to extend this configuration to ssdlite_efficientnet?

@PythonImageDeveloper TF 1.4 is the latest stable version, and I installed it with pip. Since the only difference between ssdlite and vanilla ssd is the prediction head, putting the ssdlite predictor configuration into the config file should also work.

@CasiaFan
The latest stable version is 1.14.
Tensorflow Versions
If I base ssdlite_efficientnet.config on ssdlite_mobilenet_v2_coco.config, should that work?
Also, is it possible to use a pre-trained EfficientNet for the ssd_efficientnet config here? If so, how?
And does it work well with protoc version 3.6.1?

@CasiaFan
My framework versions:
tensorflow: 1.14.0
cuda : 10.0
protoc : 3.5.1

I followed your commands and changed the files. When I ran python3 model_main.py --alsologtostderr in the object_detection directory, I got this error:

 File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/keras/layers/convolutional.py", line 192, in build
   self.rank + 2))
 File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py", line 1050, in __init__
   filter_shape[num_spatial_dims]))
ValueError: number of input channels does not match corresponding dimension of filter, 8 != 32


@PythonImageDeveloper Any protoc 3.x version should meet our prerequisite. If you have pre-trained weights for the EfficientNet backbone, modify these lines in your config file:

fine_tune_checkpoint: "/path/to/pretrained_ckpt"
from_detection_checkpoint: false

But if you have complete pre-trained detection weights, including the prediction head, set from_detection_checkpoint to true.
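
The two settings above can also be applied to a pipeline config programmatically. This is a minimal sketch that treats the config as plain text and rewrites the two fields with regular expressions. The function name set_fine_tune_checkpoint is hypothetical, and the sketch assumes both fields already appear in the file in the form shown above.

```python
import re


def set_fine_tune_checkpoint(config_text, checkpoint_path, from_detection):
    """Return config_text with the two fine-tuning fields rewritten.

    Only fields that already appear in the text are changed; nothing is
    added if a field is absent.
    """
    flag = "true" if from_detection else "false"
    # Replace the quoted checkpoint path, keeping the original spacing.
    config_text = re.sub(
        r'(fine_tune_checkpoint:\s*)"[^"]*"',
        lambda m: '%s"%s"' % (m.group(1), checkpoint_path),
        config_text,
    )
    # Replace the boolean flag, keeping the original spacing.
    config_text = re.sub(
        r"(from_detection_checkpoint:\s*)(?:true|false)",
        lambda m: m.group(1) + flag,
        config_text,
    )
    return config_text


sample = (
    'fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"\n'
    "from_detection_checkpoint: true\n"
)
# Backbone-only weights: point at the checkpoint and disable the flag.
print(set_fine_tune_checkpoint(sample, "/path/to/pretrained_ckpt", False))
```

Pass from_detection=True instead when the checkpoint holds a full detection model, including the prediction head.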

As for your issue, it seems to be caused by a mismatch between the number of input feature channels and the filter dimension. Could you provide a more detailed log, including the traceback lines that show where in our custom script the error is raised?

@CasiaFan

(.venv) mm@mm:~/API-TF2/models/research/object_detection$ python3 model_main.py --alsologtostderr
WARNING: Logging before flag parsing goes to stderr.
W0718 18:36:15.213086 140712742450944 lazy_loader.py:50] 
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

W0718 18:36:15.229685 140712742450944 deprecation_wrapper.py:119] From /home/mm/API-TF/models/research/slim/nets/inception_resnet_v2.py:373: The name tf.GraphKeys is deprecated. Please use tf.compat.v1.GraphKeys instead.

W0718 18:36:15.235659 140712742450944 deprecation_wrapper.py:119] From /home/mm/API-TF/models/research/slim/nets/mobilenet/mobilenet.py:397: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.

W0718 18:36:15.245254 140712742450944 deprecation_wrapper.py:119] From model_main.py:116: The name tf.app.run is deprecated. Please use tf.compat.v1.app.run instead.

/home/mm/.venv/lib/python3.5/site-packages/absl/flags/_validators.py:358: UserWarning: Flag --model_dir has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
  'command line!' % flag_name)
/home/mm/.venv/lib/python3.5/site-packages/absl/flags/_validators.py:358: UserWarning: Flag --pipeline_config_path has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
  'command line!' % flag_name)
W0718 18:36:15.245936 140712742450944 deprecation_wrapper.py:119] From /home/mm/API-TF/models/research/object_detection/utils/config_util.py:96: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

W0718 18:36:15.248813 140712742450944 deprecation_wrapper.py:119] From /home/mm/API-TF/models/research/object_detection/model_lib.py:597: The name tf.logging.warning is deprecated. Please use tf.compat.v1.logging.warning instead.

W0718 18:36:15.248920 140712742450944 model_lib.py:598] Forced number of epochs for all eval validations to be 1.
W0718 18:36:15.249017 140712742450944 deprecation_wrapper.py:119] From /home/mm/API-TF/models/research/object_detection/utils/config_util.py:482: The name tf.logging.info is deprecated. Please use tf.compat.v1.logging.info instead.

I0718 18:36:15.249068 140712742450944 config_util.py:482] Maybe overwriting use_bfloat16: False
I0718 18:36:15.249139 140712742450944 config_util.py:482] Maybe overwriting eval_num_epochs: 1
I0718 18:36:15.249200 140712742450944 config_util.py:482] Maybe overwriting sample_1_of_n_eval_examples: 1
I0718 18:36:15.249252 140712742450944 config_util.py:482] Maybe overwriting load_pretrained: True
I0718 18:36:15.249303 140712742450944 config_util.py:492] Ignoring config override key: load_pretrained
I0718 18:36:15.249358 140712742450944 config_util.py:482] Maybe overwriting train_steps: 200
W0718 18:36:15.249432 140712742450944 model_lib.py:614] Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
I0718 18:36:15.249499 140712742450944 model_lib.py:649] create_estimator_and_inputs: use_tpu False, export_to_tpu False
I0718 18:36:15.249855 140712742450944 estimator.py:209] Using config: {'_master': '', '_log_step_count_steps': 100, '_global_id_in_cluster': 0, '_experimental_distribute': None, '_device_fn': None, '_task_type': 'worker', '_is_chief': True, '_save_summary_steps': 100, '_save_checkpoints_secs': 600, '_protocol': None, '_train_distribute': None, '_keep_checkpoint_every_n_hours': 10000, '_service': None, '_evaluation_master': '', '_save_checkpoints_steps': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_model_dir': './train_ssd_effecientnet', '_keep_checkpoint_max': 5, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7ff9c6d16f60>, '_tf_random_seed': None, '_experimental_max_worker_delay_secs': None, '_eval_distribute': None, '_task_id': 0, '_num_worker_replicas': 1, '_num_ps_replicas': 0}
W0718 18:36:15.250040 140712742450944 model_fn.py:630] Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7ff9c6d36f28>) includes params argument, but params are not passed to Estimator.
I0718 18:36:15.250505 140712742450944 estimator_training.py:186] Not using Distribute Coordinator.
I0718 18:36:15.250636 140712742450944 training.py:612] Running training and evaluation locally (non-distributed).
I0718 18:36:15.250807 140712742450944 training.py:700] Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
W0718 18:36:15.255248 140712742450944 deprecation.py:323] From /home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
W0718 18:36:15.262588 140712742450944 deprecation_wrapper.py:119] From /home/mm/API-TF/models/research/object_detection/data_decoders/tf_example_decoder.py:170: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.

W0718 18:36:15.262764 140712742450944 deprecation_wrapper.py:119] From /home/mm/API-TF/models/research/object_detection/data_decoders/tf_example_decoder.py:185: The name tf.VarLenFeature is deprecated. Please use tf.io.VarLenFeature instead.

W0718 18:36:15.272913 140712742450944 deprecation_wrapper.py:119] From /home/mm/API-TF/models/research/object_detection/builders/dataset_builder.py:61: The name tf.gfile.Glob is deprecated. Please use tf.io.gfile.glob instead.

W0718 18:36:15.273869 140712742450944 dataset_builder.py:66] num_readers has been reduced to 1 to match input file shards.
W0718 18:36:15.278006 140712742450944 deprecation.py:323] From /home/mm/API-TF/models/research/object_detection/builders/dataset_builder.py:80: parallel_interleave (from tensorflow.contrib.data.python.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.experimental.parallel_interleave(...)`.
W0718 18:36:15.278145 140712742450944 deprecation.py:323] From /home/mm/.venv/lib/python3.5/site-packages/tensorflow/contrib/data/python/ops/interleave_ops.py:77: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
W0718 18:36:15.296915 140712742450944 deprecation.py:323] From /home/mm/API-TF/models/research/object_detection/builders/dataset_builder.py:149: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
W0718 18:36:15.435589 140712742450944 deprecation_wrapper.py:119] From /home/mm/API-TF/models/research/object_detection/utils/ops.py:472: The name tf.is_nan is deprecated. Please use tf.math.is_nan instead.

W0718 18:36:15.436193 140712742450944 deprecation.py:323] From /home/mm/API-TF/models/research/object_detection/utils/ops.py:472: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0718 18:36:15.438794 140712742450944 deprecation.py:323] From /home/mm/API-TF/models/research/object_detection/utils/ops.py:474: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W0718 18:36:15.472958 140712742450944 deprecation.py:323] From /home/mm/API-TF/models/research/object_detection/inputs.py:320: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0718 18:36:15.475012 140712742450944 deprecation_wrapper.py:119] From /home/mm/API-TF/models/research/object_detection/core/preprocessor.py:512: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

W0718 18:36:15.516844 140712742450944 deprecation.py:323] From /home/mm/API-TF/models/research/object_detection/core/preprocessor.py:188: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
`seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
W0718 18:36:15.518922 140712742450944 deprecation.py:506] From /home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/util/dispatch.py:180: calling squeeze (from tensorflow.python.ops.array_ops) with squeeze_dims is deprecated and will be removed in a future version.
Instructions for updating:
Use the `axis` argument instead
W0718 18:36:16.135120 140712742450944 deprecation_wrapper.py:119] From /home/mm/API-TF/models/research/object_detection/core/preprocessor.py:2421: The name tf.image.resize_images is deprecated. Please use tf.image.resize instead.

W0718 18:36:16.516294 140712742450944 deprecation.py:323] From /home/mm/API-TF/models/research/object_detection/builders/dataset_builder.py:152: batch_and_drop_remainder (from tensorflow.contrib.data.python.ops.batching) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.batch(..., drop_remainder=True)`.
I0718 18:36:16.525623 140712742450944 estimator.py:1145] Calling model_fn.
I0718 18:36:16.634358 140712742450944 efficientnet.py:635] global_params= GlobalParams(batch_norm_momentum=0.99, batch_norm_epsilon=0.001, dropout_rate=0.2, data_format='channels_last', num_classes=1000, width_coefficient=1.0, depth_coefficient=1.0, depth_divisor=8, min_depth=None, drop_connect_rate=0.2)
I0718 18:36:16.634685 140712742450944 efficientnet.py:636] blocks_args= [BlockArgs(kernel_size=3, num_repeat=1, input_filters=32, output_filters=16, expand_ratio=1, id_skip=True, strides=[1, 1], se_ratio=0.25), BlockArgs(kernel_size=3, num_repeat=2, input_filters=16, output_filters=24, expand_ratio=6, id_skip=True, strides=[2, 2], se_ratio=0.25), BlockArgs(kernel_size=5, num_repeat=2, input_filters=24, output_filters=40, expand_ratio=6, id_skip=True, strides=[2, 2], se_ratio=0.25), BlockArgs(kernel_size=3, num_repeat=3, input_filters=40, output_filters=80, expand_ratio=6, id_skip=True, strides=[2, 2], se_ratio=0.25), BlockArgs(kernel_size=5, num_repeat=3, input_filters=80, output_filters=112, expand_ratio=6, id_skip=True, strides=[1, 1], se_ratio=0.25), BlockArgs(kernel_size=5, num_repeat=4, input_filters=112, output_filters=192, expand_ratio=6, id_skip=True, strides=[2, 2], se_ratio=0.25), BlockArgs(kernel_size=3, num_repeat=1, input_filters=192, output_filters=320, expand_ratio=6, id_skip=True, strides=[1, 1], se_ratio=0.25)]
I0718 18:36:16.636757 140712742450944 efficientnet.py:128] round_filter input=32 output=32
I0718 18:36:16.636873 140712742450944 efficientnet.py:128] round_filter input=16 output=16
W0718 18:36:16.637104 140712742450944 deprecation.py:506] From /home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
I0718 18:36:16.640686 140712742450944 efficientnet.py:128] round_filter input=16 output=16
I0718 18:36:16.640795 140712742450944 efficientnet.py:128] round_filter input=24 output=24
I0718 18:36:16.647423 140712742450944 efficientnet.py:128] round_filter input=24 output=24
I0718 18:36:16.647552 140712742450944 efficientnet.py:128] round_filter input=40 output=40
I0718 18:36:16.654196 140712742450944 efficientnet.py:128] round_filter input=40 output=40
I0718 18:36:16.654305 140712742450944 efficientnet.py:128] round_filter input=80 output=80
I0718 18:36:16.663950 140712742450944 efficientnet.py:128] round_filter input=80 output=80
I0718 18:36:16.664065 140712742450944 efficientnet.py:128] round_filter input=112 output=112
I0718 18:36:16.674213 140712742450944 efficientnet.py:128] round_filter input=112 output=112
I0718 18:36:16.674326 140712742450944 efficientnet.py:128] round_filter input=192 output=192
I0718 18:36:16.687278 140712742450944 efficientnet.py:128] round_filter input=192 output=192
I0718 18:36:16.687392 140712742450944 efficientnet.py:128] round_filter input=320 output=320
I0718 18:36:16.690702 140712742450944 efficientnet.py:128] round_filter input=32 output=32
I0718 18:36:16.692492 140712742450944 efficientnet.py:128] round_filter input=1280 output=1280
I0718 18:36:16.735306 140712742450944 efficientnet.py:475] Built stem layers with output shape: (?, 150, 150, 32)
I0718 18:36:16.735529 140712742450944 efficientnet.py:490] block_0 drop_connect_rate: 0.0
I0718 18:36:16.735614 140712742450944 efficientnet.py:272] Block input: None/efficientnet-b0/stem/lambda/swish_f32:0 shape: (?, 150, 150, 32)
I0718 18:36:16.735670 140712742450944 efficientnet.py:277] Expand: None/efficientnet-b0/stem/lambda/swish_f32:0 shape: (?, 150, 150, 32)
I0718 18:36:16.768238 140712742450944 efficientnet.py:280] DWConv: None/efficientnet-b0/blocks_0/lambda_1/swish_f32:0 shape: (?, 150, 150, 32)
Traceback (most recent call last):
  File "model_main.py", line 116, in <module>
    tf.app.run()
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/mm/.venv/lib/python3.5/site-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/home/mm/.venv/lib/python3.5/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "model_main.py", line 112, in main
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow_estimator/python/estimator/training.py", line 473, in train_and_evaluate
    return executor.run()
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow_estimator/python/estimator/training.py", line 613, in run
    return self.run_local()
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow_estimator/python/estimator/training.py", line 714, in run_local
    saving_listeners=saving_listeners)
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 367, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1158, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1188, in _train_model_default
    features, labels, ModeKeys.TRAIN, self.config)
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1146, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/home/mm/API-TF/models/research/object_detection/model_lib.py", line 288, in model_fn
    features[fields.InputDataFields.true_image_shape])
  File "/home/mm/API-TF/models/research/object_detection/meta_architectures/ssd_meta_arch.py", line 559, in predict
    feature_maps = self._feature_extractor(preprocessed_inputs)
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/keras/engine/base_layer.py", line 591, in __call__
    self._maybe_build(inputs)
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1881, in _maybe_build
    self.build(input_shapes)
  File "/home/mm/API-TF/models/research/object_detection/builders/efficientnet_feature_extractor.py", line 278, in build
    model = build_model_base_keras_model(input_shape[1:], self._network_name, self._is_training)
  File "/home/mm/API-TF/models/research/object_detection/builders/efficientnet.py", line 735, in build_model_base_keras_model
    net = model.call_model(inputs, training=training, features_only=True)
  File "/home/mm/API-TF/models/research/object_detection/builders/efficientnet.py", line 491, in call_model
    outputs = block.call(outputs, training=training, output_layer_name='block_%s'%idx)
  File "/home/mm/API-TF/models/research/object_detection/builders/efficientnet.py", line 284, in call
    x = tf.keras.layers.Lambda(self._call_se)(x)
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/keras/engine/base_layer.py", line 634, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/keras/layers/core.py", line 785, in call
    return self.function(inputs, **arguments)
  File "/home/mm/API-TF/models/research/object_detection/builders/efficientnet.py", line 256, in _call_se
    se_tensor = self._se_expand(relu_fn(self._se_reduce(se_tensor)))
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/keras/engine/base_layer.py", line 591, in __call__
    self._maybe_build(inputs)
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1881, in _maybe_build
    self.build(input_shapes)
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/keras/layers/convolutional.py", line 192, in build
    self.rank + 2))
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py", line 1050, in __init__
    filter_shape[num_spatial_dims]))
ValueError: number of input channels does not match corresponding dimension of filter, 8 != 32

See #6. I think it's an internal bug in TF 1.14, where the Keras Conv2D operation behaves strangely in the SE expand operation of the SE block. You can roll back to TF 1.13.1 as a temporary fix.
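
The 8 and 32 in the error can be traced through the first block. The sketch below assumes the squeeze-and-excite sizing rule used by the reference EfficientNet implementation, reduced = max(1, int(input_filters * se_ratio)), and applies it to the block_0 BlockArgs values printed in the log above. The function name se_channel_counts is hypothetical.

```python
def se_channel_counts(input_filters, expand_ratio, se_ratio):
    """Return (expanded_channels, reduced_channels) for one MBConv block.

    expanded_channels: channels after the expansion conv, which the SE
        expand conv has to restore.
    reduced_channels: channels after the SE reduce conv.
    """
    expanded = input_filters * expand_ratio
    reduced = max(1, int(input_filters * se_ratio))
    return expanded, reduced


# block_0 from the log: input_filters=32, expand_ratio=1, se_ratio=0.25
expanded, reduced = se_channel_counts(32, 1, 0.25)
print(expanded, reduced)  # 32 8
```

Under that assumption the SE reduce conv outputs 8 channels, and the SE expand conv should map those 8 channels back to 32. The message "8 != 32" therefore suggests that a kernel built for 32 input channels is being applied to the 8-channel tensor.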

I rolled back to TF 1.13.1, and now I get this error:

(.venv) mm@mm:~/API-TF2/models/research/object_detection$ python3 model_main.py --alsologtostderr

WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

/home/mm/.venv/lib/python3.5/site-packages/absl/flags/_validators.py:358: UserWarning: Flag --model_dir has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
  'command line!' % flag_name)
/home/mm/.venv/lib/python3.5/site-packages/absl/flags/_validators.py:358: UserWarning: Flag --pipeline_config_path has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
  'command line!' % flag_name)
WARNING:tensorflow:Forced number of epochs for all eval validations to be 1.
WARNING:tensorflow:Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
WARNING:tensorflow:Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7fe4c9e3a6a8>) includes params argument, but params are not passed to Estimator.
WARNING:tensorflow:From /home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From /home/mm/API-TF/models/research/object_detection/builders/dataset_builder.py:80: parallel_interleave (from tensorflow.contrib.data.python.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.experimental.parallel_interleave(...)`.
WARNING:tensorflow:From /home/mm/API-TF/models/research/object_detection/utils/ops.py:472: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From /home/mm/API-TF/models/research/object_detection/inputs.py:320: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From /home/mm/API-TF/models/research/object_detection/core/preprocessor.py:188: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
`seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
WARNING:tensorflow:From /home/mm/API-TF/models/research/object_detection/core/preprocessor.py:1240: calling squeeze (from tensorflow.python.ops.array_ops) with squeeze_dims is deprecated and will be removed in a future version.
Instructions for updating:
Use the `axis` argument instead
WARNING:tensorflow:From /home/mm/API-TF/models/research/object_detection/builders/dataset_builder.py:152: batch_and_drop_remainder (from tensorflow.contrib.data.python.ops.batching) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.batch(..., drop_remainder=True)`.
WARNING:tensorflow:From /home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/framework/function.py:1007: calling Graph.create_op (from tensorflow.python.framework.ops) with compute_shapes is deprecated and will be removed in a future version.
Instructions for updating:
Shapes are always computed; don't use the compute_shapes as it has no effect.
Traceback (most recent call last):
  File "model_main.py", line 116, in <module>
    tf.app.run()
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "model_main.py", line 112, in main
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow_estimator/python/estimator/training.py", line 471, in train_and_evaluate
    return executor.run()
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow_estimator/python/estimator/training.py", line 611, in run
    return self.run_local()
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow_estimator/python/estimator/training.py", line 712, in run_local
    saving_listeners=saving_listeners)
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 358, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1124, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1154, in _train_model_default
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1112, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/home/mm/API-TF/models/research/object_detection/model_lib.py", line 288, in model_fn
    features[fields.InputDataFields.true_image_shape])
  File "/home/mm/API-TF/models/research/object_detection/meta_architectures/ssd_meta_arch.py", line 578, in predict
    im_width=image_shape[2]))
  File "/home/mm/API-TF/models/research/object_detection/core/anchor_generator.py", line 100, in generate
    raise ValueError('Number of feature maps is expected to equal the length '
ValueError: Number of feature maps is expected to equal the length of `num_anchors_per_location`.

@PythonImageDeveloper It's caused by a difference between Python 2 and Python 3 that affects ssd_meta_arch.py. Explicitly convert the returned feature_maps.values() in efficientnet_feature_extractor to a list: return list(feature_maps.values())
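
The Python 2 versus Python 3 difference referred to here is that dict.values() returns a list in Python 2 but a view object in Python 3. A small illustration, with placeholder strings standing in for the real feature-map tensors:

```python
from collections import OrderedDict

# Stand-in for the feature extractor output: layer name -> feature map.
feature_maps = OrderedDict([("layer_1", "map_a"), ("layer_2", "map_b")])

values_view = feature_maps.values()
as_list = list(feature_maps.values())

# In Python 3 the view is not a list and cannot be indexed.
print(isinstance(values_view, list))  # False
print(isinstance(as_list, list))      # True
print(as_list[0])                     # map_a
```

Code that checks for a list, or indexes into the result, therefore needs the explicit list(...) conversion under Python 3.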

@CasiaFan
I changed this part of efficientnet_feature_extractor as shown below, but I got the same error:

    def _extract_features(self, preprocessed_inputs):
        """Extract features from preprocessed inputs"""        
        preprocessed_inputs = shape_utils.check_min_image_dim(33, preprocessed_inputs)
        image_features = self.net(ops.pad_to_multiple(preprocessed_inputs, self._pad_to_multiple))
        layouts = {self._used_nodes[i]: image_features[i] for i, x in enumerate(self._used_nodes) if x}
        feature_maps = self._feature_map_generator(layouts)
        if self._additional_layer_depth:
            final_feature_map = []
            for idx, feature in enumerate(feature_maps.values()):
                feature = l.Conv2D(filters=self._additional_layer_depth,
                                    kernel_size=1,
                                    strides=[1, 1],
                                    use_bias=True,
                                    data_format=self._data_format,
                                    name='conv1x1_'+str(idx))(feature)
                feature = l.BatchNormalization()(feature, training=self._is_training)
                feature = l.ReLU(max_value=6)(feature)
                final_feature_map.append(feature)
            return final_feature_map
        else:
            return list(feature_maps.values())

@PythonImageDeveloper Did you change anything in the config file other than the training file paths?

# SSD with Mobilenet v2 configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  ssd {
    num_classes: 1
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_efficientnet'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 3
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 0.01
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 8
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  #fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
  #fine_tune_checkpoint_type:  "detection"
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "data/my_training.record"
  }
  label_map_path: "data/my.pbtxt"
}

eval_config: {
  num_examples: 8000
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "data/my_testing.record"
  }
  label_map_path: "data/my.pbtxt"
  shuffle: false
  num_readers: 1
}

@PythonImageDeveloper Aha, here it is. Your configuration generates anchors for 6 layers (num_layers: 6), so there must also be 6 feature maps to match. But the feature extractor uses 5 layers by default (min level 3 to max level 7). To fix this, either change num_layers to 5, or set min_feature_level and max_feature_level in the feature_extractor section so that they span 6 levels, like:

feature_extractor {
...
min_feature_level: 3
max_feature_level: 8
}
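
The constraint can be sanity-checked with a few lines of arithmetic. This is only a sketch: `num_feature_levels` and `check_anchor_layers` are hypothetical helpers, not part of the object detection API, and it assumes the extractor emits one feature map per level from min_feature_level to max_feature_level inclusive.

```python
def num_feature_levels(min_feature_level, max_feature_level):
    """Count the levels in an inclusive range, e.g. 3..7 gives 5."""
    return max_feature_level - min_feature_level + 1


def check_anchor_layers(num_layers, min_feature_level=3, max_feature_level=7):
    """Raise ValueError if the anchor generator and feature extractor disagree.

    The defaults (3 and 7) mirror the defaults described in this thread.
    """
    levels = num_feature_levels(min_feature_level, max_feature_level)
    if levels != num_layers:
        raise ValueError(
            "anchor generator expects %d layers but the feature extractor "
            "produces %d (levels %d to %d)"
            % (num_layers, levels, min_feature_level, max_feature_level)
        )
    return levels
```

With the config above, `check_anchor_layers(6)` raises, while `check_anchor_layers(5)` and `check_anchor_layers(6, 3, 8)` both pass.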

@CasiaFan, does it run correctly for you so far?
I have now followed your new modification on GitHub, but I get this error:

(.venv) mm@mm:~/API-TF2/models/research/object_detection$ python3 model_main.py --alsologtostderr

WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

/home/mm/.venv/lib/python3.5/site-packages/absl/flags/_validators.py:358: UserWarning: Flag --model_dir has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
  'command line!' % flag_name)
/home/mm/.venv/lib/python3.5/site-packages/absl/flags/_validators.py:358: UserWarning: Flag --pipeline_config_path has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
  'command line!' % flag_name)
Traceback (most recent call last):
  File "model_main.py", line 116, in <module>
    tf.app.run()
  File "/home/mm/.venv/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "model_main.py", line 78, in main
    FLAGS.sample_1_of_n_eval_on_train_examples))
  File "/home/mm/API-TF/models/research/object_detection/model_lib.py", line 589, in create_estimator_and_inputs
    pipeline_config_path, config_override=config_override)
  File "/home/mm/API-TF/models/research/object_detection/utils/config_util.py", line 98, in get_configs_from_pipeline_file
    text_format.Merge(proto_str, pipeline_config)
  File "/home/mm/.venv/lib/python3.5/site-packages/google/protobuf/text_format.py", line 536, in Merge
    descriptor_pool=descriptor_pool)
  File "/home/mm/.venv/lib/python3.5/site-packages/google/protobuf/text_format.py", line 590, in MergeLines
    return parser.MergeLines(lines, message)
  File "/home/mm/.venv/lib/python3.5/site-packages/google/protobuf/text_format.py", line 623, in MergeLines
    self._ParseOrMerge(lines, message)
  File "/home/mm/.venv/lib/python3.5/site-packages/google/protobuf/text_format.py", line 638, in _ParseOrMerge
    self._MergeField(tokenizer, message)
  File "/home/mm/.venv/lib/python3.5/site-packages/google/protobuf/text_format.py", line 763, in _MergeField
    merger(tokenizer, message, field)
  File "/home/mm/.venv/lib/python3.5/site-packages/google/protobuf/text_format.py", line 837, in _MergeMessageField
    self._MergeField(tokenizer, sub_message)
  File "/home/mm/.venv/lib/python3.5/site-packages/google/protobuf/text_format.py", line 763, in _MergeField
    merger(tokenizer, message, field)
  File "/home/mm/.venv/lib/python3.5/site-packages/google/protobuf/text_format.py", line 837, in _MergeMessageField
    self._MergeField(tokenizer, sub_message)
  File "/home/mm/.venv/lib/python3.5/site-packages/google/protobuf/text_format.py", line 763, in _MergeField
    merger(tokenizer, message, field)
  File "/home/mm/.venv/lib/python3.5/site-packages/google/protobuf/text_format.py", line 837, in _MergeMessageField
    self._MergeField(tokenizer, sub_message)
  File "/home/mm/.venv/lib/python3.5/site-packages/google/protobuf/text_format.py", line 730, in _MergeField
    (message_descriptor.full_name, name))
google.protobuf.text_format.ParseError: 12:7 : Message type "object_detection.protos.SsdFeatureExtractor" has no field named "network_version".

When I comment out network_version in the config file, I get this error:

File "/home/mm/.venv/lib/python3.5/site-packages/google/protobuf/text_format.py", line 730, in _MergeField
   (message_descriptor.full_name, name))
google.protobuf.text_format.ParseError: 13:7 : Message type "object_detection.protos.SsdFeatureExtractor" has no field named "min_feature_level".

Hi @CasiaFan
I again replaced your ssd.proto with the original ssd.proto and ran protoc object_detection/protos/ssd.proto --python_out=. , but I got the same error again:
google.protobuf.text_format.ParseError: 12:7 : Message type "object_detection.protos.SsdFeatureExtractor" has no field named "network_version".

My protoc --version output is: libprotoc 3.5.1
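
One way to narrow down this kind of ParseError is to ask the compiled message class which fields it actually knows about, and which file it was loaded from. A sketch under stated assumptions: `missing_fields` and `report_ssd_fields` are hypothetical helpers, the import path assumes the usual object_detection.protos.ssd_pb2 module, and `DESCRIPTOR.fields_by_name` is the standard descriptor attribute on generated protobuf message classes.

```python
def missing_fields(message_cls, field_names):
    """Return the names in field_names that the compiled message class lacks."""
    known = message_cls.DESCRIPTOR.fields_by_name
    return [name for name in field_names if name not in known]


def report_ssd_fields():
    """Print where ssd_pb2 was loaded from and which custom fields are missing.

    Assumes the object_detection package is importable in this environment.
    """
    from object_detection.protos import ssd_pb2

    print("ssd_pb2 loaded from:", ssd_pb2.__file__)
    print("missing fields:", missing_fields(
        ssd_pb2.SsdFeatureExtractor,
        ["network_version", "min_feature_level", "max_feature_level"],
    ))
```

Calling `report_ssd_fields()` from the same virtualenv used for training shows whether Python is picking up a stale ssd_pb2.py, for example one from an installed copy of the package rather than the freshly compiled one.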

Confirming it all works with the following changes:

  • tensorflow/tensorflow-gpu 1.13.1
  • compile the protobufs with protoc version >= 3.5.1 (I actually used 3.11.4)
  • in the default config sourced from this repo, set max_feature_level: 7 in the FPN settings
  • change the _extract_features method in efficientnet_feature_extractor.py to return list(feature_maps.values())
  • reinstall the object detection package after recompiling the protobufs:
# From within TensorFlow/models/research/
pip install .
  • open a new terminal session