Using a Non-Local Block significantly reduces accuracy.

lin0287 opened this issue a year ago

Below is my code:

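```python
import tensorflow_addons as tfa
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.optimizers import Adam

# NonLocalResNet is provided by this repository; IMAGE_SIZE is defined elsewhere.
ResNet50NLBase = NonLocalResNet(
    input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3),
    classes=6,
    include_top=False,
    repetitions=[3, 4, 6, 3]
)
# Use the output of the second-to-last layer as the backbone output.
ResNet50NL = Model(inputs=ResNet50NLBase.inputs, outputs=ResNet50NLBase.layers[-2].output)

model = Sequential()  # model creation was not shown in the original snippet
# model.add(RN50)
model.add(ResNet50NL)
model.add(GlobalAveragePooling2D())
model.add(Dense(6, activation='softmax'))

model.compile(
    loss='categorical_crossentropy',
    optimizer=Adam(),
    metrics=['categorical_accuracy',
             tfa.metrics.CohenKappa(num_classes=6, sparse_labels=False, weightage="quadratic")]
)
```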

This gives significantly lower accuracy than the following baseline:

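```python
import tensorflow as tf
from tensorflow.keras.models import Model

RN50 = tf.keras.applications.resnet50.ResNet50(
    input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3),
    weights='imagenet',
    include_top=False,
    pooling='avg'
)
# Use the output of the second-to-last layer as the backbone output.
RN50 = Model(inputs=RN50.inputs, outputs=RN50.layers[-2].output)
```

Can you explain this?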

Andrew Garcia · Answer 1 · Sat Jul 08 2023 04:13:58 GMT+0800 (China Standard Time)

@lin0287 In my fork of this repo (which I may merge soon), I added a NonLocalBlock layer in non_local_layerstyle.py that can be passed directly to model.add() (see the README). Experimenting with it may help resolve this issue as well as issue #17.
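
For readers unfamiliar with the layer-style approach, here is a minimal sketch of how a non-local block can be wrapped as a Keras layer and stacked with model.add(). It is not the NonLocalBlock class from the fork, whose constructor and internals may differ. The class name SketchNonLocalBlock, the channels argument, the halved bottleneck width and the small 8x8x16 input are all assumptions chosen for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models


class SketchNonLocalBlock(layers.Layer):
    """Illustrative non-local block; NOT the class from the fork."""

    def __init__(self, channels, **kwargs):
        super().__init__(**kwargs)
        inner = max(channels // 2, 1)  # assumed bottleneck width
        # 1x1 convolutions producing the theta, phi and g embeddings.
        self.theta = layers.Conv2D(inner, 1, use_bias=False)
        self.phi = layers.Conv2D(inner, 1, use_bias=False)
        self.g = layers.Conv2D(inner, 1, use_bias=False)
        # Projects back to `channels` so the residual sum is valid.
        self.project = layers.Conv2D(channels, 1, use_bias=False)

    def call(self, x):
        shape = tf.shape(x)
        batch, height, width = shape[0], shape[1], shape[2]

        def flatten(t):
            # (batch, H, W, C) -> (batch, H*W, C)
            return tf.reshape(t, [batch, height * width, t.shape[-1]])

        theta = flatten(self.theta(x))
        phi = flatten(self.phi(x))
        g = flatten(self.g(x))
        # Softmax over pairwise similarities between all spatial positions.
        attention = tf.nn.softmax(tf.matmul(theta, phi, transpose_b=True), axis=-1)
        y = tf.reshape(tf.matmul(attention, g), [batch, height, width, g.shape[-1]])
        # Residual connection keeps the output shape equal to the input shape.
        return x + self.project(y)


# Usage: the block is added like any other layer.
model = models.Sequential()
model.add(layers.Input(shape=(8, 8, 16)))  # small feature map, for illustration only
model.add(SketchNonLocalBlock(channels=16))
model.add(layers.GlobalAveragePooling2D())
model.add(layers.Dense(6, activation="softmax"))
```

Because the block returns a tensor with the same shape as its input, it can be placed between existing layers without changing anything downstream.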