OpenGVLab / SAM-Med2D

Official implementation of SAM-Med2D

After fine-tuning SAM with my own data, the model becomes very insensitive to prompt inputs.

klykq111 opened this issue · comments

After fine-tuning SAM with my own data, the model becomes very insensitive to prompt inputs: prompts that differ significantly can still produce nearly identical mask outputs.

In my own data there are only one or two specific objects that I need, and I annotated only those objects for training. However, after fine-tuning, even if I provide a prompt that is a background point or a box far away from the object, the model still outputs roughly the same mask it learned during fine-tuning, so it loses the ability to segment anything.
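A quick way to quantify this insensitivity is to compare the masks produced by clearly different prompts: if they overlap almost completely, the model is ignoring the prompt. The sketch below is illustrative only; `predict_mask` is a hypothetical wrapper around whatever inference call you use, not an API from this repo.

```python
import numpy as np

def mask_iou(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """IoU between two boolean masks; 1.0 means identical predictions."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(inter) / float(union) if union > 0 else 1.0

def prompt_sensitivity(predict_mask, image, fg_point, bg_point) -> float:
    """Compare masks from two very different point prompts.

    `predict_mask` is a placeholder for your own inference routine
    (e.g. a SAM-style predictor wrapped to return a boolean mask for a
    single point prompt); it is not part of this repo's API.
    """
    mask_fg = predict_mask(image, point=fg_point, label=1)  # point on the object
    mask_bg = predict_mask(image, point=bg_point, label=0)  # point far in the background
    return mask_iou(mask_fg, mask_bg)

# An IoU close to 1.0 for clearly different prompts indicates the model
# is ignoring the prompt, which is the behaviour described above.
```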

I would greatly appreciate any suggestions you may have.

I think I have found the reason: the model is overfitting. When fine-tuning, I set the number of epochs to 100, but the best results on the validation set are reached at around epoch 15. A model fine-tuned for the full 100 epochs shows the issues I described above, whereas a checkpoint from around epoch 15 performs much better.
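For anyone hitting the same issue, a simple safeguard is to track a validation metric during fine-tuning and keep the checkpoint that scores best rather than the final one. This is a minimal sketch, assuming hypothetical `train_one_epoch` and `evaluate_dice` routines that stand in for your own training and validation code; adapt it to the training script you actually use.

```python
import copy
import torch

def finetune_with_best_checkpoint(model, train_one_epoch, evaluate_dice,
                                  max_epochs: int = 100,
                                  out_path: str = "best_sam_finetune.pth"):
    """Run fine-tuning but keep the weights from the best validation epoch.

    `train_one_epoch` and `evaluate_dice` are placeholders for your own
    training and validation routines; they are not part of this repo's API.
    """
    best_dice, best_epoch = 0.0, -1
    best_state = copy.deepcopy(model.state_dict())

    for epoch in range(max_epochs):
        train_one_epoch(model, epoch)          # your existing training step
        dice = evaluate_dice(model)            # mean Dice on the validation set

        if dice > best_dice:
            best_dice, best_epoch = dice, epoch
            best_state = copy.deepcopy(model.state_dict())
            torch.save(best_state, out_path)   # checkpoint only on improvement

    print(f"Best validation Dice {best_dice:.4f} at epoch {best_epoch}")
    model.load_state_dict(best_state)          # restore the best weights
    return model
```

Loading the early, best-scoring checkpoint instead of the final one is effectively early stopping, which matches the observation that the ~15-epoch model keeps its prompt sensitivity while the 100-epoch model does not.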