ShirAmir / dino-vit-features

Official implementation for the paper "Deep ViT Features as Dense Visual Descriptors".

Home Page: https://dino-vit-features.github.io

Parameter tuning for custom dataset

HHenryD opened this issue

I found your method sensitive to the choice of parameters (thresh, elbow coefficient, etc.). Instead of tuning them manually and assessing the results qualitatively, is there a way to do a grid search and assess the results quantitatively? For example, can I search on the training set, evaluate on the validation set, and use landmark regression results to select the best parameters? If so, could you upload your evaluation scripts so that I can do it this way? Thank you.

Hi Henry,
Our methods operate in a "zero-shot" manner on a single set, hence there is no training set or evaluation set, only the input set.
Using a grid search and manual assessment, we found that for large enough sets (e.g. more than 20 images) our methods are stable with the default hyperparameters.
If you wish to operate on smaller sets (e.g. a pair of images), you can either tweak the hyperparameters and assess the quality manually, or apply many random crop augmentations to the input images (say 20-25 per image), which should be enough to make the default hyperparameters work.
We also provide utility visualizations, such as the saliency maps and clustering results produced while running, which can aid in tuning the parameters.
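
For readers who want to try the random crop augmentation suggested above, here is a minimal sketch of how the crop boxes could be generated. It is not taken from the repository: the function name `random_crop_boxes`, the `min_scale` range, and the fixed seed are assumptions made for illustration, and only the count of roughly 20-25 crops per image comes from this thread.

```python
import random

def random_crop_boxes(width, height, n_crops=20, min_scale=0.6, seed=0):
    """Return n_crops boxes (left, upper, right, lower) inside a width x height image.

    Hypothetical helper: min_scale (smallest crop side as a fraction of the
    image side) is an illustrative choice, not a value from the repository.
    """
    rng = random.Random(seed)
    boxes = []
    for _ in range(n_crops):
        # Pick one scale per crop so the aspect ratio of the image is kept.
        scale = rng.uniform(min_scale, 1.0)
        crop_w = max(1, int(width * scale))
        crop_h = max(1, int(height * scale))
        # Pick a top-left corner so that the crop stays inside the image.
        left = rng.randint(0, width - crop_w)
        upper = rng.randint(0, height - crop_h)
        boxes.append((left, upper, left + crop_w, upper + crop_h))
    return boxes

# Example: 20 crop boxes for a 640x480 image.
boxes = random_crop_boxes(640, 480, n_crops=20)
```

Each box uses the (left, upper, right, lower) convention, so with Pillow it can be passed directly to `Image.crop(box)`. The resulting crops would then be saved next to the original images so that the input set grows past the size at which the defaults are reported to be stable.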

Let me know if you have more questions.

Hi Shir,
Thanks for your reply.
I see your points. I also wonder how you found the current default hyperparameters. Did you find them by manually trying different combinations and assessing the quality?

Exactly. I started by manually choosing an elbow coefficient value that yielded visually pleasing clusters. Choosing the saliency threshold was quite straightforward because there was a large margin between the saliency of background and foreground objects (in other words, many different values give good performance). Another way to tune the elbow coefficient is to compute the k-means cost function for several different values of k, plot the costs against k, and choose the point where the cost stops descending rapidly and starts to level off.
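
The elbow procedure described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the repository's code: `kmeans_cost` is a bare-bones Lloyd's k-means on toy 2-D points rather than on ViT descriptors, and `pick_elbow` uses one simple criterion chosen for this sketch (stop at the first k whose cost is still above a fraction `elbow` of the previous cost), which may differ from how the repository applies its elbow coefficient.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_cost(points, k, n_iters=20, seed=0):
    """Run plain Lloyd's k-means and return the mean squared distance to the nearest center."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers from the data
    for _ in range(n_iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return sum(min(sq_dist(p, c) for c in centers) for p in points) / len(points)

def pick_elbow(costs_by_k, elbow=0.9):
    """Return the last k before the cost stops dropping rapidly.

    Assumed criterion: the drop from prev_k to k counts as slow when
    cost[k] > elbow * cost[prev_k].
    """
    ks = sorted(costs_by_k)
    for prev_k, k in zip(ks, ks[1:]):
        if costs_by_k[k] > elbow * costs_by_k[prev_k]:
            return prev_k
    return ks[-1]

# Toy data: three well-separated 2-D blobs standing in for descriptors.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in blob_centers for _ in range(50)]

costs = {k: kmeans_cost(points, k) for k in range(1, 7)}
best_k = pick_elbow(costs)
```

Plotting the values in `costs` against k gives the graph mentioned above. Because k-means depends on its initialization, it is worth repeating the run with a few seeds before trusting where the curve flattens.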

Thank you!