Confusezius / Revisiting_Deep_Metric_Learning_PyTorch

(ICML 2020) This repo contains code for our paper "Revisiting Training Strategies and Generalization Performance in Deep Metric Learning" (https://arxiv.org/abs/2002.08473) to facilitate consistent research in the field of Deep Metric Learning.

New metric: mAP@R

sudalvxin opened this issue

Will you add the metric mAP@R (used in "A Metric Learning Reality Check") to this repository?

Hi there!

This repo already contains a mAP@R implementation; however, we take the mean over class-wise average precisions@k. I haven't checked, but I think their implementation does not do the class-wise averaging. If that is the case, I'll include it as well :)
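For illustration, a tiny sketch of the two averaging schemes, assuming per-query average-precision values have already been computed (the helper names are hypothetical, this is not the repo's code):

```python
import numpy as np

def mean_over_queries(ap_per_query):
    # Plain mean: every query contributes equally, regardless of its class.
    return float(np.mean(ap_per_query))

def mean_over_classes(ap_per_query, labels):
    # Class-wise mean: first average within each class, then across classes.
    ap_per_query, labels = np.asarray(ap_per_query), np.asarray(labels)
    class_means = [ap_per_query[labels == c].mean() for c in np.unique(labels)]
    return float(np.mean(class_means))
```

The two only coincide when every class contributes the same number of queries, which is why the choice matters on imbalanced test sets.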

The mAP used in this repo may be different from the mAP used in "A Metric Learning Reality Check". The latter considers the ranking of the correct retrievals.

https://web.stanford.edu/class/cs276/handouts/EvaluationNew-handout-1-per.pdf

This PDF may be useful.

Looking at the implementations, I think the main difference is the per-class averaging. I'll check their mAP variant experimentally and include it here in the next few days :)

Thanks for your contribution to DML.

The mAP@R you mentioned is '.../metrics/mAP.py'. But I think that mAP@R considers only the R nearest neighbors of each query.

What the mAP function does is, for each class, take into account as many samples for the average precision as there are samples available for that specific class (as we have also noted in the paper :)).

Looking over Kevin's implementation, this also seems to be what he is doing - and if you check the supplementary of our paper, where we also list mAP values for every run, the rough values coincide with those listed in his paper :).

The main difference is that we clip the mAP at the maximum number of samples a class can have - I'll offer a second mAP option that does not have this property.

OK, so I have included the standard mAP formulation as metrics/mAP.py and moved the class-limited current version to metrics/mAP_c.py. Both are heavily correlated and insights are transferable - either way, during default training, both will be tracked :). Hope that helps!
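To make the distinction concrete, here is a minimal numpy sketch (not the actual metrics/mAP.py or metrics/mAP_c.py code), assuming embeddings X of shape (N, D) and integer labels y of shape (N,): the standard variant evaluates average precision over the full ranking, while the class-limited variant truncates the ranking at the number of samples available for the query's class.

```python
import numpy as np

def average_precision(ranked_labels, query_label):
    # AP of one ranked retrieval list: mean of precision@i over the relevant positions.
    hits = (ranked_labels == query_label).astype(np.float64)
    if hits.sum() == 0:
        return 0.0
    precisions = np.cumsum(hits) / (np.arange(len(ranked_labels)) + 1)
    return float((precisions * hits).sum() / hits.sum())

def mean_ap(X, y, class_limited=False):
    # Brute-force pairwise distances; fine for small N, use faiss for larger sets.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    aps = []
    for i in range(len(y)):
        order = np.argsort(dists[i])
        order = order[order != i]            # never retrieve the query itself
        if class_limited:
            R = int((y == y[i]).sum()) - 1   # samples of this class, query excluded
            order = order[:R]                # clip the ranking at the class size
        aps.append(average_precision(y[order], y[i]))
    return float(np.mean(aps))
```

Note that implementations also differ in the normalisation (e.g. dividing by R rather than by the number of relevant items actually retrieved), so this is only meant to show where the clipping happens.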

Important note: I have included metrics/mAP_1000.py, which uses mAP@1000, as higher k values are not compatible with faiss-gpu (and are also very costly for larger datasets such as SOP).

Generally, there really is no reason to go beyond k=1000. Even for SOP, people only measure recall@1000, which already is quite debatable as a choice of metric :).

But if needed, you can include mAP in parameters.py/--evaluation_metrics.

Finally, I have included mAP_lim.py (which is also measured by default), which is mAP limited to k=1023 (so essentially mAP@1023 for all benchmark datasets). This is what is used in pytorch-metric-learning :).
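For completeness, a sketch of how such a k-limited mAP could be computed with a flat faiss index (shown on CPU here; faiss-gpu is what imposes the k limit mentioned above). The function name and the exact normalisation are my own choices, not the code in metrics/mAP_1000.py or metrics/mAP_lim.py, and y is assumed to be a numpy integer array.

```python
import faiss
import numpy as np

def map_at_k(X, y, k=1000):
    # X: (N, D) float embeddings, y: (N,) integer labels.
    X = np.ascontiguousarray(X, dtype=np.float32)
    index = faiss.IndexFlatL2(X.shape[1])      # exact L2 search
    index.add(X)
    _, nn = index.search(X, k + 1)             # +1: the query retrieves itself first
    nn = nn[:, 1:]                             # drop the self-match
    hits = (y[nn] == y[:, None]).astype(np.float64)
    precisions = np.cumsum(hits, axis=1) / (np.arange(k) + 1)
    n_rel = np.maximum(hits.sum(axis=1), 1)    # avoid division by zero when no hit is in the top k
    ap = (precisions * hits).sum(axis=1) / n_rel
    return float(ap.mean())
```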

Thank you very much!

I'll close this now, feel free to reopen it if any other question occurs :)