sgvaze / generalized-category-discovery

Code for our CVPR 2022 paper 'Generalized Category Discovery'. Project page: https://www.robots.ox.ac.uk/~vgg/research/gcd/

The same "Evaluation Metric" for seen classes and novel classes

jingzhengli opened this issue · comments

Hi, thanks for sharing the great work!
I noticed that the same evaluation metric, i.e., the Hungarian algorithm, is used for both seen and novel classes. I think that standard "classification accuracy" should instead be used for the seen classes, to avoid the mis-matching issue on them.

I think the Hungarian algorithm is used here because the method does not have a parametric classifier, so standard 'classification accuracy' is not applicable to either the seen or the novel classes.
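
For concreteness, this is roughly what the Hungarian-based clustering accuracy looks like in code (a minimal sketch built on `scipy.optimize.linear_sum_assignment`; the repo's own evaluation code may differ in detail):

```python
# Minimal sketch of clustering accuracy (ACC) with the Hungarian algorithm.
# Find the one-to-one cluster-id -> class-id mapping that maximises agreement,
# then report accuracy under that mapping. Not the repo's exact code.
import numpy as np
from scipy.optimize import linear_sum_assignment

def cluster_acc(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    D = max(y_pred.max(), y_true.max()) + 1
    w = np.zeros((D, D), dtype=np.int64)
    for p, t in zip(y_pred, y_true):
        w[p, t] += 1                                   # cluster-vs-class co-occurrence counts
    rows, cols = linear_sum_assignment(w.max() - w)    # Hungarian: maximise total matches
    return w[rows, cols].sum() / y_pred.size
```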

Thanks for your reply. I agree with you in this case.
Here is one possible way to measure performance on a seen category with standard classification accuracy, similar to a prototype network: we could directly compute the classification accuracy of the unlabelled instances within the cluster to which the labelled seen category belongs.
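
A rough sketch of that suggestion (the helper and argument names below are illustrative, not from the repo):

```python
# Illustrative sketch of the suggestion above (not code from the repo):
# give each seen class the cluster that its labelled instances fall into most
# often, then score the unlabelled instances of that class against that cluster.
import numpy as np

def seen_class_acc(clusters_lab, y_lab, clusters_unlab, y_unlab, seen_classes):
    clusters_lab, y_lab = np.asarray(clusters_lab), np.asarray(y_lab)
    clusters_unlab, y_unlab = np.asarray(clusters_unlab), np.asarray(y_unlab)
    correct, total = 0, 0
    for c in seen_classes:
        lab_clusters = clusters_lab[y_lab == c]
        if lab_clusters.size == 0:
            continue
        target = np.bincount(lab_clusters).argmax()        # cluster 'owned' by class c
        mask = (y_unlab == c)
        correct += int((clusters_unlab[mask] == target).sum())
        total += int(mask.sum())
    return correct / max(total, 1)
```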

If the seen classes use the Hungarian algorithm as the metric, we cannot fairly compare against other baselines that report standard 'classification accuracy'.

Hi both, apologies for the delay in pitching in on this!

It is indeed an interesting point. 'Classification Accuracy' could be seen as 'Old ACC' if we do not perform the Hungarian assignment over these classes and instead force the ground-truth 'cluster_index --> class_index' mapping for the classes for which we have labels. In this way, Classification Accuracy is an upper bound on Old ACC (though note that assuming the GT mappings for the Old classes necessarily reduces ACC on the New categories).

We considered this discrepancy, but for now decided to run the Hungarian algorithm over ALL categories, as it allows a 1-to-1 comparison with fully unsupervised methods such as k-means clustering. In future work we may explore other options!
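
For illustration, the alternative protocol could look roughly like this (a hedged sketch, not the paper's evaluation code; it assumes the clusters for the labelled classes are indexed by their ground-truth class ids, e.g. via semi-supervised k-means seeded on the labelled data):

```python
# Sketch of ACC when the cluster -> class mapping is fixed to the ground truth
# for the old/labelled classes, and the Hungarian assignment is run only over
# the remaining clusters and the new classes. Assumes clusters for old classes
# share their class index (e.g. via semi-supervised k-means on labelled data).
import numpy as np
from scipy.optimize import linear_sum_assignment

def acc_fixed_old_mapping(y_true, y_pred, old_classes):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    D = max(y_pred.max(), y_true.max()) + 1
    w = np.zeros((D, D), dtype=np.int64)
    for p, t in zip(y_pred, y_true):
        w[p, t] += 1                                        # cluster-vs-class counts
    mapping = {c: c for c in old_classes}                   # force GT mapping for old classes
    free = [c for c in range(D) if c not in old_classes]    # remaining clusters / new classes
    if free:
        sub = w[np.ix_(free, free)]
        rows, cols = linear_sum_assignment(sub.max() - sub) # Hungarian over new classes only
        mapping.update({free[r]: free[c] for r, c in zip(rows, cols)})
    correct = sum(mapping.get(p, -1) == t for p, t in zip(y_pred, y_true))
    return correct / y_pred.size
```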

Thank you very much for your reply

Hi, using clustering to obtain pseudo-labels, as in "Contrastive Adaptation Network for Unsupervised Domain Adaptation", addresses this problem.