tabular data/ noisy instances
nazaretl opened this issue
Hi,
thanks for sharing your implementation. I have two questions about it:
- Does it also work on tabular data?
- Is it possible to identify the noisy instances (return the noisy IDs or the clean set)?
Thanks!
Hi @nazaretl,
- Yes, it is applicable to tabular data, but I guess you would need to change the network architecture.
- Yes, in the paper we demonstrated one way of identifying noisy examples: rank the examples by the norm of the difference between predicted and actual gradients. Please see the examine_model function in https://github.com/hrayrhar/limit-label-memorization/blob/master/notebooks/visualize-results.ipynb.
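
The ranking described above can be sketched roughly as follows. This is only an illustration, not the repository's examine_model: the function name `rank_by_gradient_mismatch` and the toy gradient values are hypothetical, and it assumes you have already collected one predicted and one actual gradient vector per training example (how those are obtained depends on the model).

```python
import math


def rank_by_gradient_mismatch(predicted_grads, actual_grads):
    """Rank examples by the L2 norm of (predicted - actual) gradient.

    Both arguments are sequences with one flat gradient vector per example.
    Returns (order, scores), where `order` lists example indices from the
    largest mismatch (most likely noisy) to the smallest.
    Hypothetical helper for illustration; not part of the repository.
    """
    if len(predicted_grads) != len(actual_grads):
        raise ValueError("need one predicted and one actual gradient per example")

    scores = []
    for pred, actual in zip(predicted_grads, actual_grads):
        if len(pred) != len(actual):
            raise ValueError("gradient vectors must have the same length")
        # Euclidean norm of the per-example gradient difference
        scores.append(math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual))))

    # Sort indices by score, largest first (ties keep their original order)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Toy values, made up for this sketch: example 1 disagrees the most.
predicted = [[0.1, -0.1], [0.9, -0.9], [0.0, 0.0]]
actual = [[0.1, -0.1], [-0.9, 0.9], [0.3, -0.3]]

order, scores = rank_by_gradient_mismatch(predicted, actual)
print(order)  # [1, 2, 0]
```

Taking the first few entries of `order` gives candidate noisy IDs, and dropping them leaves a candidate clean set. Where to cut is a judgment call that this sketch does not address.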
Hrayr
Many thanks for the explanation!
Lusiné