arghosh / noisy_label_pretrain

tabular data/ noisy instances/ new datasets

nazaretl opened this issue

Hi,
thanks for sharing your implementation. I have some questions about it:

  1. Does it also work on tabular data?
  2. Is the code tailored to the datasets used in the paper or can one apply it to any data?
  3. Is it possible to identify the noisy instances (return the noisy IDs or the clean set)?

Thanks!

  1. "Tabular data": do you mean a mix of categorical and real-valued features? This repo works with image data, which has real-valued features. You can still use standard tabular datasets by changing the input handling (e.g., one-hot encoding or embedding layers for the categorical features) and the layers (fully connected instead of convolutional).
  2. The layers in this repo are specific to image datasets. If you swap in a neural model that suits your data, it should work.
  3. You can predict this from the weighting network's score. Use a validation set to choose the score threshold (clean vs. noisy) and the best epoch (the one used for returning the IDs). You will need to change some code, for example by also running the weighting module in the test step.
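
To make point 3 a bit more concrete, here is a rough sketch of the threshold step, not code from this repo. It assumes you have already collected one weighting-network score per sample, that a higher score means "more likely clean", and that your validation set has known clean/noisy flags. The names `pick_threshold` and `flag_noisy` are made up for illustration.

```python
from typing import List, Sequence, Tuple


def pick_threshold(scores: Sequence[float],
                   is_clean: Sequence[bool]) -> Tuple[float, float]:
    """Return (threshold, accuracy) that best separates clean from noisy.

    Assumption: a sample is predicted clean when score >= threshold.
    """
    if len(scores) == 0 or len(scores) != len(is_clean):
        raise ValueError("scores and is_clean must be non-empty and equal length")

    # Every observed score is a candidate; inf means "flag everything as noisy".
    candidates = sorted(set(scores)) + [float("inf")]
    best_t, best_acc = candidates[0], -1.0
    for t in candidates:
        correct = sum((s >= t) == bool(c) for s, c in zip(scores, is_clean))
        acc = correct / len(scores)
        if acc > best_acc:  # strict '>' keeps the lowest threshold on ties
            best_t, best_acc = t, acc
    return best_t, best_acc


def flag_noisy(ids: Sequence[int], scores: Sequence[float],
               threshold: float) -> List[int]:
    """Return the IDs whose score falls below the threshold."""
    return [i for i, s in zip(ids, scores) if s < threshold]


if __name__ == "__main__":
    # Hypothetical validation scores with known clean/noisy flags.
    val_scores = [0.9, 0.8, 0.2, 0.7, 0.1]
    val_is_clean = [True, True, False, True, False]
    threshold, acc = pick_threshold(val_scores, val_is_clean)

    # Hypothetical training-set IDs and their weighting-network scores.
    noisy_ids = flag_noisy([10, 11, 12, 13], [0.95, 0.3, 0.65, 0.75], threshold)
    print(threshold, acc, noisy_ids)
```

To pick the best epoch as well, one option is to run `pick_threshold` on the validation scores saved at each epoch and keep the epoch with the highest validation accuracy. Accuracy is only one possible criterion; if noisy samples are rare, F1 on the noisy class may be a better choice.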

many thanks!