LiJunnan1992 / MLNT

Meta-Learning based Noise-Tolerant Training


Shouldn't the true baseline be iterative training without meta-learning?

XinshaoAmosWang opened this issue

Dear authors, your ideas are interesting and novel:

  1. Oracle/Mentor (consistency loss): To make the meta-test reliable, the teacher/mentor model must itself be reliable and robust to real noisy examples. Therefore, you apply iterative training and iterative data cleaning so that the meta-test consistency loss is reliable and serves as an optimisation oracle against real noise. (I suppose this should be the true baseline.)
  2. Unaffected by synthetic noise: Meta-training sees synthetically corrupted training examples. After a fast update on them, meta-testing evaluates the updated model's consistency with the oracle and aims to maximise that consistency, i.e., to make the model unaffected by the synthetic noise it has just seen. (I suppose this is the key meta-learning proposal; see the sketch after this list.)
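For concreteness, here is a minimal PyTorch-style sketch of my reading of step 2: a fast update on synthetically corrupted labels, followed by a consistency (KL) loss against the frozen mentor. The noise rate, class count, and all names (`student`, `mentor`, `inner_lr`, etc.) are my assumptions, not the authors' actual code.

```python
import torch
import torch.nn.functional as F
from torch.func import functional_call  # PyTorch >= 2.0

def meta_consistency_loss(student, mentor, images, labels,
                          inner_lr=0.2, noise_rate=0.3, num_classes=10):
    """Sketch of one meta-training step (my assumed reading of the method):
    corrupt the labels synthetically, take a fast gradient step on them,
    then penalise drift of the updated student from the frozen mentor."""
    # 1. Inject synthetic label noise into a copy of the batch labels.
    noisy_labels = labels.clone()
    flip = torch.rand(labels.shape, device=labels.device) < noise_rate
    noisy_labels[flip] = torch.randint_like(labels, high=num_classes)[flip]

    # 2. Fast (virtual) update of the student on the noisy batch.
    ce = F.cross_entropy(student(images), noisy_labels)
    params = dict(student.named_parameters())
    grads = torch.autograd.grad(ce, list(params.values()), create_graph=True)
    fast_params = {n: p - inner_lr * g
                   for (n, p), g in zip(params.items(), grads)}

    # 3. Meta-test: consistency of the updated student with the mentor.
    with torch.no_grad():
        mentor_probs = F.softmax(mentor(images), dim=1)
    fast_logits = functional_call(student, fast_params, (images,))
    return F.kl_div(F.log_softmax(fast_logits, dim=1), mentor_probs,
                    reduction="batchmean")
```

Minimising this loss pushes the student towards parameters that remain consistent with the mentor even after a gradient step on corrupted labels, which is how I understand "unaffected by synthetic noise".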

In this case, the baseline should be iterative training without meta-learning, i.e., the same iterative procedure but without meta-learning on synthetic noisy examples.
It would be more interesting to see exactly how much the meta-learning proposal improves performance over this true baseline.

Could you please share your thoughts on this? Thanks so much.

Hi Xinshao,
Thanks for your valuable comments! In our paper we show the improvement of meta-learning over the baseline CE (reproduced) by comparing results after the 1st iteration. We consider that improvement significant enough. We could also perform iterative training on CE, but we would expect a similar improvement from meta-learning, and hence did not do it.

Thanks for your kind response. :-) :-)

In the current framework, as I understand it, iterative training with training-data filtering and meta-learning on synthetic noisy data are combined. As far as I know, the training-data filtering is necessary, since it increases the reliability of the mentor/oracle in the 2nd and 3rd training rounds.

With iterative training-data filtering/cleaning alone, it would be interesting to see how much the mentor/network improves in each round, i.e., how much iterative training improves performance with conventional CE.

Then, adding meta-learning on synthetic noisy data on top, we could see how much the meta-learning component improves performance over conventional CE. A sketch of this ablation follows below.
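To make the ablation concrete, here is the loop I have in mind, run once with `use_meta=False` (the CE-only iterative baseline) and once with `use_meta=True`. The confidence-based filtering rule and all names here are my assumptions, not the repository's actual procedure.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def filter_confident(model, images, labels, threshold=0.8):
    """Assumed filtering rule: keep samples whose given label matches the
    model's prediction at high confidence (a stand-in for the paper's rule)."""
    probs = F.softmax(model(images), dim=1)
    conf, pred = probs.max(dim=1)
    keep = (pred == labels) & (conf > threshold)
    return images[keep], labels[keep]

def iterative_ablation(make_model, train_one_round, images, labels,
                       rounds=3, use_meta=False):
    """Run the same filter-and-retrain loop with or without the meta-learning
    objective; the per-round accuracy gap between the two runs isolates the
    contribution of the meta-learning component."""
    model = make_model()
    history = []
    for r in range(rounds):
        # train_one_round: CE only, or CE + meta-consistency when use_meta=True.
        model = train_one_round(model, images, labels, use_meta=use_meta)
        images, labels = filter_confident(model, images, labels)
        history.append(model)  # evaluate each round's model on a clean test set
    return history
```

Comparing `history` between the two runs, round by round, would show both how much iterative filtering alone helps and how much meta-learning adds on top of it.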

In the 1st round, the training data is noisy, so I suppose the teacher model is not reliable enough (though still relatively more reliable than the student model).

I am very interested in this kind of ablation study. What do you think? Does it make sense to you?

Thanks in advance. :-) :-)

It is a good idea indeed, and having such results could give more insight into the method.