ZixuanKe / PyContinual

PyContinual (An Easy and Extendible Framework for Continual Learning)

Question about the DSC dataset and bert_frozen NCL

drndr opened this issue

Hi!
I have a question regarding the full DSC dataset. The CTR paper states that each domain has 2,500 positive and 2,500 negative reviews for training, but the dataset itself (at least the 10 domains chosen in the paper) has at most 2,000 of each, and in some domains even fewer. Is this expected? A rough counting sketch follows below.
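
A minimal sketch of how the per-domain counts can be checked. The directory layout, file name, and label field below are assumptions on my part, not necessarily the repo's actual data format, so the paths will need adjusting:

```python
import json
from collections import Counter
from pathlib import Path

data_root = Path("./dat/dsc")  # hypothetical location of the DSC data

for domain_dir in sorted(p for p in data_root.iterdir() if p.is_dir()):
    train_file = domain_dir / "train.json"  # assumed: one JSON object per line
    if not train_file.is_file():
        continue
    counts = Counter()
    with train_file.open() as f:
        for line in f:
            counts[json.loads(line)["label"]] += 1  # assumed "label" field
    print(f"{domain_dir.name}: {dict(counts)}")
```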

Also, the two scenarios (dil_classification and til_classification) are not completely clear to me. If I understood the code correctly, DIL does not use task IDs at test time while TIL does; I sketched my understanding after this paragraph. Which scenario should be used with the DSC dataset?
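
Here is a toy sketch of the difference as I understand it (placeholder model and names for illustration, not the repo's actual API):

```python
import torch
import torch.nn as nn

num_tasks, in_dim, hidden, num_classes = 3, 16, 8, 2
encoder = nn.Linear(in_dim, hidden)  # stand-in for the BERT encoder
heads = nn.ModuleList(nn.Linear(hidden, num_classes) for _ in range(num_tasks))

x = torch.randn(1, in_dim)
h = encoder(x)

# TIL: the task ID is provided at test time, so the matching head is used.
task_id = 1
til_pred = heads[task_id](h).argmax(-1)

# DIL: no task ID at test time; all tasks share one label space, so a
# single shared head must handle inputs from any domain.
shared_head = nn.Linear(hidden, num_classes)
dil_pred = shared_head(h).argmax(-1)
```

My guess would be DIL, since all DSC domains share the same positive/negative label space, but I would like to confirm.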

Finally, we have been able to reproduce several results on the DSC dataset, mostly within a couple of points of the tables (in this repo and in the CTR paper), but BERT frozen NCL consistently comes out 10-15% higher: we currently get an average accuracy of 0.8772 over 5 runs with different sequence seeds. Any idea why this naive approach would outperform the reported numbers?
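
In case we are running a different configuration, this is roughly the setup we assume "bert_frozen NCL" refers to (a sketch with our own names, not PyContinual's code):

```python
import torch.nn as nn
from transformers import BertModel

bert = BertModel.from_pretrained("bert-base-uncased")
for p in bert.parameters():
    p.requires_grad = False  # the encoder is never updated

classifier = nn.Linear(bert.config.hidden_size, 2)  # the only trained part
# NCL (naive continual learning): the classifier is fine-tuned on each task
# in sequence, with no replay, regularization, or task-specific modules.
```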