htqin / BiBERT

This project is the official implementation of our accepted ICLR 2022 paper BiBERT: Accurate Fully Binarized BERT.


Code release

NicoNico6 opened this issue · comments

Hi Haotong, it is nice to see such great work.

I am trying to re-implement BiBERT with the code provided in the ICLR 2022 supplementary materials. I used the suggested DynaBERT as the teacher model and obtained the same teacher accuracy (the 32-bit baseline). However, I could not reproduce the reported performance on GLUE with this code. For example, the accuracy is only 0.13/0.08 on STS-B and 0.833 on SST-2, far lower than the results reported in Table 2.

Could you help me with this issue? Is the code provided in the supplementary materials sufficient for re-implementation, or does the accuracy mismatch come from a wrong hyperparameter setting? (I did not modify the code, and I used the suggested DynaBERT as both the teacher and the pretrained model.)

Many Thanks!

Thanks for your attention to our BiBERT! The code in the supplementary material was not the final version, and we apologize for the trouble. Our formal code is now released on GitHub and OpenReview (https://openreview.net/forum?id=5xEgrl_5FAJ). We have reproduced the BiBERT results with the formal code and uploaded the well-trained models to Google Drive (https://drive.google.com/drive/folders/1xEEIynvsYuqqG6wRlMhSySUusZWoR1FL?usp=sharing). The results are updated in our camera-ready version.