htqin / BiBERT

This project is the official implementation of our accepted ICLR 2022 paper BiBERT: Accurate Fully Binarized BERT.


Code release

NicoNico6 opened this issue · comments

Hi Haotong, it is nice to see such great work.

I am trying to re-implement BiBERT with the code provided in the ICLR 2022 supplementary materials. I used the suggested DynaBERT as the teacher model and obtained the same teacher accuracy (the 32-bit baseline). However, I could not reproduce the reported performance on GLUE with this code. For example, the accuracy is only 0.13/0.08 on STS-B and 0.833 on SST-2, far lower than the results reported in Table 2.

Could you help me with this issue? Is the code provided in the supplementary materials sufficient for re-implementation, or does the accuracy mismatch come from a wrong hyperparameter setting? (I did not modify the code, and I used the suggested DynaBERT as both the teacher and the pretrained model.)

Many Thanks!

Thanks for your attention to our BiBERT! The code in the supplementary material was not the final version, and we apologize for the trouble. Our formal code is now released on GitHub and OpenReview (https://openreview.net/forum?id=5xEgrl_5FAJ). We have reproduced the BiBERT results with the formal code and uploaded the well-trained models to Google Drive (https://drive.google.com/drive/folders/1xEEIynvsYuqqG6wRlMhSySUusZWoR1FL?usp=sharing). The results are updated in our camera-ready version.