allenai / dont-stop-pretraining

Code associated with the Don't Stop Pretraining ACL 2020 paper


Does DAPT lead to forgetting over the original LM domain or overfitting over the target domain?

dr-GitHub-account opened this issue:

In the paper, DAPT is run on each domain for 12.5K steps using unlabeled data from the target domain only. I am wondering whether not including any unlabeled data from the original LM pretraining domain leads to detrimental forgetting of that domain, or to overfitting on the target domain.
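For what it's worth, one way to probe the forgetting side of this question empirically is to compare the masked-LM loss of the original roberta-base checkpoint against a DAPT'ed checkpoint on held-out text from the original pretraining distribution. The sketch below is not from this repo's codebase; the `held_out_lm_texts` list and the `dapt_checkpoint` path are illustrative assumptions.

```python
# Minimal sketch (not from the paper's code): compare masked-LM loss of the
# released roberta-base checkpoint vs. a DAPT'ed checkpoint on held-out text
# from the original pretraining distribution. A large loss increase after
# DAPT would suggest forgetting of the original LM domain.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

def mlm_loss(model_name_or_path, texts, mask_prob=0.15, seed=0):
    """Average masked-LM loss of a checkpoint over a list of raw texts."""
    tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
    model = AutoModelForMaskedLM.from_pretrained(model_name_or_path)
    model.eval()
    torch.manual_seed(seed)  # same masks for every checkpoint being compared
    total, n = 0.0, 0
    for text in texts:
        enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
        input_ids = enc["input_ids"].clone()
        labels = input_ids.clone()
        # Randomly mask ~15% of non-special tokens, as in RoBERTa pretraining.
        special = torch.tensor(
            tokenizer.get_special_tokens_mask(
                input_ids[0].tolist(), already_has_special_tokens=True
            ),
            dtype=torch.bool,
        )
        mask = (torch.rand(input_ids.shape[1]) < mask_prob) & ~special
        if not mask.any():
            continue  # skip degenerate samples with no masked tokens
        labels[0, ~mask] = -100  # compute loss only on masked positions
        input_ids[0, mask] = tokenizer.mask_token_id
        with torch.no_grad():
            out = model(
                input_ids=input_ids,
                attention_mask=enc["attention_mask"],
                labels=labels,
            )
        total += out.loss.item()
        n += 1
    return total / max(n, 1)

# held_out_lm_texts: held-out sentences from the original LM domain (assumed available)
# baseline = mlm_loss("roberta-base", held_out_lm_texts)
# after_dapt = mlm_loss("path/to/dapt_checkpoint", held_out_lm_texts)  # hypothetical path
```

The same seed is reused across calls so both checkpoints are scored on identical masks; since DAPT keeps the original tokenizer, the tokenizations match as well, making the two losses directly comparable.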