CUNY-CL / yoyodyne

Small-vocabulary sequence-to-sequence generation with optional feature conditioning

BERT Pretraining

bonham79 opened this issue · comments

Do we have any interest in adding a masking function to Yoyodyne to allow BERT-style training with the available models? This could feasibly improve inflection/g2p performance by allowing pretraining. It would also allow use of the library for LM-esque training.

I'll remove the suggestion if it goes against the underlying purpose of the library; just thought I'd ask, since it's related to a potential side project.
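For concreteness, here's a minimal sketch of what such a masking function might look like, assuming PyTorch batches of integer symbol IDs and hypothetical `mask_idx`/`pad_idx` special symbols; none of these names reflect yoyodyne's actual API, it's just to illustrate the BERT-style 80/10/10 corruption scheme:

```python
# A minimal sketch of BERT-style dynamic masking over batches of symbol IDs.
# The special-symbol indices (mask_idx, pad_idx) and the vocabulary size are
# assumptions for illustration, not part of yoyodyne's interface.
import torch


def mask_tokens(
    symbols: torch.Tensor,
    vocab_size: int,
    mask_idx: int,
    pad_idx: int,
    mask_prob: float = 0.15,
) -> tuple[torch.Tensor, torch.Tensor]:
    """Applies BERT-style masking to a batch of symbol IDs.

    Of the positions selected for prediction, 80% are replaced with the
    mask symbol, 10% with a random symbol, and 10% are left unchanged.

    Args:
        symbols: symbol IDs of shape (batch_size, seq_len).
        vocab_size: size of the symbol vocabulary.
        mask_idx: index of the mask symbol.
        pad_idx: index of the padding symbol.
        mask_prob: probability that a position is selected for prediction.

    Returns:
        (inputs, targets), where targets hold the original IDs at selected
        positions and pad_idx elsewhere (so the loss can ignore them).
    """
    inputs = symbols.clone()
    targets = symbols.clone()
    # Select positions to predict; never select padding.
    selectable = symbols != pad_idx
    selected = (
        torch.rand_like(symbols, dtype=torch.float) < mask_prob
    ) & selectable
    targets[~selected] = pad_idx
    # 80% of selected positions -> mask symbol.
    masked = (torch.rand_like(symbols, dtype=torch.float) < 0.8) & selected
    inputs[masked] = mask_idx
    # 10% of selected positions (half of the remainder) -> random symbol.
    randomized = (
        (torch.rand_like(symbols, dtype=torch.float) < 0.5) & selected & ~masked
    )
    inputs[randomized] = torch.randint(
        vocab_size, symbols.shape, dtype=symbols.dtype
    )[randomized]
    # The remaining 10% of selected positions are left unchanged.
    return inputs, targets
```

With this shape of output, the loss could just use `ignore_index=pad_idx`; for an encoder-decoder model it might instead make sense to feed the corrupted sequence to the encoder and have the decoder reconstruct the full original string, which is closer to denoising pretraining than strict masked prediction. Either way this is only a sketch of the idea.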

Not opposed but not on the roadmap, let's say. I don't have any particular need for this (and if I did I would probably use something already pretrained myself) but if you work the whole thing out we should definitely document and release it. I'll mark this closed for now (without prejudice) but you can reopen when/if you have PRs for this.

Just wanted to comment that I have a somewhat sloppy version of this implemented in an old version of yoyodyne, for something I was trying.

Agreed it's not on the roadmap, but I have some experiments I want to run this summer, so maybe we can talk then :).

@Adamits would it happen to involve this paper at all? https://arxiv.org/pdf/2201.10716.pdf

@bonham79 No, but thanks for the link I should read that! I have a preliminary result for something + a masters student running some further experiments + a proposal I want to work on this summer. We can discuss more in email if you are interested, or have something similar in mind?

@Adamits Sure! You can shoot me an email at travismbartley@gmail.com and we can discuss off-thread.