CUNY-CL / yoyodyne

Small-vocabulary sequence-to-sequence generation with optional feature conditioning

BERT Pretraining

bonham79 opened this issue · comments

Do we have any interest in adding a masking function to Yoyodyne to allow BERT-style training with the available models? This could feasibly improve inflection/g2p performance by allowing pretraining. It would also allow use of the library for LM-esque training.

I'll remove the suggestion if it goes against the underlying purpose of the library; just thought I'd ask, since it's related to a potential side project.
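For concreteness, here's a minimal sketch of what such a masking function might look like, assuming PyTorch batches of integer symbol IDs and hypothetical `mask_idx`/`pad_idx` special symbols; none of these names reflect yoyodyne's actual API, it's just to illustrate the BERT-style 80/10/10 corruption scheme:

```python
# A minimal sketch of BERT-style dynamic masking over batches of symbol IDs.
# The special-symbol indices (mask_idx, pad_idx) and the vocabulary size are
# assumptions for illustration, not part of yoyodyne's interface.
import torch


def mask_tokens(
    symbols: torch.Tensor,
    vocab_size: int,
    mask_idx: int,
    pad_idx: int,
    mask_prob: float = 0.15,
) -> tuple[torch.Tensor, torch.Tensor]:
    """Applies BERT-style masking to a batch of symbol IDs.

    Of the positions selected for prediction, 80% are replaced with the
    mask symbol, 10% with a random symbol, and 10% are left unchanged.

    Args:
        symbols: symbol IDs of shape (batch_size, seq_len).
        vocab_size: size of the symbol vocabulary.
        mask_idx: index of the mask symbol.
        pad_idx: index of the padding symbol.
        mask_prob: probability that a position is selected for prediction.

    Returns:
        (inputs, targets), where targets hold the original IDs at selected
        positions and pad_idx elsewhere (so the loss can ignore them).
    """
    inputs = symbols.clone()
    targets = symbols.clone()
    # Select positions to predict; never select padding.
    selectable = symbols != pad_idx
    selected = (
        torch.rand_like(symbols, dtype=torch.float) < mask_prob
    ) & selectable
    targets[~selected] = pad_idx
    # 80% of selected positions -> mask symbol.
    masked = (torch.rand_like(symbols, dtype=torch.float) < 0.8) & selected
    inputs[masked] = mask_idx
    # 10% of selected positions (half of the remainder) -> random symbol.
    randomized = (
        (torch.rand_like(symbols, dtype=torch.float) < 0.5) & selected & ~masked
    )
    inputs[randomized] = torch.randint(
        vocab_size, symbols.shape, dtype=symbols.dtype
    )[randomized]
    # The remaining 10% of selected positions are left unchanged.
    return inputs, targets
```

With this shape of output, the loss could just use `ignore_index=pad_idx`; for an encoder-decoder model it might instead make sense to feed the corrupted sequence to the encoder and have the decoder reconstruct the full original string, which is closer to denoising pretraining than strict masked prediction. Either way this is only a sketch of the idea.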

Not opposed but not on the roadmap, let's say. I don't have any particular need for this (and if I did I would probably use something already pretrained myself) but if you work the whole thing out we should definitely document and release it. I'll mark this closed for now (without prejudice) but you can reopen when/if you have PRs for this.

Just wanted to comment that I have a somewhat sloppy version of this implemented in an old version of yoyodyne, for something I was trying.

Agreed it's not on the roadmap, but I have some experiments I want to run this summer, so maybe we can talk then :).

@Adamits would it happen to involve this paper at all? https://arxiv.org/pdf/2201.10716.pdf

@bonham79 No, but thanks for the link I should read that! I have a preliminary result for something + a masters student running some further experiments + a proposal I want to work on this summer. We can discuss more in email if you are interested, or have something similar in mind?

@Adamits Sure! You can shoot me an email at travismbartley@gmail.com and we can discuss off-thread.