dandelin / ViLT

Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"

fine-tuning ViLT for MLM task with a new dataset

Ellyuca opened this issue

Hi. Thanks for providing the code for such great work. I am new to language models, so I apologize if my questions are trivial.

I am wondering whether it is possible to fine-tune the model for masked language modeling (MLM) on a new/different dataset.
Basically, I want a model that can predict the [MASK] token for a specific dataset (with custom text and images).
Could you please share how to do this?
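
To make the goal concrete, here is a minimal sketch of the text-masking step I have in mind. It is not taken from the ViLT code: the function name `mask_tokens`, the whitespace tokenization, and the 0.15 masking probability (a common BERT-style default) are all my own assumptions for illustration, and a real setup would use the model's tokenizer and pair each caption with its image.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Hypothetical helper: return (masked_tokens, labels) for MLM.

    labels[i] holds the original token where a mask was applied and None
    elsewhere, so a loss would only be computed on the masked positions.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    # Make sure at least one position is masked, so every sample has a target.
    if tokens and all(label is None for label in labels):
        i = rng.randrange(len(tokens))
        labels[i] = tokens[i]
        masked[i] = MASK_TOKEN
    return masked, labels


# Assumed example caption from a custom dataset, split on whitespace.
caption = "a dog catches a frisbee in the park".split()
masked, labels = mask_tokens(caption, seed=0)
print(masked)
print(labels)
```

The model would then be trained to recover the entries in `labels` from `masked` together with the corresponding image.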

Thanks in advance for your time and help.
Best regards.