How is the "lilt-only-base" bin file created?
vibeeshan025 opened this issue
Can you please provide us with more information regarding the "lilt-only-base" file and how the model was created?
Since the base file is just 22 MB in size, I would like to know what kind of dataset, parameters, or logic was used to create it.
I am trying to figure out what possibilities are available and what the starting point is for learning how to create such models. Please give me more references to read.
Hi,
you can read our original paper at https://aclanthology.org/2022.acl-long.534/. As explained there, LiLT-base + En-RoBERTa is pre-trained on English documents, and the provided "lilt-only-base" is exactly the pre-trained LiLT-base part. It can be combined with different textual models to handle documents in different languages during fine-tuning.
I understand the usage, but I am very curious how the "lilt-only-base" file itself was created. You mention the "pre-trained LiLT-base part"; how is that specific base part created?
We all know how roberta-en is created, and your provided code shows how "gen_weight_roberta_like.py" generates the base + RoBERTa model.
What does the base part contain?
PyTorch uses a dict-like format to store weight name-value pairs in 'pytorch_model.bin' files. We simply select the name-value pairs belonging to the LiLT part, by weight name, from the pre-trained checkpoint to create "lilt-only-base".
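In other words, the extraction is just a key-based filter over the checkpoint's state dict. Here is a minimal sketch of that idea; the `"lilt."` prefix and the toy weight names are assumptions for illustration (the real names depend on the checkpoint's module structure), and a plain dict stands in for the tensors that `torch.load("pytorch_model.bin")` would return:

```python
from collections import OrderedDict

def filter_lilt_weights(state_dict, prefix="lilt."):
    """Keep only the name-value pairs whose weight names belong to the LiLT part."""
    return OrderedDict(
        (name, value) for name, value in state_dict.items()
        if name.startswith(prefix)
    )

# Toy checkpoint standing in for torch.load("pytorch_model.bin");
# the weight names below are hypothetical examples.
full_checkpoint = OrderedDict([
    ("lilt.layout_embeddings.weight", "layout-tensor"),
    ("lilt.encoder.layer.0.attention.weight", "attn-tensor"),
    ("roberta.embeddings.word_embeddings.weight", "text-tensor"),
])

lilt_only = filter_lilt_weights(full_checkpoint)
# In the real pipeline, torch.save(lilt_only, "pytorch_model.bin")
# would then write out the "lilt-only-base" file.
```

Because only the (small) layout-flow weights survive the filter, the resulting file is far smaller than the full LiLT-base + En-RoBERTa checkpoint, which is consistent with the 22 MB size mentioned above.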