Question on Fine-Tuning ESM-2 Model with paired sequence data

KrajShuffle opened this issue 4 days ago

Hi,

I recently read this lab group's paper, and I appreciate the level of detail and justification in both the methodology and the reported observations. I was wondering how and why the double "cls" token was chosen as the separator between the heavy and light chains when fine-tuning the ESM-2 model on the concatenated chain sequences. I understand that ESM-2 was trained on single-chain data, so I am curious about which considerations or alternative approaches were tested when deciding how to format the paired-sequence data before feeding it into the model.
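To make sure I am describing the format correctly, here is a minimal sketch of my understanding. The helper name `format_paired` and the short toy sequences are my own placeholders, and I am assuming the separator is two literal `<cls>` tokens placed between the chains; the actual preprocessing used in the paper may differ.

```python
# Sketch only: my reading of the paired-sequence format, not the authors' code.
# Assumption: the separator is two copies of the "<cls>" special token.
CLS = "<cls>"


def format_paired(heavy: str, light: str, sep: str = CLS * 2) -> str:
    """Concatenate heavy and light chain sequences around a separator."""
    return f"{heavy}{sep}{light}"


# Toy fragments, made up for illustration (not real antibody data).
paired = format_paired("EVQLVESGG", "DIQMTQSPS")
print(paired)
```

If this is roughly right, my question is why this separator was preferred over alternatives such as a single special token or a linker of ordinary residues.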

Thanks for reading this and I would appreciate any insight on this topic!

Best,
Karthik