TobiasHeOl / AbLang2

An antibody-specific language model focusing on NGL prediction

MMSeqs2 parameters

mhorlacher opened this issue

The paper states that sequences were split into train/eval/test via MMSeqs2 clustering with a sequence_identity threshold of 0.95. Could you provide the full set of parameters used for clustering, or were the remaining ones left at their defaults? Thanks!

Hi Marc, thank you for the question. Other than using "--cov-mode 1" to better handle fragmented sequences, the rest of the parameters used for clustering were the default ones.

I hope this helps!

Helps a lot, thanks!
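
For anyone trying to reproduce the split, here is a minimal sketch of how the settings mentioned in this thread might be passed to MMseqs2. It is not the author's actual command. It assumes the `easy-cluster` workflow was used (the thread does not say which workflow) and that the 0.95 sequence identity threshold corresponds to `--min-seq-id 0.95`. The file names and the helper `build_cluster_command` are hypothetical. Every other parameter is left at the MMseqs2 default, as described in the answer above.

```python
def build_cluster_command(fasta, out_prefix, tmp_dir, min_seq_id=0.95, cov_mode=1):
    """Build an MMseqs2 clustering command as a list of arguments.

    Assumption: the `easy-cluster` workflow; the thread only confirms
    the identity threshold and `--cov-mode 1`.
    """
    return [
        "mmseqs", "easy-cluster",
        str(fasta), str(out_prefix), str(tmp_dir),
        "--min-seq-id", str(min_seq_id),  # identity threshold from the paper
        "--cov-mode", str(cov_mode),      # the one non-default flag named in the thread
    ]


if __name__ == "__main__":
    # Hypothetical input and output names, for illustration only.
    cmd = build_cluster_command("sequences.fasta", "clusters", "tmp")
    print(" ".join(cmd))
```

The printed command can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` on a machine where `mmseqs` is installed.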