Use LiLT / an alternative model with more than 512 tokens
coding-kt opened this issue · comments
Hi,
LiLT processes a maximum of 512 tokens.
Is there a good option for a comparable, commercially usable model that can process more tokens?
It is of course possible to split longer inputs into 512-token chunks, but this comes with some disadvantages and difficulties (e.g. entities cut at chunk boundaries and extra merging logic).
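For reference, the chunking workaround I mean looks roughly like this: a minimal sketch (names are my own, not from LiLT) that splits a tokenized sequence into overlapping 512-token windows, so predictions near a boundary can be recovered from the neighboring chunk:

```python
def chunk_tokens(token_ids, max_len=512, overlap=128):
    """Split a token id sequence into chunks of at most max_len tokens.

    Consecutive chunks overlap by `overlap` tokens so that entities cut
    at a chunk boundary are still seen whole in the neighboring chunk.
    """
    step = max_len - overlap  # how far each new chunk's start advances
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + max_len])
        if start + max_len >= len(token_ids):
            break  # last chunk already covers the end of the sequence
    return chunks
```

Merging the per-chunk predictions back together (deciding which chunk "owns" the overlapping tokens) is the difficult part in practice.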
Hi,
LiLT uses a maximum sequence length of 512 during the pre-training phase. The most direct way to process a long document is to split it into chunks of length 512. Alternatively, you can consider linearly resizing the position embeddings and then fine-tuning the model.