beichenzbc / Long-CLIP

[ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"

Tokenize

ucasyjz opened this issue · comments

Is there any difference between `longclip.tokenize` and the original CLIP tokenizer in diffusers? Can you give me some guidance? Thanks. I changed the length from 77 to 248 in the original CLIP tokenizer config, but the output feature embeddings differ from those produced with `longclip.tokenize`.

There is no difference apart from the positional embedding. The file ./model/simple_tokenizer.py is unchanged. You may refer to ./model/longclip.py for further details.
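To illustrate the point above, here is a minimal, hypothetical sketch (not the repo's actual code) of how a CLIP-style `tokenize` pads BPE token ids to a given context length. Changing the length from 77 to 248 leaves the token ids themselves identical; only the run of trailing pad tokens grows, so any difference in the output features comes from the model's positional embedding, not from the tokenizer.

```python
# Illustrative sketch, assuming CLIP's conventional special-token ids.
SOT, EOT, PAD = 49406, 49407, 0  # start-of-text, end-of-text, padding

def tokenize(bpe_ids, context_length):
    """Wrap ids with start/end tokens and pad to context_length."""
    ids = [SOT] + list(bpe_ids) + [EOT]
    if len(ids) > context_length:
        raise ValueError("input too long for context_length")
    return ids + [PAD] * (context_length - len(ids))

bpe_ids = [320, 1125, 539, 320, 2368]  # example BPE ids for some caption

short = tokenize(bpe_ids, 77)    # original CLIP context length
long = tokenize(bpe_ids, 248)    # Long-CLIP context length

# The meaningful prefix is identical; only the padding length differs.
assert long[:77] == short
assert all(t == PAD for t in long[len(bpe_ids) + 2:])
```

Since the token ids agree, identical text encoders would produce identical features here; Long-CLIP's features differ because its positional embedding is stretched to cover 248 positions.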