Zasder3 / train-CLIP

A PyTorch Lightning solution to training OpenAI's CLIP from scratch.

custom tokenizer and text encoder

sinjohr opened this issue

I want to use a custom tokenizer and text encoder, with the tokenizer trained using Hugging Face tokenizers.

After training the Hugging Face tokenizer, I got a JSON file containing the vocabulary.

However, I don't know how to pass this custom tokenizer to train_finetune.py.

Could you give some guidance on how to set up and use a custom tokenizer?

I have the same problem. Please reply if you solve it. Thank you.
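
For anyone landing here with the same question, the following is a minimal sketch of the tokenizer half of the problem: reloading the saved JSON with the Hugging Face `tokenizers` library and producing fixed-length token ids. It is not taken from this repository. The toy captions, the `[PAD]` and `[UNK]` token names, the file name `tokenizer.json`, and `CONTEXT_LENGTH = 16` are all assumptions made for illustration. How train_finetune.py actually consumes a tokenizer is not described in this thread, so the sketch stops at the point where the ids are ready to be turned into tensors.

```python
import os
import tempfile

from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Hypothetical toy captions standing in for the real training text.
captions = [
    "a photo of a cat",
    "a photo of a dog",
    "a drawing of a cat on a sofa",
]

# Assumed fixed sequence length expected by the text encoder.
CONTEXT_LENGTH = 16

# Train a tiny BPE tokenizer; in practice this step is already done
# and only the saved JSON file is available.
tokenizer = Tokenizer(models.BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()
trainer = trainers.BpeTrainer(vocab_size=200, special_tokens=["[PAD]", "[UNK]"])
tokenizer.train_from_iterator(captions, trainer=trainer)

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "tokenizer.json")  # assumed file name
    tokenizer.save(json_path)

    # Reload the tokenizer from the JSON file, as one would with a
    # tokenizer trained earlier.
    loaded = Tokenizer.from_file(json_path)

# Pad and truncate every caption to the same length so a batch can be stacked.
pad_id = loaded.token_to_id("[PAD]")
loaded.enable_padding(pad_id=pad_id, pad_token="[PAD]", length=CONTEXT_LENGTH)
loaded.enable_truncation(max_length=CONTEXT_LENGTH)

batch = loaded.encode_batch(["a photo of a cat", "a photo of a dog"])
input_ids = [enc.ids for enc in batch]
attention_mask = [enc.attention_mask for enc in batch]

print(loaded.get_vocab_size(), len(input_ids), len(input_ids[0]))
```

Two things to check before wiring this into training. First, the text encoder's embedding table has to cover `loaded.get_vocab_size()` entries, otherwise the ids will index out of range. Second, if the training script expects a transformers-style tokenizer object (an assumption, not confirmed here), the same JSON file can be wrapped with `PreTrainedTokenizerFast(tokenizer_file="tokenizer.json")` from the transformers library.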