invictus717 / MetaTransformer

Meta-Transformer for Unified Multimodal Learning

Home Page: https://arxiv.org/abs/2307.10802

Code for Tokenization?

s4lome opened this issue

Thank you for sharing this most exciting work!

I would like to know: has the code for tokenizing the different modalities not been released yet, or am I simply missing where the tokenization happens in the code?

I would like to use Meta-Transformer on a custom dataset with image and text inputs.

As far as I understand it, the workflow would be:

token_text, token_image = tokenize(text), tokenize(image)

embedding_text = pretrained_encoder(token_text)  # as described in demo
embedding_image = pretrained_encoder(token_image)  # as described in demo

downstream_task(embedding_text, embedding_image) 

Is this correct at a very high level?

Thanks in advance!

Thank you for your interest in Meta-Transformer. The tokenization code will be released in 1-2 days; I've been working on it for about 10 days and hope it will be easy to use. For your custom dataset, your pseudocode is accurate.
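
For reference, here is a minimal, self-contained PyTorch sketch of that workflow. The tokenizers and the encoder below are toy stand-ins (a patch-embedding conv, a word embedding, and a generic 12-layer, 768-dim Transformer), not the released Meta-Transformer modules; only the overall flow follows the pseudocode above: tokenize each modality, run the shared pretrained encoder, then feed both embeddings to a downstream head.

import torch
import torch.nn as nn

DIM = 768  # embedding dimension assumed for the base encoder

# Toy stand-in tokenizers (the official ones are not released yet):
# each maps a raw input to a token sequence of shape (B, seq_len, DIM).
class ToyImageTokenizer(nn.Module):
    def __init__(self, patch=16, dim=DIM):
        super().__init__()
        self.proj = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
    def forward(self, x):                                 # x: (B, 3, H, W)
        return self.proj(x).flatten(2).transpose(1, 2)    # (B, N, DIM)

class ToyTextTokenizer(nn.Module):
    def __init__(self, vocab=30522, dim=DIM):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
    def forward(self, ids):                               # ids: (B, L)
        return self.embed(ids)                            # (B, L, DIM)

# Stand-in for the frozen pretrained encoder from the demo.
pretrained_encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=DIM, nhead=12, batch_first=True),
    num_layers=12,
)

image_tokenizer, text_tokenizer = ToyImageTokenizer(), ToyTextTokenizer()

image = torch.randn(2, 3, 224, 224)          # dummy image batch
text = torch.randint(0, 30522, (2, 32))      # dummy token-id batch

token_image, token_text = image_tokenizer(image), text_tokenizer(text)
embedding_image = pretrained_encoder(token_image)   # as described in demo
embedding_text = pretrained_encoder(token_text)     # as described in demo

# Downstream task: pool each sequence and classify on the fused features.
fused = torch.cat([embedding_image.mean(1), embedding_text.mean(1)], dim=-1)
logits = nn.Linear(2 * DIM, 10)(fused)       # e.g. a hypothetical 10-class head

Once the official tokenizers are out, only the two tokenizer classes and the encoder construction should need swapping; the downstream part stays the same.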

If you have additional questions, please feel free to let me know; I'm happy to help~