Hyperparameter for training SEED Tokenizer
Cheolhyun-Mun opened this issue
Hi!
Thank you for the wonderful work.
Could you provide detailed information on training the SEED Tokenizer? I cannot find the training hyperparameters in your paper.
Also, I have another question.
In the paper, SEED Tokenizer training is divided into two stages. Does that mean the Q-Former is pre-trained in stage 1, and then the Q-Former, codebook, decoder, and MLP are trained together in stage 2?
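For concreteness, the two-stage reading in the question can be sketched as selectively freezing parameter groups between stages. This is only an illustrative sketch: the module names, sizes, and the exact set of frozen components are assumptions for the sake of the example, not the authors' implementation.

```python
import torch.nn as nn

# Hypothetical, minimal stand-ins for the SEED tokenizer components.
# Real shapes and architectures differ; this only illustrates stage-wise freezing.
class SeedTokenizerSketch(nn.Module):
    def __init__(self, dim=16, codebook_size=32):
        super().__init__()
        self.q_former = nn.Linear(dim, dim)           # stand-in for the Q-Former
        self.codebook = nn.Embedding(codebook_size, dim)
        self.mlp = nn.Linear(dim, dim)
        self.decoder = nn.Linear(dim, dim)

def set_trainable(modules, flag):
    """Freeze or unfreeze every parameter in the given modules."""
    for m in modules:
        for p in m.parameters():
            p.requires_grad = flag

model = SeedTokenizerSketch()

# Stage 1 (as read in the question): pre-train only the Q-Former.
set_trainable([model.codebook, model.mlp, model.decoder], False)
set_trainable([model.q_former], True)
stage1_params = [n for n, p in model.named_parameters() if p.requires_grad]

# Stage 2: jointly train Q-Former, codebook, MLP, and decoder.
set_trainable([model.codebook, model.mlp, model.decoder], True)
stage2_params = [n for n, p in model.named_parameters() if p.requires_grad]
```

In stage 1 only the `q_former.*` parameters receive gradients; in stage 2 the full set is trainable.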
Thank you.
We have released the training code of SEED-LLaMa, including the SEED tokenizer, multimodal LLM pretraining, and instruction tuning. Our multimodal LLM training codebase supports (1) large-scale multi-node training with DeepSpeed and (2) highly efficient multiple training datapipes.
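For readers unfamiliar with DeepSpeed, multi-node training is typically driven by a JSON config passed to the launcher. The fragment below is a generic sketch of such a config (batch sizes and ZeRO stage are placeholder values, not the settings used for SEED-LLaMa; consult the released code for the actual configuration).

```json
{
  "train_micro_batch_size_per_gpu": 4,
  "gradient_accumulation_steps": 8,
  "fp16": { "enabled": true },
  "zero_optimization": { "stage": 2 }
}
```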