XueFuzhao / OpenMoE

A family of open-sourced Mixture-of-Experts (MoE) Large Language Models

Code is not friendly

Lisennlp opened this issue · comments

I am very interested in this work, but I found that the code structure is very deep: many configs are buried and hard-coded, which makes it difficult to run.

Thank you for the interest! Our JAX training code is largely based on T5X (https://github.com/google-research/t5x), so there is not much we can change on that side. (Personally, I find it fairly easy to use; the T5X documentation may help.) If you would rather use PyTorch, we have a demo here (https://colab.research.google.com/drive/1xIfIVafnlCP2XVICmRwkUFK3cwTJYjCY). You can run it with something like: https://github.com/XueFuzhao/OpenMoE?tab=readme-ov-file#inference-with-pytorch
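For reference, the PyTorch route usually boils down to a few lines with Hugging Face transformers. This is only a minimal sketch: the checkpoint id and prompt below are placeholders, not confirmed names from this repo, so please check the README section linked above for the exact checkpoint names and any extra dependencies.

```python
# Minimal sketch of PyTorch inference via transformers (not the official script).
# The checkpoint id below is a placeholder; see the repo README for real names.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "OrionZheng/openmoe-base"  # hypothetical checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```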

Hope this helps :)

Thanks for the reply. I followed the TPU section of the README:

git clone https://github.com/XueFuzhao/OpenMoE.git
bash OpenMoE/script/run_pretrain.sh

As far as I understand, the configuration file is t5x/t5x/examples/t5/t5_1_1/examples/openmoe_large.gin.

Therefore, I modified the sentencepiece model path inside it, but the path you set still shows up (gs://fuzhao/tokenizers/umt5.256000/sentencepiece.model).

[screenshot attached]
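My guess (unconfirmed) is that the tokenizer path is also hard-coded in the seqio task definitions, not only in the gin file; in typical T5X/seqio setups the vocabulary is set through output_features, as in the sketch below, so both places may need to point at the local file. The path here is just a placeholder.

```python
# Sketch of a seqio output_features definition; the local path is a placeholder,
# not the repo's actual file.
import seqio

LOCAL_SPM = "/path/to/sentencepiece.model"  # replace with your local tokenizer
vocab = seqio.SentencePieceVocabulary(LOCAL_SPM)

OUTPUT_FEATURES = {
    "inputs": seqio.Feature(vocabulary=vocab, add_eos=True),
    "targets": seqio.Feature(vocabulary=vocab, add_eos=True),
}
```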

In addition, if I want to change the training dataset (for example the C4 data), directly setting MIXTURE_OR_TASK_NAME='c4' does not work.
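As far as I can tell, MIXTURE_OR_TASK_NAME has to match a task or mixture that is already registered with seqio, so swapping datasets probably means registering a new task first. Here is a rough sketch of what I think that registration would look like for the TFDS C4 data; the task name, path, and preprocessors are my placeholders, not the repo's actual setup.

```python
# Rough sketch of registering a custom seqio task so that
# MIXTURE_OR_TASK_NAME = 'my_c4_lm' can refer to it; names/paths are placeholders.
import functools
import seqio
from t5.data import preprocessors as t5_prep  # assumes the t5 package is installed

vocab = seqio.SentencePieceVocabulary("/path/to/sentencepiece.model")  # placeholder

seqio.TaskRegistry.add(
    "my_c4_lm",
    source=seqio.TfdsDataSource(tfds_name="c4/en:3.0.1"),
    preprocessors=[
        functools.partial(t5_prep.rekey, key_map={"inputs": None, "targets": "text"}),
        seqio.preprocessors.tokenize,
        seqio.preprocessors.append_eos_after_trim,
    ],
    output_features={
        "inputs": seqio.Feature(vocabulary=vocab, add_eos=True),
        "targets": seqio.Feature(vocabulary=vocab, add_eos=True),
    },
)
```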

For people who are not familiar with this codebase, a more detailed README would be very helpful; I hope the authors can consider adding one.
Thank you.