XueFuzhao / OpenMoE

A family of open-sourced Mixture-of-Experts (MoE) Large Language Models

Code is not friendly

Lisennlp opened this issue · comments

I am very interested in this work, but I found that the code structure is very deep: many configs are buried and hard-coded, which makes it difficult to run.

Thank you for the interest! Our JAX training code is largely based on T5X (https://github.com/google-research/t5x), so there is not much we can change on that side. (Personally, I find it fairly easy to use; the T5X documentation may help.) If you would rather use PyTorch, we have a demo here (https://colab.research.google.com/drive/1xIfIVafnlCP2XVICmRwkUFK3cwTJYjCY). You can run it with something like: https://github.com/XueFuzhao/OpenMoE?tab=readme-ov-file#inference-with-pytorch
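For reference, the PyTorch route usually boils down to a few lines with Hugging Face transformers. This is only a minimal sketch: the checkpoint id and prompt below are placeholders, not confirmed names from this repo, so please check the README section linked above for the exact checkpoint names and any extra dependencies.

```python
# Minimal sketch of PyTorch inference via transformers (not the official script).
# The checkpoint id below is a placeholder; see the repo README for real names.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "OrionZheng/openmoe-base"  # hypothetical checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```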

Hope this helps :)

Thanks for the reply. I followed the TPU section of the README:

git clone https://github.com/XueFuzhao/OpenMoE.git
bash OpenMoE/script/run_pretrain.sh

As far as I understand, the configuration file is t5x/t5x/examples/t5/t5_1_1/examples/openmoe_large.gin.

Therefore, I modified the sentencepiece model path inside it, but the path you set still shows up (gs://fuzhao/tokenizers/umt5.256000/sentencepiece.model).

[screenshot attached]
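My guess (unconfirmed) is that the tokenizer path is also hard-coded in the seqio task definitions, not only in the gin file; in typical T5X/seqio setups the vocabulary is set through output_features, as in the sketch below, so both places may need to point at the local file. The path here is just a placeholder.

```python
# Sketch of a seqio output_features definition; the local path is a placeholder,
# not the repo's actual file.
import seqio

LOCAL_SPM = "/path/to/sentencepiece.model"  # replace with your local tokenizer
vocab = seqio.SentencePieceVocabulary(LOCAL_SPM)

OUTPUT_FEATURES = {
    "inputs": seqio.Feature(vocabulary=vocab, add_eos=True),
    "targets": seqio.Feature(vocabulary=vocab, add_eos=True),
}
```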

In addition, if I want to change the training dataset (for example the C4 data), directly setting MIXTURE_OR_TASK_NAME='c4' does not work.
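As far as I can tell, MIXTURE_OR_TASK_NAME has to match a task or mixture that is already registered with seqio, so swapping datasets probably means registering a new task first. Here is a rough sketch of what I think that registration would look like for the TFDS C4 data; the task name, path, and preprocessors are my placeholders, not the repo's actual setup.

```python
# Rough sketch of registering a custom seqio task so that
# MIXTURE_OR_TASK_NAME = 'my_c4_lm' can refer to it; names/paths are placeholders.
import functools
import seqio
from t5.data import preprocessors as t5_prep  # assumes the t5 package is installed

vocab = seqio.SentencePieceVocabulary("/path/to/sentencepiece.model")  # placeholder

seqio.TaskRegistry.add(
    "my_c4_lm",
    source=seqio.TfdsDataSource(tfds_name="c4/en:3.0.1"),
    preprocessors=[
        functools.partial(t5_prep.rekey, key_map={"inputs": None, "targets": "text"}),
        seqio.preprocessors.tokenize,
        seqio.preprocessors.append_eos_after_trim,
    ],
    output_features={
        "inputs": seqio.Feature(vocabulary=vocab, add_eos=True),
        "targets": seqio.Feature(vocabulary=vocab, add_eos=True),
    },
)
```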

For people who are not familiar with this codebase, a more detailed README would be very helpful; I hope the authors can consider adding one.
Thank you.