Convert Gemma weights
borisdayma opened this issue · comments
Hi,
Could you confirm which commit of google/grain to use when converting the Gemma weights?
It returns an error when using the latest commit of both maxtext and grain, as reported in google/grain#333:
~/maxtext$ python MaxText/convert_gemma_chkpt.py --base_model_path $CHKPT_BUCKET/2b --maxtext_model_path $MODEL_BUCKET/2b --model_size 2b
Traceback (most recent call last):
  File "/home/boris/maxtext/MaxText/convert_gemma_chkpt.py", line 33, in <module>
    import checkpointing
  File "/home/boris/maxtext/MaxText/checkpointing.py", line 25, in <module>
    import grain.python as grain
  File "/home/boris/grain/grain/python.py", line 21, in <module>
    from . import python_experimental as experimental
  File "/home/boris/grain/grain/python_experimental.py", line 22, in <module>
    from . import python_lazy_dataset as lazy_dataset
  File "/home/boris/grain/grain/python_lazy_dataset.py", line 54, in <module>
    from ._src.python.lazy_dataset.transformations.shuffle import ShuffleLazyMapDataset
  File "/home/boris/grain/grain/_src/python/lazy_dataset/transformations/shuffle.py", line 18, in <module>
    from grain._src.python.experimental.index_shuffle.python import index_shuffle_module as index_shuffle
ImportError: cannot import name 'index_shuffle_module' from 'grain._src.python.experimental.index_shuffle.python' (unknown location)
A bit of progress:
# remove grain-nightly if installed from google/grain repo
pip uninstall grain-nightly
# install all requirements (seems a bit much just to convert weights…)
pip install -r requirements.txt
This works. Make sure to avoid conflicts between TensorFlow versions (for example, if you also have a CPU-only build installed).
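To check which grain install the environment actually picks up, something like the following might help. It is only a minimal sketch: the helper names (`installed_version`, `module_origin`, `can_import`) are made up for illustration, and the distribution names `grain` and `grain-nightly` are simply the ones mentioned in this thread. It uses only the Python standard library.

```python
import importlib
import importlib.metadata as md
import importlib.util


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return md.version(dist_name)
    except md.PackageNotFoundError:
        return None


def module_origin(module_name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec is not None else None


def can_import(module_name):
    """Return (True, None) if the module imports, else (False, error message)."""
    try:
        importlib.import_module(module_name)
        return True, None
    except ImportError as exc:
        return False, str(exc)


if __name__ == "__main__":
    # Distribution names below are assumptions taken from this thread.
    for dist in ("grain", "grain-nightly"):
        print(dist, "->", installed_version(dist))
    # A path inside a local source checkout here would suggest that the
    # cloned repo is shadowing the packaged install.
    print("grain resolves to:", module_origin("grain"))
    ok, err = can_import("grain.python")
    print("import grain.python:", "ok" if ok else f"failed: {err}")
```

If `grain resolves to:` points into a cloned repository (as the `/home/boris/grain/...` paths in the traceback do), the import is coming from the source checkout rather than from an installed package.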
I'm very confused by this thread, Boris -- our tests all look green. I'm trying to imagine what the difference could be?
I initially ran the script, then saw I needed grain.
So then I tried pip install grain, but it didn't work, so I cloned google/grain and installed from it, which leads to this error.
The trick was to install grain-nightly. I'm closing it as it's resolved.