google / maxtext

A simple, performant and scalable Jax LLM!

Convert Gemma weights

borisdayma opened this issue

Hi,

Could you confirm which commit of google/grain to use when converting the Gemma weights?

The conversion script fails with an ImportError when I use the latest commits of both maxtext and grain, as reported in google/grain#333:

~/maxtext$ python MaxText/convert_gemma_chkpt.py --base_model_path $CHKPT_BUCKET/2b --maxtext_model_path $MODEL_BUCKET/2b --model_size 2b
Traceback (most recent call last):
  File "/home/boris/maxtext/MaxText/convert_gemma_chkpt.py", line 33, in <module>
    import checkpointing
  File "/home/boris/maxtext/MaxText/checkpointing.py", line 25, in <module>
    import grain.python as grain
  File "/home/boris/grain/grain/python.py", line 21, in <module>
    from . import python_experimental as experimental
  File "/home/boris/grain/grain/python_experimental.py", line 22, in <module>
    from . import python_lazy_dataset as lazy_dataset
  File "/home/boris/grain/grain/python_lazy_dataset.py", line 54, in <module>
    from ._src.python.lazy_dataset.transformations.shuffle import ShuffleLazyMapDataset
  File "/home/boris/grain/grain/_src/python/lazy_dataset/transformations/shuffle.py", line 18, in <module>
    from grain._src.python.experimental.index_shuffle.python import index_shuffle_module as index_shuffle
ImportError: cannot import name 'index_shuffle_module' from 'grain._src.python.experimental.index_shuffle.python' (unknown location)

A bit of progress:

# remove grain-nightly if installed from google/grain repo
pip uninstall grain-nightly

# install all requirements (seems a bit much just to convert weights…)
pip install -r requirements.txt

This works. Make sure to avoid conflicts between tensorflow versions (for example, if you also have a CPU-only build installed).
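For anyone hitting the same ImportError, a quick check of which grain installation Python actually picks up may help. The sketch below is only a diagnostic under assumptions: `describe_install` is a hypothetical helper (not part of maxtext or grain), and the distribution names `grain` and `grain-nightly` are taken from the commands in this thread. It uses only the standard library and does not import grain itself.

```python
import importlib.metadata
import importlib.util


def describe_install(module_name, dist_names):
    """Report where a module would be imported from and which distributions are installed.

    Hypothetical helper for debugging; not part of maxtext or grain.
    """
    # find_spec locates a top-level module without executing it.
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        location = None
    elif spec.origin is not None:
        location = spec.origin
    else:
        # Namespace packages have no origin; report their search paths instead.
        location = list(spec.submodule_search_locations or [])

    versions = {}
    for dist in dist_names:
        try:
            versions[dist] = importlib.metadata.version(dist)
        except importlib.metadata.PackageNotFoundError:
            # This distribution is not installed in the current environment.
            versions[dist] = None
    return {"module": module_name, "location": location, "versions": versions}


if __name__ == "__main__":
    # Distribution names assumed from the pip commands in this thread.
    print(describe_install("grain", ["grain", "grain-nightly"]))
```

If the reported location points into a local clone (such as /home/boris/grain in the traceback above) rather than site-packages, the source checkout is probably shadowing the packaged version.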

I'm very confused by this thread Boris -- our tests all look green. I'm trying to imagine what the difference could be?

I initially ran the script, then saw I needed grain.
So then I tried pip install grain, but it didn't work, so I cloned google/grain and installed from it, which led to this error.

The trick was to install grain-nightly through requirements.txt instead of installing grain from a source checkout. I'm closing this issue as it's resolved.