EleutherAI / gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

Home Page: https://www.eleuther.ai/

Cannot convert neox model to HF

srivassid opened this issue · comments

Describe the bug
I get an error while converting a NeoX model to HF. The error that I get:

Traceback (most recent call last):
  File "/media/sid/WDInternal/stability_ai/gpt-neox/tools/convert_sequential_to_hf.py", line 318, in <module>
    hf_model = convert(args.input_dir, loaded_config, args.output_dir)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/sid/WDInternal/stability_ai/gpt-neox/tools/convert_sequential_to_hf.py", line 156, in convert
    hf_config = create_config(loaded_config)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/sid/WDInternal/stability_ai/gpt-neox/tools/convert_sequential_to_hf.py", line 108, in create_config
    tokenizer = build_tokenizer(args)
                ^^^^^^^^^^^^^^^^^^^^^
  File "/media/sid/WDInternal/stability_ai/gpt-neox/megatron/tokenizer/tokenizer.py", line 37, in build_tokenizer
    if args.tokenizer_type.lower() == "GPT2BPETokenizer".lower():
       ^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'lower'
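The bottom frame shows what is going on: build_tokenizer compares args.tokenizer_type against the known tokenizer names, so if the loaded config has no tokenizer_type entry the attribute is None and the .lower() call fails. A minimal sketch of that failure mode (the SimpleNamespace stand-in below is only illustrative, not the actual NeoX args object):

from types import SimpleNamespace

# Stand-in for the parsed NeoX arguments when the .yml sets no "tokenizer_type"
args = SimpleNamespace(tokenizer_type=None)

# Same comparison as in megatron/tokenizer/tokenizer.py (see traceback above)
if args.tokenizer_type.lower() == "GPT2BPETokenizer".lower():
    pass
# -> AttributeError: 'NoneType' object has no attribute 'lower'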

To Reproduce

python tools/ckpts/convert_neox_to_hf.py --input_dir checkpoints/global_step5000/ --config_fil checkpoints/global_step5000/configs/125M.yml --output_dir hf/

Expected behavior
The model should be converted to HF format.

Environment (please complete the following information):

  • GPUs: 3x3090
  • Configs:

If you look at checkpoints/global_step5000/configs/125M.yml, what is the value of tokenizer_type?
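One quick way to check, assuming the config parses as plain YAML and PyYAML is installed (the path is taken from the repro command above):

import yaml

with open("checkpoints/global_step5000/configs/125M.yml") as f:
    cfg = yaml.safe_load(f)

# Prints None if the key is absent from the config
print(cfg.get("tokenizer_type"))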

commented

This error occurs when your config doesn't include tokenizer_type, which is required to generate the HF config. If you are not sure which tokenizer the checkpoint was trained with, you can simply add "tokenizer_type": "GPT2BPETokenizer" to your .yml file.
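If you would rather script that change than edit the file by hand, a small sketch along the same lines (again assuming a plain-YAML config and PyYAML; note that round-tripping through yaml.safe_dump drops comments and reorders keys, so back the file up first):

import yaml

path = "checkpoints/global_step5000/configs/125M.yml"
with open(path) as f:
    cfg = yaml.safe_load(f)

# Only add the key if it is missing, per the suggestion above
cfg.setdefault("tokenizer_type", "GPT2BPETokenizer")

with open(path, "w") as f:
    yaml.safe_dump(cfg, f)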