EleutherAI / gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

Home Page: https://www.eleuther.ai/

convert_hf_to_module(pipeline_parallel>1)

liuxinxin123 opened this issue

hi,
I see that `convert_hf_to_sequential.py` supports converting a HuggingFace transformers model to a NeoX model, but only without pipeline parallelism.
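
For reference, this is roughly how I drive the existing script for the no-PP case; the script path and flag names below are assumptions from my local checkout, so the real interface may differ (check the script's `--help`):

```python
# Minimal sketch of driving the existing HF -> NeoX conversion for the no-PP
# case. Script path and flag names are assumptions from my checkout of
# gpt-neox; run the script with --help to confirm the actual interface.
import subprocess

subprocess.run(
    [
        "python", "tools/ckpts/convert_hf_to_sequential.py",
        "--hf-model-name", "pythia-70m-v0",            # assumed flag: HF checkpoint to convert
        "--output-dir", "checkpoints/neox_converted",  # assumed flag: where NeoX weights land
        "--config", "configs/pythia/70M.yml",          # assumed flag: NeoX config matching the model
    ],
    check=True,
)
```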

Can support be added for converting a HuggingFace transformers model to a NeoX model with pipeline parallelism greater than 1? Or is there a way to do this now?
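
If there is no built-in support yet, would the right workaround be to shard the HF weights into DeepSpeed pipeline layer files by hand? Below is a rough sketch of what I have in mind; the layer offset and the key-name mapping are my assumptions, not what `convert_hf_to_sequential.py` actually does, so both would need to be verified against the script:

```python
# Sketch of a manual workaround for PP > 1: split the HF state dict into the
# per-layer files that DeepSpeed's PipelineModule loads
# (named layer_XX-model_YY-model_states.pt). Everything below is illustrative:
# the "+2" layer offset and the assumption that HF GPT-NeoX parameter names
# match NeoX's per-layer names would both need to be checked against what
# convert_hf_to_sequential.py actually does (e.g. QKV fusion/ordering).
import os
import torch
from transformers import GPTNeoXForCausalLM

OUT_DIR = "neox_pp_ckpt"
os.makedirs(OUT_DIR, exist_ok=True)

hf_model = GPTNeoXForCausalLM.from_pretrained("EleutherAI/pythia-70m")
sd = hf_model.state_dict()

def save_layer(idx: int, tensors: dict) -> None:
    # model_00 = tensor-parallel rank 0; with TP > 1 each rank saves its shard
    torch.save(tensors, os.path.join(OUT_DIR, f"layer_{idx:02d}-model_00-model_states.pt"))

# Layer 0: word embeddings get their own pipeline layer in NeoX.
save_layer(0, {"word_embeddings.weight": sd["gpt_neox.embed_in.weight"]})

# One file per transformer block; the +2 offset assumes two non-parameter
# layers sit between the embedding and the first block in NeoX's pipeline.
for i in range(hf_model.config.num_hidden_layers):
    prefix = f"gpt_neox.layers.{i}."
    block = {k[len(prefix):]: v for k, v in sd.items() if k.startswith(prefix)}
    save_layer(i + 2, block)
```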

Thank you for the issue. The language you're using is a little confusing to me. Am I correct in thinking you want to go HF -> NeoX w/ PP > 1? That is, you can currently convert with no PP, but not with PP > 1?

@liuxinxin123 Hey, I wanted to follow up on this. Can you elaborate on what your issue is?