microsoft / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

DeepSpeed is a mountain of messy code

ControllableGeneration opened this issue

As the title says.

Please, DS team, spend some time cleaning up the code! It is really hard to modify!

Also, why does PipelineEngine have no forward function? The eval_batch method is really hard to use and not portable at all!
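
To make the request concrete, here is a rough sketch of the kind of forward-style call I mean, built on top of an eval_batch method that consumes an iterator of micro-batches. Everything in it is an assumption for illustration: StubPipelineEngine is a toy stand-in and not DeepSpeed code, ForwardAdapter and its micro_batches argument are hypothetical names, and the only thing assumed about the real engine is that eval_batch takes a data iterator and pulls a fixed number of micro-batches from it.

```python
from typing import Any, Iterator, List, Sequence


class StubPipelineEngine:
    """Toy stand-in for a pipeline engine; NOT DeepSpeed code.

    It mimics only the behaviour this sketch relies on: eval_batch()
    pulls a fixed number of micro-batches from an iterator.
    """

    def __init__(self, micro_batches: int) -> None:
        self.micro_batches = micro_batches

    def eval_batch(self, data_iter: Iterator[Sequence[float]]) -> float:
        # Toy "loss": the mean of every value in the consumed micro-batches.
        values: List[float] = []
        for _ in range(self.micro_batches):
            values.extend(next(data_iter))
        return sum(values) / len(values)


class ForwardAdapter:
    """Hypothetical wrapper exposing a forward()-style call over eval_batch()."""

    def __init__(self, engine: Any, micro_batches: int) -> None:
        self.engine = engine
        self.micro_batches = micro_batches

    def forward(self, batch: Sequence[float]) -> Any:
        if len(batch) == 0:
            raise ValueError("batch must not be empty")
        if len(batch) % self.micro_batches != 0:
            raise ValueError("batch size must be divisible by micro_batches")
        # Split the full batch into equally sized micro-batches.
        step = len(batch) // self.micro_batches
        chunks = [batch[i:i + step] for i in range(0, len(batch), step)]
        # Hand the micro-batches to the engine as the iterator it expects.
        return self.engine.eval_batch(iter(chunks))

    # Allow adapter(batch) in addition to adapter.forward(batch).
    __call__ = forward


adapter = ForwardAdapter(StubPipelineEngine(micro_batches=2), micro_batches=2)
result = adapter([1.0, 2.0, 3.0, 4.0])
```

With something like this, calling code could pass one batch and get one result back without building an iterator by hand. Whether the real engine can support it cleanly (for example, on ranks that do not hold the last pipeline stage) is exactly the part I cannot tell from the current code.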