bigscience-workshop / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

How to continue pre-training BLOOM?

ShinoharaHare opened this issue

Hi,
I'm trying to continue pre-training bloom-560m on my own dataset on a single GPU.
I modified this script to fit my case, but I cannot figure out how to load the checkpoint.

Is there a guide for what I'm trying to do?
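
For reference, here is a minimal sketch of what such a run might look like. All paths, batch sizes, the learning rate, and `ds_config.json` are placeholders, and the flag set is assumed from the style of the bigscience fork's training scripts, so the repo's own example scripts should be treated as authoritative. One important caveat: `--load` expects a checkpoint in Megatron-DeepSpeed's own layout, not the `transformers`-format weights of `bigscience/bloom-560m` on the Hugging Face Hub, so the weights would need converting first (the `tools/convert_checkpoint` directory is the place to look).

```bash
# Sketch only: continue pre-training bloom-560m with Megatron-DeepSpeed.
# Paths and hyperparameters are placeholders; ds_config.json is a
# hypothetical DeepSpeed config you would supply yourself.

# 1) Tokenize and binarize the raw dataset (one JSON object with a
#    "text" field per line).
python tools/preprocess_data.py \
    --input my_dataset.jsonl \
    --output-prefix my_dataset \
    --dataset-impl mmap \
    --tokenizer-type PretrainedFromHF \
    --tokenizer-name-or-path bigscience/bloom-560m \
    --append-eod \
    --workers 8

# 2) Launch training, resuming from an existing Megatron-DeepSpeed
#    checkpoint directory via --load. The model dimensions below match
#    bloom-560m (24 layers, hidden size 1024, 16 heads, ALiBi).
deepspeed --num_gpus 1 pretrain_gpt.py \
    --tensor-model-parallel-size 1 \
    --pipeline-model-parallel-size 1 \
    --num-layers 24 \
    --hidden-size 1024 \
    --num-attention-heads 16 \
    --seq-length 2048 \
    --max-position-embeddings 2048 \
    --position-embedding-type alibi \
    --embed-layernorm \
    --micro-batch-size 2 \
    --global-batch-size 16 \
    --train-iters 10000 \
    --lr 2e-5 \
    --lr-decay-style cosine \
    --tokenizer-type PretrainedFromHF \
    --tokenizer-name-or-path bigscience/bloom-560m \
    --data-path my_dataset_text_document \
    --load /path/to/megatron-deepspeed-checkpoint \
    --save /path/to/new-checkpoints \
    --finetune \
    --deepspeed \
    --deepspeed_config ds_config.json
```

Note that `--finetune` loads the model weights but resets the optimizer state and iteration counter; dropping it instead resumes an interrupted run exactly where it left off. On a single GPU, the ZeRO stage (and possibly CPU offload) chosen in `ds_config.json` is what determines whether the optimizer states fit in memory.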

commented

Hi, did you find a solution to this?

@ShinoharaHare Hi, have you solved this problem?