How to resume?

rotorliu opened this issue a year ago · comments

Hi, @tianyic
I ran OTO on my project. However, when training resumes from a checkpoint, it fails with the error "ValueError: loaded state dict has a different number of parameter groups".

Tianyi Chen · Answer 1 · Fri Mar 31 2023 12:42:53 GMT+0800 (China Standard Time)

Thanks for reaching out.

Training can be resumed as follows.

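import torch
from only_train_once import OTO

# Load the full model object saved by the previous training round.
model = torch.load(checkpoint_path)
# Placeholder: replace `sth` with a sample input matching the model's expected input.
dummy_input = sth
oto = OTO(model=model, dummy_input=dummy_input)
optimizer = oto.dhspg(
    # set lr, start_pruning, etc. as normal
    fixed_zero_groups=True,  # set this to preserve the group sparsity learned in the previous training round
)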
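
For context, the message in the question is raised by PyTorch's `Optimizer.load_state_dict` when the saved optimizer state holds a different number of parameter groups than the optimizer it is loaded into. The sketch below reproduces that with plain `torch.optim.SGD` rather than OTO's optimizer. The toy two-layer model and the one-group versus two-group split are assumptions chosen only to trigger the mismatch; they are not taken from the project in question.

```python
import torch

# Hypothetical toy model, for illustration only.
model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.Linear(4, 2))

# Optimizer whose state was "saved": all parameters in ONE group.
saved = torch.optim.SGD(model.parameters(), lr=0.1)
state = saved.state_dict()

# Optimizer rebuilt on resume with TWO groups (one per layer).
rebuilt = torch.optim.SGD(
    [{"params": model[0].parameters()}, {"params": model[1].parameters()}],
    lr=0.1,
)

# Loading the one-group state into the two-group optimizer fails.
try:
    rebuilt.load_state_dict(state)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(len(state["param_groups"]), len(rebuilt.param_groups))
print(error_message)
```

Comparing `len(state["param_groups"])` with `len(optimizer.param_groups)` before calling `load_state_dict` is a quick way to confirm whether a checkpoint and a freshly built optimizer disagree in this way.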

rotorliu · Answer 2 · Fri Mar 31 2023 18:36:02 GMT+0800 (China Standard Time)

I opened a pull request for resuming.
#11

Tianyi Chen · Answer 3 · Sat Apr 01 2023 05:18:42 GMT+0800 (China Standard Time)

Thanks for the PR!