horseee / LLM-Pruner

[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Supports LLaMA, Llama-2, BLOOM, Vicuna, Baichuan, etc.

Home Page: https://arxiv.org/abs/2305.11627

The new pytorch.bin is bigger than the original model

lb553024300 opened this issue

When I chose to save the model, I noticed something strange: the new pytorch.bin is bigger than the original model. I tested Baichuan-7B with --pruning_ratio 0.5 and added --save_model to save the model after pruning, but the new pytorch.bin is 17GB, while the original model is only 13GB.
Could you please tell me why? Thank you!

Hi @lb553024300, don't worry about the bin size. We store the whole model object to disk with torch.save(model), so the saved file will be larger than one produced by torch.save(model.state_dict()).
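For illustration, a minimal sketch of the two save styles (the paths are hypothetical, and model stands for the pruned model):

import torch

# Pickling the whole module serializes the entire Python object,
# not just the parameter tensors, so the file is typically larger.
torch.save(model, 'pruned_full.bin')            # full model object
torch.save(model.state_dict(), 'weights.bin')   # parameter tensors only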

Got it, thanks!

Hi. Could you please check whether you deleted the gradients used for the Taylor importance calculation before saving the model?
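For reference, a minimal sketch of clearing those gradients before saving (assuming model is the pruned model; the path is hypothetical):

import torch

# Release the grad tensors accumulated during the Taylor importance pass
model.zero_grad(set_to_none=True)
torch.save(model, 'pruned_model.bin')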

Hi @VainF, can you tell me how to convert it back to the same format as the original model? A smaller file is what I need for storage. Thank you very much.

I tried torch.save(model.state_dict()), but the file size is still the same. Is there any way to save it in the same format as the original model on Hugging Face? I tried loading the model and saving it again, but that didn't reduce the size either.

import torch
import argparse

def main(args):
    # LLM-Pruner checkpoints pickle both the tokenizer and the full model object
    pruned_dict = torch.load(args.ckpt, map_location='cpu')
    tokenizer, model = pruned_dict['tokenizer'], pruned_dict['model']

    print(f"Model took {round(model.get_memory_footprint() / 1e9, 2)} GB")

    # Remove gradients left over from the Taylor importance estimation;
    # set_to_none=True frees the grad tensors instead of zeroing them,
    # and iterating over all parameters also covers the biases
    model.zero_grad(set_to_none=True)
    for _, param in model.named_parameters():
        param.grad = None

    print(f"Model took {round(model.get_memory_footprint() / 1e9, 2)} GB") #=> ~ 25GB

    # model.half()
    # print(f"Model took {round(model.get_memory_footprint() / 1e9, 2)} GB") #=> ~12GB

    # Save
    model.save_pretrained(args.output_dir) #=> ~ 25GB
    tokenizer.save_pretrained(args.output_dir)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--ckpt", type=str, required=True)
    parser.add_argument("--output_dir", type=str, required=True)
    args = parser.parse_args()
    main(args)

I am working with the BLOOM model.
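A possible way to shrink the file, following the commented-out model.half() hint in the script above: the pruned model is held in fp32, while the original checkpoint likely ships in half precision (7B parameters at 2 bytes each is roughly the 13GB figure for Baichuan-7B), so casting to fp16 before saving should roughly halve the output, matching the #=> ~12GB comment. A minimal sketch, assuming model and tokenizer are loaded as in the script above and the output directory is hypothetical:

model.half()                        # fp32 -> fp16 roughly halves the footprint
model.zero_grad(set_to_none=True)   # drop any leftover gradients
model.save_pretrained('pruned_fp16')
tokenizer.save_pretrained('pruned_fp16')

Note that a structurally pruned model no longer matches the original architecture's config, which is presumably why the repo pickles the whole object; reloading a save_pretrained checkpoint with plain from_pretrained may still need the pruned shapes handled on load.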