horseee / LLM-Pruner

[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Supports LLaMA, Llama-2, BLOOM, Vicuna, Baichuan, etc.

Home Page: https://arxiv.org/abs/2305.11627

The new pytorch.bin is bigger than the original model

lb553024300 opened this issue

When I chose to save the model, I noticed something strange: the new pytorch.bin is bigger than the original model. I tested Baichuan-7B with --pruning_ratio 0.5 and added --save_model to save the model after pruning, but the new pytorch.bin is 17GB, while the original model is only 13GB.
Could you please tell me why? Thank you!

Hi @lb553024300, don't worry about the bin size. We store the whole model object to disk with torch.save(model), so the saved file will be larger than one produced by torch.save(model.state_dict()).
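For illustration, a minimal sketch of the two save styles (the paths are hypothetical, and model stands for the pruned model):

import torch

# Pickling the whole module serializes the entire Python object,
# not just the parameter tensors, so the file is typically larger.
torch.save(model, 'pruned_full.bin')            # full model object
torch.save(model.state_dict(), 'weights.bin')   # parameter tensors only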

Got it, thanks!

Hi. Could you please check whether you deleted the gradients used for the Taylor importance calculation before saving the model?
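For reference, a minimal sketch of clearing those gradients before saving (assuming model is the pruned model; the path is hypothetical):

import torch

# Release the grad tensors accumulated during the Taylor importance pass
model.zero_grad(set_to_none=True)
torch.save(model, 'pruned_model.bin')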

Hi @VainF, can you tell me how to convert it back to the same format as the original model? A smaller file is what I need for storage. Thank you very much.

I tried torch.save(model.state_dict()), but the file size is still the same. Is there any way to save it in the same format as the original model on Hugging Face? I tried loading the model and saving it again, but that didn't reduce the size either.

import torch
import argparse

def main(args):
    # LLM-Pruner checkpoints pickle both the tokenizer and the full model object
    pruned_dict = torch.load(args.ckpt, map_location='cpu')
    tokenizer, model = pruned_dict['tokenizer'], pruned_dict['model']

    print(f"Model took {round(model.get_memory_footprint() / 1e9, 2)} GB")

    # Remove gradients left over from the Taylor importance estimation;
    # set_to_none=True frees the grad tensors instead of zeroing them,
    # and iterating over all parameters also covers the biases
    model.zero_grad(set_to_none=True)
    for _, param in model.named_parameters():
        param.grad = None

    print(f"Model took {round(model.get_memory_footprint() / 1e9, 2)} GB") #=> ~ 25GB

    # model.half()
    # print(f"Model took {round(model.get_memory_footprint() / 1e9, 2)} GB") #=> ~12GB

    # Save
    model.save_pretrained(args.output_dir) #=> ~ 25GB
    tokenizer.save_pretrained(args.output_dir)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--ckpt", type=str, required=True)
    parser.add_argument("--output_dir", type=str, required=True)
    args = parser.parse_args()
    main(args)

I am working with the BLOOM model.
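A possible way to shrink the file, following the commented-out model.half() hint in the script above: the pruned model is held in fp32, while the original checkpoint likely ships in half precision (7B parameters at 2 bytes each is roughly the 13GB figure for Baichuan-7B), so casting to fp16 before saving should roughly halve the output, matching the #=> ~12GB comment. A minimal sketch, assuming model and tokenizer are loaded as in the script above and the output directory is hypothetical:

model.half()                        # fp32 -> fp16 roughly halves the footprint
model.zero_grad(set_to_none=True)   # drop any leftover gradients
model.save_pretrained('pruned_fp16')
tokenizer.save_pretrained('pruned_fp16')

Note that a structurally pruned model no longer matches the original architecture's config, which is presumably why the repo pickles the whole object; reloading a save_pretrained checkpoint with plain from_pretrained may still need the pruned shapes handled on load.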