param_first and param_mix result in the same ppl
Kausal-Lei opened this issue · comments
I simply ran the following two commands, which differ only in the --taylor option (param_first vs. param_mix):
python hf_prune.py --pruning_ratio 0.62785 --block_wise --block_mlp_layer_start 0 --block_mlp_layer_end 32 --block_attention_layer_start 32 --block_attention_layer_end 32 --pruner_type taylor --base_model /mnt/petrelfs/xxx/llama2-7b --device cpu --eval_device cuda --taylor param_first --save_ckpt_log_name llama_prune --save_model --num_examples 128
python hf_prune.py --pruning_ratio 0.62785 --block_wise --block_mlp_layer_start 0 --block_mlp_layer_end 32 --block_attention_layer_start 32 --block_attention_layer_end 32 --pruner_type taylor --base_model /mnt/petrelfs/xxx/llama2-7b --device cpu --eval_device cuda --taylor param_mix --save_ckpt_log_name llama_prune --save_model --num_examples 128
But they result in the same ppl.
It seems that the gradient is very small (e.g. 1e-5), so acc_grad is close to zero and has little effect on the result.
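To illustrate what I mean, here is a small standalone sketch (not the repository's code). It assumes the first-order score is |w * g|, that the mixed score additionally subtracts 0.5 * w * acc_grad * w, and that acc_grad holds the squared gradient. The function names, the weight scale of 0.02 and the toy size are all made up for the example.

```python
import random

def first_order_score(w, g):
    # Assumed first-order Taylor importance: |w * g|
    return abs(w * g)

def mixed_score(w, g, acc_grad):
    # Assumed mixed importance: first-order term minus half of a
    # second-order term built from the accumulated squared gradient.
    return abs(w * g - 0.5 * w * acc_grad * w)

random.seed(0)
n = 1000
weights = [random.gauss(0.0, 0.02) for _ in range(n)]  # hypothetical weight scale
grads = [random.gauss(0.0, 1e-5) for _ in range(n)]    # gradients around 1e-5
acc_grads = [g * g for g in grads]                     # squared gradients, around 1e-10

s_first = [first_order_score(w, g) for w, g in zip(weights, grads)]
s_mix = [mixed_score(w, g, a) for w, g, a in zip(weights, grads, acc_grads)]

# Largest relative change that the second-order term makes to any score.
max_rel_diff = max(abs(a - b) / a for a, b in zip(s_first, s_mix) if a > 0)

def lowest_indices(scores, k):
    # Indices of the k lowest-scoring entries, i.e. the ones that would be pruned.
    return set(sorted(range(len(scores)), key=lambda i: scores[i])[:k])

num_pruned = int(n * 0.62785)
pruned_first = lowest_indices(s_first, num_pruned)
pruned_mix = lowest_indices(s_mix, num_pruned)
overlap = len(pruned_first & pruned_mix) / num_pruned

print(f"max relative score difference: {max_rel_diff:.3e}")
print(f"overlap of pruned sets: {overlap:.4f}")
```

Under these assumptions the second-order term changes each score by a relative amount of only 0.5 * |w * g|, so both settings would select essentially the same entries to prune, which would explain the identical ppl.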