hatchetProject / QuEST

QuEST: Efficient Finetuning for Low-bit Diffusion Models

[BUG] The timestep seems not to be set in the quantization init.

csguoh opened this issue

Hi, sorry to bother you.

I attempted to reproduce the results on lsun_bedroom using the command line provided in the README. However, I hit a bug at the quantization init stage, at _ = qnn(cali_xs[:8].cuda(), cali_ts[:8].cuda()).

The error log is as follows:

[Screenshot of the error traceback, 2024-06-06]

It seems that the timestep is not set in the class QuantSMVMatMul. I also found that there is a set_timestep function in quant_block.py, but it does not appear to be used to initialize the timestep of QuantSMVMatMul.

How should I fix this bug? Where should I call set_timestep so that self.timestep in QuantSMVMatMul is not None? Thanks for your time :)

Hi, I assume you are running on the bedroom/church dataset. You can add

qnn.set_timestep(timesteps[0])

before

_ = qnn(cali_xs[:8].cuda(), cali_ts[:8].cuda())

to resolve the issue. This simply makes self.timestep non-None at initialization. You can also refer to sample_diffusion_ldm_imagenet.py to see how it is used :)
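
For reference, a minimal sketch of the resulting calibration-init code, assuming qnn, cali_xs, cali_ts, and timesteps are already defined in sample_diffusion_ldm_bedroom.py as in the README command (only the set_timestep line is new):

    # Make self.timestep inside QuantSMVMatMul non-None before calibration.
    qnn.set_timestep(timesteps[0])
    # Quantization init: run a small calibration batch through the quantized model.
    _ = qnn(cali_xs[:8].cuda(), cali_ts[:8].cuda())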

Thanks for your reply, I will try it ;)
BTW, I also found a bug in the generation of the calibration dataset for ImageNet. The command line there uses --cond=True, while line 467 of that file has the opposite assertion, assert(opt.cond), which raises an AssertionError. How should I fix this?

I see, it should be a typo due to my laziness when coding. You can delete either one of them (or both), just don't keep both at the same time. I will also fix it. Thanks :)

I will try it. Thanks for your reply!

I am sorry to bother you again...

After adding the extra qnn.set_timestep(timesteps[0]) call, sample_diffusion_ldm_bedroom.py hit another error during block reconstruction:

[Screenshot of the block-reconstruction error, 2024-06-06]

It seems the block reconstruction for the qkv_matmul class has no parameters to optimize. How can I fix this bug? Thanks!

(Wrong reply, see issue #11) The QKMatMul and SMVMatMul blocks are actually never optimized; their functionality is included in QuantBasicTransformerBlock(). This is a long-standing confusion inherited from QDiffusion. To resolve the issue, you can simply ignore them by either: 1. commenting out lines 367~371 in qdiff/quant_block.py when constructing the model, or 2. changing the block type from BaseQuantBlock to (QuantResBlock, QuantBasicTransformerBlock) when doing reconstruction at line 539 of sample_diffusion_ldm_bedroom.py (remember to import QuantResBlock).

Refer to the reply below for the correct solution.
The part you are currently running is solely the weight quantization process of QDiffusion. If you want to skip it, you can download their checkpoints and use the --resume_w flag to load the weight quantization parameters. There is not much performance difference from our current implementation.

@csguoh Hi, sorry for providing the wrong solution. I realized I made a big mistake in the solutions I provided (credits to @seung-hoon-lee), and I apologize for the inconvenience. The above solutions will leave the attention matrices unquantized in LDMs. You can refer to issue #11 for details.

So for the correct way, see this link. I think you can implement it by either:

  1. specifying the block type as (QuantResBlock, QuantBasicTransformerBlock, QuantAttentionBlock); or
  2. setting self.ignore_reconstruction = True in QuantQKMatMul and QuantSMVMatMul during weight reconstruction (a rough sketch of both options follows this list).
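
To make the two options concrete, here is a rough sketch rather than the repo's exact code; the class names, ignore_reconstruction, and the qdiff.quant_block module come from this thread, while the loop, the block_reconstruction helper, and its arguments are assumptions:

    from qdiff.quant_block import (
        QuantResBlock, QuantBasicTransformerBlock, QuantAttentionBlock,
        QuantQKMatMul, QuantSMVMatMul,
    )

    def reconstruct_blocks(qnn, block_reconstruction, **recon_kwargs):
        for name, module in qnn.named_modules():
            # Option 1: only reconstruct block types that actually hold parameters.
            if isinstance(module, (QuantResBlock, QuantBasicTransformerBlock, QuantAttentionBlock)):
                block_reconstruction(qnn, module, **recon_kwargs)
            # Option 2: explicitly skip the attention matmul wrappers instead.
            elif isinstance(module, (QuantQKMatMul, QuantSMVMatMul)):
                module.ignore_reconstruction = True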

The most reliable way to implement this is to align with QDiffusion: quantize the weights first (without specifying the --quant_act argument), save the checkpoint, then reload it and quantize the activations.
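
As a rough illustration of that two-pass flow (the --quant_act and --resume_w flag names come from this thread; the branch structure, helper names, and checkpoint path below are hypothetical):

    if not opt.quant_act:
        # Pass 1: weight-only quantization, then save the calibrated state.
        reconstruct_weights(qnn, cali_data)            # hypothetical helper
        torch.save(qnn.state_dict(), "quantw.pth")     # hypothetical path
    else:
        # Pass 2: reload the weight-quantized checkpoint, then quantize activations.
        qnn.load_state_dict(torch.load(opt.resume_w))
        reconstruct_activations(qnn, cali_data)        # hypothetical helper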

Thanks for your reply!