intel / neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Home Page: https://intel.github.io/neural-compressor/

Potential bug in calculating scale and zero_point in SmoothQuant

wenhuach21 opened this issue · comments

After multiplying by input_scale, the scaled minimum and maximum values are not guaranteed to remain the minimum and maximum, respectively.

def _calculate_qparams(self, input_scale, input_minmax, dtype=torch.quint8):
    # calculate scale and zero_point
    if dtype == torch.quint8:
        quant_min, quant_max = 0, 255
    min_val = torch.min(input_minmax[0] * input_scale)
    max_val = torch.max(input_minmax[1] * input_scale)
    # extend the range to include zero, even when min_val is greater than zero
    min_val_neg = torch.min(min_val, torch.zeros_like(min_val))
    max_val_pos = torch.max(max_val, torch.zeros_like(max_val))
    scale = (max_val_pos - min_val_neg) / float(quant_max - quant_min)
    scale = torch.max(scale, torch.tensor([torch.finfo(torch.float32).eps], device=scale.device))
    zero_point = quant_min - torch.round(min_val_neg / scale).to(torch.int)
    zero_point = torch.clamp(zero_point, quant_min, quant_max)
    return scale, zero_point
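A minimal sketch of the concern. The values and variable names below are illustrative, and the negative scale element is a hypothetical edge case: if any element of input_scale is negative, multiplication flips the ordering of that channel's bounds, so taking the min only over input_minmax[0] * input_scale (and the max only over input_minmax[1] * input_scale) can miss the true range. A more defensive variant reduces over both scaled bounds together:

```python
import torch

# Per-channel running bounds and scales (illustrative values).
input_min = torch.tensor([-1.0, -2.0])   # per-channel minima
input_max = torch.tensor([3.0, 4.0])     # per-channel maxima
input_scale = torch.tensor([1.0, -1.0])  # hypothetical: second scale is negative

# Approach from the reported code: scale each bound independently.
min_val = torch.min(input_min * input_scale)  # min of [-1.0, 2.0] -> -1.0
max_val = torch.max(input_max * input_scale)  # max of [3.0, -4.0] -> 3.0

# Defensive variant: reduce over both scaled bounds jointly, so a flipped
# channel still contributes its true extreme.
scaled = torch.stack([input_min * input_scale, input_max * input_scale])
true_min = scaled.min()  # -4.0, contributed by input_max[1] * input_scale[1]
true_max = scaled.max()  # 3.0

print(min_val.item(), true_min.item())  # -1.0 vs -4.0: the true minimum is missed
```

When all scale elements are positive, the two approaches agree; the discrepancy appears only once a scaled channel's bounds swap order.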