Infinity scale in calibration cache file
liming312 opened this issue
Hello,
Thanks for this great repo. I used the provided script to convert a searched network (ONNX model) to an INT8 model with INT8 calibration. The ONNX model works fine, but the output of the TensorRT engine is wrong. When I checked the calibration cache file, I found that many of the quantization scales are "Infinity" and the others are very large. I converted the cache file to a JSON file; it looks like the attached image below. Any hint is much appreciated.
Environment
TensorRT Version: 7.0.0.11
GPU Type: Tesla T4
Nvidia Driver Version: 418.67
CUDA Version: 10.0
CUDNN Version: 7.6.5
Operating System + Version: Debian 9.11
Python Version (if applicable): 3.7.4
TensorFlow Version (if applicable): N/A
PyTorch Version (if applicable): 1.4.0
Baremetal or Container (if container which image + tag): An internal Docker image