[circle2circle] Introduce a pass to simulate mixed-precision operators

Question

jinevening opened this issue 2 months ago

Hyukjin Jeong commented 2 months ago

What

Let's introduce a new pass, RemoveQDQForMixedPrecisionOp.

Why

When we make a fake-quantized model, duplicate QDQ (Quantize-Dequantize) patterns sometimes appear, as below.

In the example above, the first QDQ is for q8 and the second QDQ is for q16. On some backends, an FC layer can directly generate q16 output even though its inputs are q8 (for higher accuracy). Such an operator is often called a 'mixed-precision operator'.

To simulate the behavior of a mixed-precision operator, we need a pass that removes the first QDQ from the pattern above.
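
As a rough illustration of the rewrite, here is a minimal Python sketch on a toy linear chain of operators. It is not the luci/circle2circle implementation: the `Node` class, the `remove_first_qdq` function, and the "q8"/"q16" dtype strings are all hypothetical names chosen for this example, and the matching rule (a QDQ immediately followed by a second QDQ of a different dtype) is an assumption about what the pass looks for.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    # Hypothetical toy node, not a luci/circle IR class.
    op: str          # e.g. "FullyConnected", "Quantize", "Dequantize"
    dtype: str = ""  # quantized dtype, only meaningful for Quantize nodes


QDQ_QDQ = ["Quantize", "Dequantize", "Quantize", "Dequantize"]


def remove_first_qdq(chain: List[Node]) -> List[Node]:
    """Drop a QDQ pair that is immediately followed by a QDQ pair of a different dtype.

    Assumption: the graph is a simple linear chain, listed in execution order.
    """
    out: List[Node] = []
    i = 0
    while i < len(chain):
        window = chain[i:i + 4]
        is_double_qdq = [n.op for n in window] == QDQ_QDQ
        if is_double_qdq and window[0].dtype != window[2].dtype:
            # Skip the first Quantize/Dequantize pair; the second pair is
            # copied to the output by the following loop iterations.
            i += 2
            continue
        out.append(chain[i])
        i += 1
    return out


chain = [
    Node("FullyConnected"),
    Node("Quantize", "q8"),
    Node("Dequantize"),
    Node("Quantize", "q16"),
    Node("Dequantize"),
]
print([(n.op, n.dtype) for n in remove_first_qdq(chain)])
# [('FullyConnected', ''), ('Quantize', 'q16'), ('Dequantize', '')]
```

After the rewrite, the FC output is fake-quantized only once, to q16, which mimics a backend whose FC kernel emits q16 directly.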

Hyukjin Jeong commented 2 months ago

Done

Hyukjin Jeong · Answer 1 · Fri May 03 2024 16:18:37 GMT+0800 (China Standard Time)

I used the [circle2circle] tag because I'm not sure whether it is OK to expose this option to users (one-optimize).