[Bug] Inconsistent Results between Direct Optimization and Sequential Optimization in TVM
Jupiterghy opened this issue · comments
When applying optimization passes in TVM, the results differ between applying the passes directly, e.g. opt_a(opt_b(mod)), and applying them through Sequential, e.g. seq_ab = tvm.ir.transform.Sequential([opt_a, opt_b]) or seq_ba = tvm.ir.transform.Sequential([opt_b, opt_a]).
Additionally, this issue seems to occur specifically when one of the optimizations is FakeQuantizationToInteger.
Actual behavior
The modules produced by the two approaches differ both in structure and in inference results.
Traceback information:
AssertionError:
Not equal to tolerance rtol=0.01, atol=0.01
Mismatched elements: 1 / 18 (5.56%)
Max absolute difference: 1.3370061e+08
Max relative difference: 3.0688565e+10
x: array([[[1.859827e-001]],
[[3.138568e-311]],...
y: array([[[ 1.859827e-001]],
[[ 1.022716e-321]],...
Environment
- Operating System: Ubuntu 18.04.5
- TVM version: 0.15.dev0
- ONNX: 1.15.0
Steps to reproduce
- Download the ONNX model
- Execute the script:
import onnx
import tvm
from tvm import relay
import numpy as np


def compare_outputs(output1, output2, rtol=1e-2, atol=1e-2):
    if len(output1) != len(output2):
        raise ValueError("Number of outputs in the two lists is different.")
    for i in range(len(output1)):
        output1_np = np.asarray(output1[i])
        output2_np = np.asarray(output2[i])
        np.testing.assert_allclose(output1_np, output2_np, rtol=rtol, atol=atol)


def compile_onnx(mod, params, inputs):
    mod = relay.transform.InferType()(mod)
    exec_mod = 'graph'
    target = 'llvm'
    ctx = tvm.cpu(0)
    with tvm.transform.PassContext(opt_level=0):
        executor = relay.build_module.create_executor(
            exec_mod, mod, ctx, target, params
        ).evaluate()
        output = executor(**inputs)
    if isinstance(output, (tvm.runtime.container.ADT, list)):
        output = [r.numpy() for r in output]
    elif output is not None:
        output = [output.numpy()]
    return output


if __name__ == "__main__":
    onnx_file = "model.onnx"
    onnx_model = onnx.load(onnx_file)
    shape_dict = {'v13_0': [1], 'v11_0': [1, 38, 27, 1], 'v7_0': [18, 1, 1]}
    inputs = {}
    inputs['v13_0'] = np.random.random([1])
    inputs['v11_0'] = np.random.random([1, 38, 27, 1])
    inputs['v7_0'] = np.random.random([18, 1, 1])
    mod, params = relay.frontend.from_onnx(onnx_model, shape_dict, freeze_params=True)

    opt_a = tvm.relay.transform.AlterOpLayout()
    opt_b = tvm.relay.transform.FakeQuantizationToInteger()
    mod = tvm.relay.transform.InferType()(mod)

    module_ab = opt_b(opt_a(mod))
    module_ba = opt_a(opt_b(mod))
    assert tvm.ir.structural_equal(module_ab, module_ba)  # same
    outputs_ab = compile_onnx(module_ab, params, inputs)
    outputs_ba = compile_onnx(module_ba, params, inputs)
    compare_outputs(outputs_ab, outputs_ba)  # same

    seq_ab = tvm.ir.transform.Sequential([opt_a, opt_b])
    seq_ba = tvm.ir.transform.Sequential([opt_b, opt_a])
    with tvm.transform.PassContext(opt_level=0):
        module_ab_2 = seq_ab(mod)
        module_ba_2 = seq_ba(mod)
    assert tvm.ir.structural_equal(module_ab, module_ab_2)  # assertion error, inconsistent
    assert tvm.ir.structural_equal(module_ba, module_ba_2)  # assertion error, inconsistent
    assert tvm.ir.structural_equal(module_ab_2, module_ba_2)  # assertion error, inconsistent
    outputs_ab_2 = compile_onnx(module_ab_2, params, inputs)
    outputs_ba_2 = compile_onnx(module_ba_2, params, inputs)
    compare_outputs(outputs_ab, outputs_ab_2)  # assertion error, inconsistent
    compare_outputs(outputs_ba, outputs_ba_2)  # assertion error, inconsistent
    compare_outputs(outputs_ab_2, outputs_ba_2)  # assertion error, inconsistent
Triage
- needs-triage
If you call a pass directly (instead of using Sequential), it will bypass the checks for opt_level, required_pass, etc.
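To make the comment above concrete, here is a small pure-Python model of that gating. It is only a sketch and does not import TVM: ToyPass, ToyContext, pass_enabled, run_direct and run_sequential are hypothetical names, and the opt_level values assigned to the two passes (3 for AlterOpLayout, 0 for FakeQuantizationToInteger) are assumptions that should be checked against the pass info of the TVM build in use. The rule it encodes (disabled passes are skipped, required passes always run, otherwise a pass runs only if the context's opt_level is at least the pass's own) is my understanding of what Sequential checks, not a copy of TVM's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyPass:
    """Hypothetical stand-in for a pass: a name and the opt_level it is registered at."""
    name: str
    opt_level: int


@dataclass(frozen=True)
class ToyContext:
    """Hypothetical stand-in for a pass context."""
    opt_level: int = 2
    required_pass: tuple = ()
    disabled_pass: tuple = ()


def pass_enabled(ctx, p):
    # Assumed gating rule: disabled wins, then required, then the opt_level check.
    if p.name in ctx.disabled_pass:
        return False
    if p.name in ctx.required_pass:
        return True
    return ctx.opt_level >= p.opt_level


def run_direct(passes):
    # Calling a pass object directly performs no gating: every pass runs.
    return [p.name for p in passes]


def run_sequential(passes, ctx):
    # A Sequential-style runner skips passes the context does not enable.
    return [p.name for p in passes if pass_enabled(ctx, p)]


# Assumed registration levels, for illustration only.
alter = ToyPass("AlterOpLayout", 3)
fq2i = ToyPass("FakeQuantizationToInteger", 0)

ctx0 = ToyContext(opt_level=0)
ctx_req = ToyContext(opt_level=0, required_pass=("AlterOpLayout",))

print(run_direct([alter, fq2i]))             # both passes run
print(run_sequential([alter, fq2i], ctx0))   # only FakeQuantizationToInteger runs
print(run_sequential([alter, fq2i], ctx_req))  # both run again
```

Under these assumptions, a Sequential run at opt_level=0 would apply fewer passes than the direct calls, which would account for module_ab differing from module_ab_2. It does not by itself explain a difference between the two Sequential orderings.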
Thank you for your response. However, I'm curious to understand why executing optimizations in the Sequential manner still results in inconsistency with different orderings.