microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

Home Page: https://onnxruntime.ai

[Performance] QNN EP Leaving QDQ Nodes in the QNN Graph

kory opened this issue

Describe the issue

The QNN EP appears to leave standalone QDQ nodes in the QNN graph around Mul and Sub, rather than converting them to quantized ops. I have included a sample model below.
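
For context, a minimal inspection sketch of the QDQ pattern the EP is expected to fold. It assumes the attached zip contains a model file named `qdq_min_mul_sub_example.onnx` (the filename is a guess) and lists the producers/consumers around each Mul and Sub, which in a standard QDQ model should be DequantizeLinear and QuantizeLinear:

```python
import onnx

model = onnx.load("qdq_min_mul_sub_example.onnx")  # placeholder: model from the attached zip

# Map each tensor name to its producing node and to the nodes that consume it.
producers = {out: node for node in model.graph.node for out in node.output}
consumers = {}
for node in model.graph.node:
    for inp in node.input:
        consumers.setdefault(inp, []).append(node)

# For every Mul/Sub, show the op types feeding and consuming it.
for node in model.graph.node:
    if node.op_type in ("Mul", "Sub"):
        in_ops = [producers[i].op_type for i in node.input if i in producers]
        out_ops = [c.op_type for o in node.output for c in consumers.get(o, [])]
        print(node.name or node.op_type, "<-", in_ops, "->", out_ops)
```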

To reproduce

See attached model.

Affects Inception_v3 among other models.
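
A minimal reproduction sketch, assuming a Linux build with the QNN EP and the HTP backend; the model path and `backend_path` value are assumptions. With verbose logging, the session log shows node placement and the nodes the QNN EP keeps as standalone QDQ ops:

```python
import onnxruntime as ort

so = ort.SessionOptions()
so.log_severity_level = 0  # VERBOSE: the partitioning log shows node placement

sess = ort.InferenceSession(
    "qdq_min_mul_sub_example.onnx",  # placeholder: model from the attached zip
    sess_options=so,
    providers=[
        # backend_path is an assumption for a Linux HTP build of QNN 2.20
        ("QNNExecutionProvider", {"backend_path": "libQnnHtp.so"}),
        "CPUExecutionProvider",
    ],
)
print(sess.get_providers())
```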

Urgency

No response

Platform

Linux

OS Version

Ubuntu 22.04

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

Latest

ONNX Runtime API

Python

Architecture

ARM64

Execution Provider

QNN EP

Execution Provider Library Version

QNN 2.20

Model File

qdq_min_mul_sub_example.zip

Is this a quantized model?

Yes