Samsung / ONE

On-device Neural Engine

[one-cmds/one-optimize] A redundant `Transpose` remains when using `include=O1`

lemmaa opened this issue

What

When include=O1 is used together with convert_nchw_to_nhwc=True, and nchw_to_nhwc_input_shape=True or nchw_to_nhwc_output_shape=True is also set, a redundant Transpose remains in the optimized model.

Why

The remove_redundant_transpose=True option is already included in O1. However, the options included in O1 may be applied first in order of priority, and only afterwards are the Transpose ops added to the input and output by convert_nchw_to_nhwc=True together with nchw_to_nhwc_input_shape=True or nchw_to_nhwc_output_shape=True. If so, the remove_redundant_transpose=True from O1 runs before those Transpose ops exist and cannot remove them.

[onecc]
include=O1

[one-optimize]
convert_nchw_to_nhwc=True           # This adds one `Transpose` right after input or just before output.
nchw_to_nhwc_input_shape=True       # This adds another `Transpose` right after the input.
nchw_to_nhwc_output_shape=True      # This adds another `Transpose` right before the output.
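The suspected ordering effect can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the actual one-optimize implementation: a graph is modeled as a flat list of op names, `T_to_nhwc` and `T_to_nchw` are hypothetical stand-ins for the two mutually inverse `Transpose` ops, and each function only mimics the effect described in the comments of the configuration above.

```python
# Toy model (assumption): a graph is a flat list of op names.
# "T_to_nhwc" and "T_to_nchw" are hypothetical names for two Transpose ops
# that cancel each other when adjacent.
INVERSE = {"T_to_nhwc": "T_to_nchw", "T_to_nchw": "T_to_nhwc"}


def remove_redundant_transpose(ops):
    """Drop adjacent Transpose pairs that cancel each other."""
    out = []
    for op in ops:
        if out and INVERSE.get(op) == out[-1]:
            out.pop()  # the pair cancels, so drop both
        else:
            out.append(op)
    return out


def convert_nchw_to_nhwc(ops):
    # Adds one Transpose right after the input and one just before the output.
    return ["T_to_nhwc"] + ops + ["T_to_nchw"]


def nchw_to_nhwc_input_shape(ops):
    # Adds another Transpose right after the input.
    return ["T_to_nchw"] + ops


def nchw_to_nhwc_output_shape(ops):
    # Adds another Transpose right before the output.
    return ops + ["T_to_nhwc"]


def run(passes, ops):
    """Apply the passes in the given order."""
    for apply_pass in passes:
        ops = apply_pass(ops)
    return ops


layout = [convert_nchw_to_nhwc, nchw_to_nhwc_input_shape, nchw_to_nhwc_output_shape]

# Suspected order: the cleanup from O1 runs before the layout passes.
observed = run([remove_redundant_transpose] + layout, ["Conv"])

# Running the cleanup once more after the layout passes.
cleaned_again = run([remove_redundant_transpose] + layout + [remove_redundant_transpose], ["Conv"])

# Running the layout passes before any cleanup.
layout_first = run(layout + [remove_redundant_transpose], ["Conv"])

print(observed)
print(cleaned_again)
print(layout_first)
```

In this toy model the first ordering leaves four `Transpose` ops around `Conv`, while either of the other two orderings cancels all of them.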

As a workaround, remove_redundant_transpose=True can be added explicitly one more time under [one-optimize].

[onecc]
include=O1

[one-optimize]
convert_nchw_to_nhwc=True
nchw_to_nhwc_input_shape=True
nchw_to_nhwc_output_shape=True
remove_redundant_transpose=True     # Add this option explicitly once more to remove redundant `Transpose`.
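If several cfg files need the same workaround, the option could be added with a small script. The following is a minimal sketch that assumes the cfg file is plain INI as shown above; `add_workaround` is a hypothetical helper, not part of one-cmds. Note that `configparser` rewrites entries as `key = value` and drops inline comments, so the output is assumed to be equivalent for onecc but will not be byte-identical.

```python
import configparser
import io


def add_workaround(cfg_text: str) -> str:
    """Return cfg_text with remove_redundant_transpose=True set in [one-optimize].

    Hypothetical helper for illustration only.
    """
    # Treat "# ..." after a value as a comment, as in the snippets above.
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(cfg_text)

    if not parser.has_section("one-optimize"):
        parser.add_section("one-optimize")
    parser.set("one-optimize", "remove_redundant_transpose", "True")

    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


example = """\
[onecc]
include=O1

[one-optimize]
convert_nchw_to_nhwc=True
nchw_to_nhwc_input_shape=True
nchw_to_nhwc_output_shape=True
"""

print(add_workaround(example))
```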

> this may be because the options included in O1 are applied first in order of priority, and then transpose is applied to the input and output by convert_nchw_to_nhwc=True, and nchw_to_nhwc_input_shape=True or nchw_to_nhwc_output_shape=True.

To avoid such a case, we perform convert_nchw_to_nhwc (including nchw_to_nhwc_input_shape and nchw_to_nhwc_output_shape) before other optimizations are applied. #7376

Could you share an example model for further investigation?