google / XNNPACK

High-efficiency floating-point neural network inference operators for mobile, server, and Web

Roadmap for supporting more operations with QS8

shanumante-sc opened this issue

XNNPACK currently supports only a subset of quantized operators for QS8 inference (convolution, fully connected, add, and global average pooling), while a much larger set of operators is supported for QU8. Are there plans to implement the missing QS8 operators?
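For context, QS8 and QU8 encode the same real values when their scales match and the unsigned zero point is the signed zero point plus 128, which is why much of the QU8 kernel logic carries over to QS8 in principle. A minimal standalone C illustration of that correspondence (not XNNPACK code; the scale and zero-point values are made up):

```c
#include <stdint.h>
#include <stdio.h>

/* QS8 and QU8 describe the same real values when scales match and the
 * unsigned zero point equals the signed zero point + 128:
 * real = scale * (q - zero_point) in both schemes. */
static uint8_t qs8_to_qu8(int8_t x) {
  /* A constant offset of 128, equivalently x ^ 0x80. */
  return (uint8_t)(x + 128);
}

int main(void) {
  const float scale = 0.05f;                      /* made-up quantization params */
  const int8_t qs8_zero_point = -3;
  const uint8_t qu8_zero_point = (uint8_t)(qs8_zero_point + 128);

  const int8_t qs8_value = 42;
  const uint8_t qu8_value = qs8_to_qu8(qs8_value);

  /* Both representations decode to the same real number (2.25). */
  printf("QS8: %f\n", scale * (qs8_value - qs8_zero_point));
  printf("QU8: %f\n", scale * (qu8_value - qu8_zero_point));
  return 0;
}
```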

For our use case specifically, it would be great to have average/max pooling, deconvolution, and the sigmoid/softmax/clamp operators supported for QS8 inference.

In the near term, QS8 support will focus on the operators still missing for MobileNet v3, i.e. multiply and Hard Swish. If you'd like other operators supported, contributions are welcome!
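For reference, Hard Swish as used in MobileNet v3 is x · relu6(x + 3) / 6. A minimal float sketch of the function a quantized kernel would approximate (a real QS8 implementation would use fixed-point arithmetic or a lookup table rather than float math):

```c
#include <math.h>

/* Float reference for Hard Swish: hswish(x) = x * relu6(x + 3) / 6.
 * A QS8 kernel would approximate this in fixed point; this sketch
 * only pins down the function being computed. */
static float hard_swish(float x) {
  const float relu6 = fminf(fmaxf(x + 3.0f, 0.0f), 6.0f);
  return x * relu6 / 6.0f;
}
```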