[onert] Introduce full quantization
hseok-oh opened this issue
Let's support full quantization in the runtime.
- Introduce full quantization type: #11497
- Full quantization from circle model including minmax data & weight quantization
- Generate a circle model (buffer) including layer minmax & weight quantization for full quantization, from an f32 circle model and minmax data
  - Use the minmax-embedder library
- Remove HDF5 dependency: #12574
- Revise observers to introduce execution config API: #13039
- Introduce API to collect minmax data
  - Draft: #12903
Example
$ MINMAX_DUMP=1 ./Product/out/bin/onert_run -r 100 mobilenet_v1_1.0_224.circle
$ ./Product/out/bin/onert_run -q uint8 mobilenet_v1_1.0_224.circle
$ ./Product/out/bin/onert_run mobilenet_v1_1.0_224_quantized_q8.circle
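
For readers unfamiliar with why per-layer minmax data is collected before quantizing: a recorded (min, max) range is what lets a float tensor be mapped onto an integer type. The sketch below shows the usual asymmetric uint8 mapping in Python. It is only an illustration of the general technique, not onert's implementation; the function names (`quant_params_uint8`, `quantize`, `dequantize`) are hypothetical and do not correspond to any onert or minmax-embedder API.

```python
from typing import Tuple


def quant_params_uint8(rmin: float, rmax: float) -> Tuple[float, int]:
    """Derive (scale, zero_point) for asymmetric uint8 from a recorded min/max.

    Hypothetical helper for illustration; not part of onert.
    """
    qmin, qmax = 0, 255
    # Widen the range to include 0.0 so that real zero is exactly representable.
    rmin = min(rmin, 0.0)
    rmax = max(rmax, 0.0)
    if rmax == rmin:
        # Degenerate range (all zeros): any positive scale works.
        return 1.0, 0
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = int(round(qmin - rmin / scale))
    # Clamp in case rounding pushed the zero point out of range.
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(x: float, scale: float, zero_point: int) -> int:
    """Map a float value to uint8, saturating at the type limits."""
    q = int(round(x / scale)) + zero_point
    return max(0, min(255, q))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Map a uint8 value back to its approximate float value."""
    return (q - zero_point) * scale


if __name__ == "__main__":
    # Example: a layer whose observed activations span [-128.0, 127.0].
    scale, zp = quant_params_uint8(-128.0, 127.0)
    print(scale, zp)
    print(quantize(0.0, scale, zp), dequantize(zp, scale, zp))
```

The point of the sketch is that the quantization parameters depend only on the observed range, which is why a minmax collection run over representative inputs has to come before the quantization step.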