Support for smaller quantization (8-bit or 4-bit at least)
Proryanator opened this issue · comments
This tool is amazing! I had tried scripting with the coreml library by hand and ran into all kinds of fun issues; with this tool, everything is orchestrated/abstracted for you. Excellent work!
I noticed, however, that quantization is only supported down to 16 bits, and I would love to have smaller options. I believe Core ML is capable of lower-bit quantization, so it may just be a matter of adding that call to this wrapper.
I did look in convert.py and saw a use_legacy_format flag being checked before the 16-bit quantization is performed. Is there something different about how the ML Program format handles lower-bit quantization, or does it not support it?
I realized that you can still quantize a Core ML model after it's been created, so this issue can probably be disregarded. I'll try quantizing some existing Core ML models I found. Having this tool do the conversion, then quantizing afterwards, should work!
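For anyone landing here, a post-conversion quantization pass along these lines is what I had in mind. This is just a sketch using coremltools; the file names and the 8-bit choice are my own assumptions, not anything this tool produces:

```python
# Sketch: quantize an already-converted Core ML model with coremltools.
# File names and nbits=8 are illustrative assumptions.
import coremltools as ct
from coremltools.models.neural_network import quantization_utils

# Load a model in the legacy neural network (.mlmodel) format.
model = ct.models.MLModel("ConvertedModel.mlmodel")

# Quantize its weights to 8 bits (lower values like 4 are also accepted).
quantized = quantization_utils.quantize_weights(model, nbits=8)
quantized.save("ConvertedModel-8bit.mlmodel")
```

For models converted to the newer ML Program format (.mlpackage), `quantization_utils` does not apply; newer coremltools releases expose `ct.optimize.coreml.linear_quantize_weights` for that case instead, which may be why the `use_legacy_format` flag gates the 16-bit path.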