huggingface / exporters

Export Hugging Face models to Core ML and TensorFlow Lite

Support for lower-bit quantization, 8-bit or 4-bit at least

Proryanator opened this issue · comments

This tool is amazing. I had tried scripting the conversion by hand with the coremltools library and ran into all kinds of fun issues; with this tool it is all orchestrated and abstracted for you. Excellent 👍

I noticed, however, that quantization is only supported down to 16 bits, and I would love to have smaller options. I believe Core ML is capable of lower bit widths, so it may just be a matter of adding that call to this wrapper.
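For context on what "smaller options" would compute, here is a minimal NumPy sketch of linear symmetric weight quantization (the basic scheme behind 8-bit weight compression). This is illustrative only, not the exporters or coremltools API; the function name and sample tensor are made up:

```python
import numpy as np

def quantize_linear_symmetric(w: np.ndarray, nbits: int = 8):
    """Illustrative sketch: map float weights to signed nbits integers
    with a single per-tensor scale (symmetric linear quantization)."""
    qmax = 2 ** (nbits - 1) - 1            # e.g. 127 for 8-bit
    scale = np.abs(w).max() / qmax         # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

w = np.array([0.5, -1.0, 0.25, 0.99], dtype=np.float32)
q, scale = quantize_linear_symmetric(w, nbits=8)
w_restored = q.astype(np.float32) * scale  # dequantized approximation of w
```

Each weight is stored as a small integer plus a shared scale, which is why 8-bit roughly halves the size of a float16 model; 4-bit halves it again at the cost of coarser steps.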

I did look in convert.py and I see a use_legacy_format flag being checked before the 16-bit quantization is applied. Is there something different about how the ML Program format handles, or performs, lower-bit quantization?

I realized that you can still quantize a Core ML model after it has been created, so this issue can probably be disregarded. I will try quantizing some existing Core ML models I found.

So having this tool do the conversion, then quantizing further afterwards, should work!