huggingface / exporters

Export Hugging Face models to Core ML and TensorFlow Lite

M2M100 Example?

fakerybakery opened this issue

Hello,
I'm trying to convert M2M100 to CoreML. I saw that it is partially supported, and I was wondering if there's any example script to do this.
Here's what I tried:

from exporters.coreml import export
from exporters.coreml.models import M2M100CoreMLConfig
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
model_ckpt = "facebook/m2m100_418M"
base_model = M2M100ForConditionalGeneration.from_pretrained(
    model_ckpt, torchscript=True
)
preprocessor = M2M100Tokenizer.from_pretrained(model_ckpt)
coreml_config = M2M100CoreMLConfig(
    base_model.config, 
    task="text2text-generation",
    use_past=False,
)
mlmodel = export(
    preprocessor, base_model, coreml_config
)

However, when trying to run this code, I get the following error:

ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds

Thank you in advance!

Hi @fakerybakery! I think the easiest way is to use this automated Space, which uses exporters under the hood: https://huggingface.co/spaces/huggingface-projects/transformers-to-coreml

You enter the model id (facebook/m2m100_418M), then select the task you want the model to perform (text-to-text generation), and the encoder and decoder Core ML models will be pushed to a new repo or submitted as a PR to the original one.

I just followed this procedure and pushed the result to this repo. Feel free to clone it if you need to, or repeat the process yourself using different conversion settings.
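If you'd rather run the conversion locally instead of using the Space, the sketch below shows roughly how the encoder and decoder could be exported as two separate Core ML models with exporters. I'm assuming here that the Core ML config accepts a seq2seq argument to select the encoder or decoder half (that's how I remember the conversion working), so please double-check it against the current API before relying on it.

from exporters.coreml import export
from exporters.coreml.models import M2M100CoreMLConfig
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model_ckpt = "facebook/m2m100_418M"
base_model = M2M100ForConditionalGeneration.from_pretrained(model_ckpt, torchscript=True)
preprocessor = M2M100Tokenizer.from_pretrained(model_ckpt)

# Export the encoder and decoder as two separate Core ML models.
# NOTE: the seq2seq argument is an assumption about the config API; verify it.
encoder_config = M2M100CoreMLConfig(
    base_model.config,
    task="text2text-generation",
    use_past=False,
    seq2seq="encoder",
)
encoder_mlmodel = export(preprocessor, base_model, encoder_config)
encoder_mlmodel.save("m2m100_encoder.mlpackage")

decoder_config = M2M100CoreMLConfig(
    base_model.config,
    task="text2text-generation",
    use_past=False,
    seq2seq="decoder",
)
decoder_mlmodel = export(preprocessor, base_model, decoder_config)
decoder_mlmodel.save("m2m100_decoder.mlpackage")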

Hope that helps.

Thank you so much!
I'm new to Core ML, so do you know if there's an example of how to implement a Core ML text2text-generation model in Swift? I checked huggingface/swift-coreml-transformers, but I couldn't find one.
Thank you!

The huggingface/swift-coreml-transformers project only provides two tokenizers: one for GPT and one for BERT.

If you want to use an M2M100 model, you have to find or create an M2M100 Swift tokenizer first.

I've also been trying text2text-generation, and the only solution I found is to use a GPT model like microsoft/DialoGPT-small.

I based my code on huggingface/swift-coreml-transformers and created madcato/huggingface-coreml to experiment with.

Sorry, it's a little messy. Take a look at my GPT2Model (it's almost the same as the original), which adds a new method that slices faster than the original.

What I recommend is to find a GPT or BERT model that can be exported.
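As a rough sketch (assuming exporters ships a GPT2CoreMLConfig and accepts the text-generation task name, which you should verify against the current release), exporting microsoft/DialoGPT-small would look something like this:

from exporters.coreml import export
from exporters.coreml.models import GPT2CoreMLConfig
from transformers import AutoTokenizer, GPT2LMHeadModel

model_ckpt = "microsoft/DialoGPT-small"
base_model = GPT2LMHeadModel.from_pretrained(model_ckpt, torchscript=True)
preprocessor = AutoTokenizer.from_pretrained(model_ckpt)

# Decoder-only models are exported as a single Core ML model.
coreml_config = GPT2CoreMLConfig(base_model.config, task="text-generation")
mlmodel = export(preprocessor, base_model, coreml_config)
mlmodel.save("dialogpt_small.mlpackage")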