nlp-with-transformers / notebooks

Jupyter notebooks for the Natural Language Processing with Transformers book

Home page: https://transformersbook.com/


Chapter 8 - Error when trying to export to ONNX format.

r0llingclouds opened this issue · comments

Information

The problem arises in chapter:

  • [ ] Introduction
  • [ ] Text Classification
  • [ ] Transformer Anatomy
  • [ ] Multilingual Named Entity Recognition
  • [ ] Text Generation
  • [ ] Summarization
  • [ ] Question Answering
  • [x] Making Transformers Efficient in Production
  • [ ] Dealing with Few to No Labels
  • [ ] Training Transformers from Scratch
  • [ ] Future Directions

Describe the bug

An error is raised when trying to convert the distilled model to ONNX format, in this cell (tokenizer is defined earlier in the chapter):

from transformers.convert_graph_to_onnx import convert
from pathlib import Path

model_ckpt = "transformersbook/distilbert-base-uncased-distilled-clinc"
onnx_model_path = Path("onnx/model.onnx")
convert(framework="pt", model=model_ckpt, tokenizer=tokenizer, 
        output=onnx_model_path, opset=12, pipeline_name="text-classification")

The following error is produced:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-61-766cfdd4c2c6> in <cell line: 6>()
      4 model_ckpt = "transformersbook/distilbert-base-uncased-distilled-clinc"
      5 onnx_model_path = Path("onnx/model.onnx")
----> 6 convert(framework="pt", model=model_ckpt, tokenizer=tokenizer, 
      7         output=onnx_model_path, opset=12, pipeline_name="text-classification")

1 frames
/usr/local/lib/python3.10/dist-packages/transformers/convert_graph_to_onnx.py in convert_pytorch(nlp, opset, output, use_external_format)
    279         ordered_input_names, model_args = ensure_valid_input(nlp.model, tokens, input_names)
    280 
--> 281         export(
    282             nlp.model,
    283             model_args,

TypeError: export() got an unexpected keyword argument 'use_external_data_format'

To Reproduce

Steps to reproduce the behavior:

  1. Run chapter 8 in Google Colab (V100 GPU)
  2. Execute that cell

Expected behavior

The cell should run without error and save the model in ONNX format.

To work around it, I opened the convert_graph_to_onnx.py file of the installed transformers library and commented out the use_external_data_format and enable_onnx_checker arguments passed to export (lines 289-290); newer PyTorch versions appear to have removed these deprecated keyword arguments from torch.onnx.export:

export(
    nlp.model,
    model_args,
    f=output.as_posix(),
    input_names=ordered_input_names,
    output_names=output_names,
    dynamic_axes=dynamic_axes,
    do_constant_folding=True,
    # use_external_data_format=use_external_format,
    # enable_onnx_checker=True,
    opset_version=opset,
)