warner-benjamin / fastxtend

Train fastai models faster (and other useful tools)

Home Page: https://fastxtend.benjaminwarner.dev


'HuggingFaceWrapper' object has no attribute 'save_pretrained'

aaronbannin opened this issue · comments

I'm following the examples for fine-tuning with HF Transformers.

After creating the Learner, the following call throws an AttributeError:

learn.model.save_pretrained("pythia-70m-trained")

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-11-d1c51ba9b240> in <cell line: 1>()
----> 1 learn.model.save_pretrained("pythia-70m-trained")

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in __getattr__(self, name)
   1693             if name in modules:
   1694                 return modules[name]
-> 1695         raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
   1696 
   1697     def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:

AttributeError: 'HuggingFaceWrapper' object has no attribute 'save_pretrained'

From reading the source, it looks like the model is encapsulated rather than inherited from, which would mean the desired method is bound to self.model?
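
To illustrate what I mean, here is a minimal sketch of the wrapping pattern I'm assuming (illustrative only, not fastxtend's actual code):

import torch.nn as nn

class ToyWrapper(nn.Module):
    # Composition: the Transformers model is stored as an attribute rather than
    # inherited from, so its methods (e.g. save_pretrained) don't exist on the wrapper.
    def __init__(self, hf_model):
        super().__init__()
        self.hf_model = hf_model

    def forward(self, **inputs):
        return self.hf_model(**inputs)

# ToyWrapper(model).save_pretrained(...)           -> AttributeError
# ToyWrapper(model).hf_model.save_pretrained(...)  -> works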

Running the following script in Google Colab:

from fastai.text.all import *
from fastxtend.text.all import *

from datasets import load_dataset
from transformers import AutoModel, AutoTokenizer, DataCollatorWithPadding

dataset = load_dataset("TinyPixel/orca-mini", split='train').train_test_split(test_size=0.3)

model_name = "EleutherAI/pythia-70m"
model = AutoModel.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained('distilroberta-base')

batch_size = 5

data_collator = DataCollatorWithPadding(tokenizer)
train_dataloader = HuggingFaceLoader(
    dataset['train'].with_format('torch'), batch_size=batch_size,
    collate_fn=data_collator, shuffle=True,
    drop_last=True, num_workers=num_cpus()
)

valid_dataloader = HuggingFaceLoader(
    dataset['test'].with_format('torch'), batch_size=16,
    collate_fn=data_collator, shuffle=False,
    drop_last=False, num_workers=num_cpus()
)

dls = DataLoaders(train_dataloader, valid_dataloader)

learn = Learner(
    dls,
    model,
    loss_func=HuggingFaceLoss(),
    cbs=HuggingFaceCallback()
).to_bf16()
  
learn.model.save_pretrained("pythia-70m-trained")

The documentation is a bit unclear here. Currently, HuggingFaceCallback works by wrapping the Transformers model in a HuggingFaceWrapper to bridge the incompatible input and output formats of fastai and Transformers, which is why save_pretrained isn't reachable on learn.model during training. When training is finished, the Transformers model is automatically unwrapped and is available from learn.model.
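
So with the current release, one option is simply to save after training completes, when learn.model is the plain Transformers model again. Roughly (the epoch count is just for illustration):

learn.fit_one_cycle(1)
learn.model.save_pretrained("pythia-70m-trained")  # works once training has finished and the model is unwrapped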

Before training is finished, the model needs to be accessed from learn.model.hf_model. For example, to save a mid-training checkpoint you'd use:

learn.model.hf_model.save_pretrained("pythia-70m-trained")
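
If you want to checkpoint on a schedule instead, a small custom callback along these lines would also work (hypothetical helper, names are illustrative):

from fastai.callback.core import Callback

class SaveHFCheckpoint(Callback):
    "Save the still-wrapped Transformers model at the end of every epoch."
    def __init__(self, path): self.path = path
    def after_epoch(self):
        # While HuggingFaceCallback is active, the Transformers model lives at learn.model.hf_model
        self.learn.model.hf_model.save_pretrained(f"{self.path}-epoch{self.epoch}")

learn = Learner(dls, model, loss_func=HuggingFaceLoss(),
                cbs=[HuggingFaceCallback(), SaveHFCheckpoint("pythia-70m")])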

#22 will resolve this by adding a new attribute for accessing the original Transformers model.

learn.hf_model.save_pretrained("pythia-70m-trained")

The model will no longer be unwrapped automatically after training is over. learn.hf_model will be created by the HuggingFaceCallback during after_create, so it will be accessible from the beginning.
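
Once #22 lands, usage should look roughly like this (a sketch based on the description above, not final API docs):

learn = Learner(dls, model, loss_func=HuggingFaceLoss(), cbs=HuggingFaceCallback()).to_bf16()
learn.hf_model.save_pretrained("pythia-70m-untrained")   # available immediately, created in after_create
learn.fit_one_cycle(1)
learn.hf_model.save_pretrained("pythia-70m-trained")     # still available afterwards; the model stays wrapped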