warner-benjamin / fastxtend

Train fastai models faster (and other useful tools)

Home Page: https://fastxtend.benjaminwarner.dev


'HuggingFaceWrapper' object has no attribute 'save_pretrained'

aaronbannin opened this issue · comments

I'm following the examples for fine-tuning with HF Transformers.

After creating the Learner, the following call throws an AttributeError:

learn.model.save_pretrained("pythia-70m-trained")

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-11-d1c51ba9b240> in <cell line: 1>()
----> 1 learn.model.save_pretrained("pythia-70m-trained")

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in __getattr__(self, name)
   1693             if name in modules:
   1694                 return modules[name]
-> 1695         raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
   1696 
   1697     def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:

AttributeError: 'HuggingFaceWrapper' object has no attribute 'save_pretrained'

From reading the source, it looks like the model is encapsulated rather than inherited from, which would mean the desired method is bound to self.model?
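
To illustrate what I mean, here is a minimal sketch of the wrapping pattern I'm assuming (illustrative only, not fastxtend's actual code):

import torch.nn as nn

class ToyWrapper(nn.Module):
    # Composition: the Transformers model is stored as an attribute rather than
    # inherited from, so its methods (e.g. save_pretrained) don't exist on the wrapper.
    def __init__(self, hf_model):
        super().__init__()
        self.hf_model = hf_model

    def forward(self, **inputs):
        return self.hf_model(**inputs)

# ToyWrapper(model).save_pretrained(...)           -> AttributeError
# ToyWrapper(model).hf_model.save_pretrained(...)  -> works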

Running the following script in Google Colab:

from fastai.text.all import *
from fastxtend.text.all import *

from datasets import load_dataset
from transformers import AutoModel, AutoTokenizer, DataCollatorWithPadding

dataset = load_dataset("TinyPixel/orca-mini", split='train').train_test_split(test_size=0.3)

model_name = "EleutherAI/pythia-70m"
model = AutoModel.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained('distilroberta-base')

batch_size = 5

data_collator = DataCollatorWithPadding(tokenizer)
train_dataloader = HuggingFaceLoader(
    dataset['train'].with_format('torch'), batch_size=batch_size,
    collate_fn=data_collator, shuffle=True,
    drop_last=True, num_workers=num_cpus()
)

valid_dataloader = HuggingFaceLoader(
    dataset['test'].with_format('torch'), batch_size=16,
    collate_fn=data_collator, shuffle=False,
    drop_last=False, num_workers=num_cpus()
)

dls = DataLoaders(train_dataloader, valid_dataloader)

learn = Learner(
    dls,
    model,
    loss_func=HuggingFaceLoss(),
    cbs=HuggingFaceCallback()
).to_bf16()
  
learn.model.save_pretrained("pythia-70m-trained")

The documentation is a bit unclear here. Currently, HuggingFaceCallback works by wrapping the Transformers model in a HuggingFaceWrapper to bridge the incompatible input and output formats of fastai and Transformers, which is why save_pretrained isn't reachable on learn.model during training. When training is finished, the Transformers model is automatically unwrapped and is available from learn.model.
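
So with the current release, one option is simply to save after training completes, when learn.model is the plain Transformers model again. Roughly (the epoch count is just for illustration):

learn.fit_one_cycle(1)
learn.model.save_pretrained("pythia-70m-trained")  # works once training has finished and the model is unwrapped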

Before training is finished, the model needs to be accessed from learn.model.hf_model. For example, to save a mid-training checkpoint you'd use:

learn.model.hf_model.save_pretrained("pythia-70m-trained")
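
If you want to checkpoint on a schedule instead, a small custom callback along these lines would also work (hypothetical helper, names are illustrative):

from fastai.callback.core import Callback

class SaveHFCheckpoint(Callback):
    "Save the still-wrapped Transformers model at the end of every epoch."
    def __init__(self, path): self.path = path
    def after_epoch(self):
        # While HuggingFaceCallback is active, the Transformers model lives at learn.model.hf_model
        self.learn.model.hf_model.save_pretrained(f"{self.path}-epoch{self.epoch}")

learn = Learner(dls, model, loss_func=HuggingFaceLoss(),
                cbs=[HuggingFaceCallback(), SaveHFCheckpoint("pythia-70m")])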

#22 will resolve this by adding a new attribute for accessing the original Transformers model.

learn.hf_model.save_pretrained("pythia-70m-trained")

The model will no longer be unwrapped automatically after training is over. learn.hf_model will be created by the HuggingFaceCallback during after_create, so it will be accessible from the beginning.
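
Once #22 lands, usage should look roughly like this (a sketch based on the description above, not final API docs):

learn = Learner(dls, model, loss_func=HuggingFaceLoss(), cbs=HuggingFaceCallback()).to_bf16()
learn.hf_model.save_pretrained("pythia-70m-untrained")   # available immediately, created in after_create
learn.fit_one_cycle(1)
learn.hf_model.save_pretrained("pythia-70m-trained")     # still available afterwards; the model stays wrapped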