'HuggingFaceWrapper' object has no attribute 'save_pretrained'
aaronbannin opened this issue
I'm following the examples for fine-tuning with HF Transformers.
After creating the Learner, calling save_pretrained on its model throws the following error:
learn.model.save_pretrained("pythia-70m-trained")
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-11-d1c51ba9b240> in <cell line: 1>()
----> 1 learn.model.save_pretrained("pythia-70m-trained")
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in __getattr__(self, name)
1693 if name in modules:
1694 return modules[name]
-> 1695 raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
1696
1697 def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:
AttributeError: 'HuggingFaceWrapper' object has no attribute 'save_pretrained'
From reading the source, it looks like the model is encapsulated (composition) rather than inherited, which would mean the desired method is bound to self.model?
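Roughly this pattern, I assume (a hypothetical sketch, not fastxtend's actual source):

import torch.nn as nn

class WrapperSketch(nn.Module):
    # Hypothetical composition-style wrapper: the HF model is an attribute,
    # not a base class, so its methods are not inherited by the wrapper
    def __init__(self, hf_model):
        super().__init__()
        self.hf_model = hf_model

    def forward(self, **batch):
        return self.hf_model(**batch)

# nn.Module.__getattr__ only resolves parameters, buffers, and submodules,
# so wrapper.save_pretrained(...) raises the AttributeError shown above.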
Running the following script in Google Colab:
from fastai.text.all import *
from fastxtend.text.all import *
from datasets import load_dataset
from transformers import AutoModel, AutoTokenizer, DataCollatorWithPadding
dataset = load_dataset("TinyPixel/orca-mini", split='train').train_test_split(test_size=0.3)
model_name = "EleutherAI/pythia-70m"
model = AutoModel.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained('distilroberta-base')
batch_size = 5
data_collator = DataCollatorWithPadding(tokenizer)

train_dataloader = HuggingFaceLoader(
    dataset['train'].with_format('torch'), batch_size=batch_size,
    collate_fn=data_collator, shuffle=True,
    drop_last=True, num_workers=num_cpus()
)
valid_dataloader = HuggingFaceLoader(
    dataset['test'].with_format('torch'), batch_size=16,
    collate_fn=DataCollatorWithPadding(tokenizer), shuffle=False,
    drop_last=False, num_workers=num_cpus()
)

dls = DataLoaders(train_dataloader, valid_dataloader)
learn = Learner(
    dls,
    model,
    loss_func=HuggingFaceLoss(),
    cbs=HuggingFaceCallback()
).to_bf16()
learn.model.save_pretrained("pythia-70m-trained")
The documentation is a bit unclear, but currently the HuggingFaceCallback works by wrapping the Transformers model to handle the incompatible format between fastai and Transformers. When training is finished, the Transformers model is automatically unwrapped and available from learn.model.
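So once a fit call has completed, the line from the issue works unchanged, for example:

learn.fit_one_cycle(1)  # any training run; the model is unwrapped afterwards
learn.model.save_pretrained("pythia-70m-trained")  # learn.model is the HF model again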
Before training is finished, the model needs to be accessed from learn.model.hf_model. For example, to save a mid-training checkpoint you'd use:
learn.model.hf_model.save_pretrained("pythia-70m-trained")
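If you want a checkpoint saved automatically each epoch, a small custom callback along these lines should work (a sketch assuming the wrapper layout described above; SaveHFCheckpoint is not part of fastxtend):

class SaveHFCheckpoint(Callback):
    # Hypothetical callback; Callback is already in scope via fastai.text.all
    def __init__(self, path): self.path = path
    def after_epoch(self):
        # mid-training, learn.model is the wrapper and the HF model sits inside it
        self.learn.model.hf_model.save_pretrained(f"{self.path}-epoch{self.epoch}")

Pass it alongside the existing callback, e.g. cbs=[HuggingFaceCallback(), SaveHFCheckpoint('pythia-70m')].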
#22 will resolve this by adding a new attribute for accessing the original Transformers model:
learn.hf_model.save_pretrained("pythia-70m-trained")
The model will no longer be unwrapped automatically after training is over. learn.hf_model will be created by the HuggingFaceCallback during after_create, so it will be accessible from the beginning.
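In other words, something like this (a sketch of the described mechanism, not the final #22 implementation; the wrapper constructor signature is assumed):

class HuggingFaceCallbackSketch(Callback):
    def after_create(self):
        hf_model = self.learn.model                      # original Transformers model
        self.learn.model = HuggingFaceWrapper(hf_model)  # assumed constructor
        self.learn.hf_model = hf_model                   # available before, during, and after training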