salesforce / CodeGen

CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.


warning information

yz-qiang opened this issue

CodeGen is a powerful model.

When I use the model with the following code:

from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")

text = "def hello_world():"
input_ids = tokenizer(text, return_tensors="pt").input_ids

generated_ids = model.generate(input_ids, max_length=128)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))

However, it prints the following warnings:

The attention mask and the pad token id were not set.  As a consequence, you may observe unexpected behavior.  Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.

Do you know how I can fix it? Plus, what happens if I don't fix it?

Thank you very much!

Adding `pad_token_id = 50256` fixes it, for example:

import torch
from transformers import set_seed

pad_token_id = 50256
set_seed(42, deterministic=True)
device = torch.device('cuda:0')
... ...
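
For completeness, here is a minimal sketch of the full fix, assuming the same model and prompt as in the question: keep the tokenizer's full output so you have its `attention_mask`, pass that mask to `generate`, and set `pad_token_id` explicitly (CodeGen has no dedicated pad token, so the EOS id 50256 is the usual choice). This is one way to silence the warnings, not the only one.

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")

text = "def hello_world():"
# Keep the whole tokenizer output: it contains both input_ids and attention_mask.
inputs = tokenizer(text, return_tensors="pt")

generated_ids = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,  # silences the attention-mask warning
    pad_token_id=tokenizer.eos_token_id,   # 50256; silences the pad-token warning
    max_length=128,
)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))

As for what happens if you don't fix it: for a single unpadded prompt like this one, the attention mask is all ones anyway, so the output is typically unaffected. The settings start to matter once you batch prompts of different lengths, because padding tokens must then be masked out to get reliable results.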