ucinlp / autoprompt

AutoPrompt: Automatic Prompt Construction for Masked Language Models.

A question about the meaning of the letter 'Ġ'

jasonyin718 opened this issue

Hello Taylor,
Thank you again for answering my question last time. With the command you provided, I can now run create_trigger.py normally. I have another question, this time about the verbalizer setting for sentiment analysis. In the command you provided, every verbalizer word begins with the letter 'Ġ'. I didn't know what it meant, so I tried removing it to find out. However, after I removed the 'Ġ' and ran create_trigger.py, the program went wrong and I could not obtain the "Best dev metrics". Could you please explain the meaning and usage of the letter 'Ġ'?
Looking forward to your reply!

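The "##" (BERT) and "Ġ" (RoBERTa) symbols work in opposite ways. In BERT's WordPiece vocabulary, a "##" prefix means that the rest of the token should be attached to the previous one, without a space. In RoBERTa's byte-level BPE vocabulary, "Ġ" encodes a space before the token, so a token that starts with "Ġ" begins a new word, and a token without it is attached to the previous one. "Ġgood" and "good" are therefore different vocabulary entries, and a verbalizer word that follows a space in the prompt needs the "Ġ" form. I'd recommend reading the following articles: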

https://huggingface.co/docs/transformers/tokenizer_summary
https://huggingface.co/course/chapter6/5?fw=pt
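
To make the two conventions concrete, here is a minimal sketch in plain Python. The helper names (join_bpe, join_wordpiece) and the token lists are hypothetical and hand-written for illustration; they are not real tokenizer output and not part of autoprompt or transformers.

```python
def join_bpe(tokens):
    """Rebuild text from RoBERTa/GPT-2-style tokens, where 'Ġ' encodes a preceding space."""
    return "".join(tokens).replace("Ġ", " ")


def join_wordpiece(tokens):
    """Rebuild text from BERT-style tokens, where '##' marks a continuation of the previous token."""
    words = []
    for tok in tokens:
        if tok.startswith("##") and words:
            words[-1] += tok[2:]  # glue onto the previous piece, no space
        else:
            words.append(tok)     # start of a new word
    return " ".join(words)


# Hand-written example splits (assumed for illustration, not produced by a real tokenizer).
roberta_like = ["The", "Ġmovie", "Ġwas", "Ġunbel", "iev", "able"]
bert_like = ["the", "movie", "was", "un", "##bel", "##iev", "##able"]

print(join_bpe(roberta_like))     # The movie was unbelievable
print(join_wordpiece(bert_like))  # the movie was unbelievable
```

To check how a real vocabulary splits a verbalizer word, calling tokenize(" good") and tokenize("good") on a RoBERTa tokenizer loaded through transformers should show the form with and without "Ġ".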