salesforce / ctrl

Conditional Transformer Language Model for Controllable Generation

Home Page: https://arxiv.org/abs/1909.05858

multiple tags as control code

leejason opened this issue · comments

Does the following mean one training record per tag when a record has multiple tags? Say my records have 10 tags on average: would the fine-tuning data grow to 10x its original size? If that understanding is correct, I plan to give it a try. Thank you for your advice.

The way it's trained, the current checkpoints don't support that. However, there is nothing preventing one from fine-tuning (or re-training) CTRL to do that. I'm fairly sure that the model will learn to pick it up.

Originally posted by @keskarnitish in #33 (comment)

It's up to you really; it depends on what you want to do at the end.

If the tags form a hierarchy (like [Books, Author, Title]), then you don't need to replicate the data.
If each tag is a label and the same data carries multiple labels (like Wikipedia Stoicism is ... and Philosophy Stoicism is..), then you would probably benefit from the replication.
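
The two cases above can be sketched as follows. This is only an illustration of how the training text might be laid out, not the repository's actual data pipeline: the helper names `hierarchical_record` and `replicated_records` and the placeholder texts are hypothetical, and it assumes the control codes are simply prepended to the text, separated by spaces.

```python
from typing import List


def hierarchical_record(codes: List[str], text: str) -> str:
    """Hierarchy case: prefix the text once with all codes, in order."""
    return " ".join(codes + [text])


def replicated_records(labels: List[str], text: str) -> List[str]:
    """Multi-label case: one copy of the text per label, each with one prefix."""
    return [f"{label} {text}" for label in labels]


# Hierarchy: a single record, so the data size does not change.
book = hierarchical_record(["Books", "Author", "Title"], "placeholder body text")

# Multiple labels: the record is replicated once per label.
stoic = replicated_records(["Wikipedia", "Philosophy"], "placeholder article text")

print(book)
for record in stoic:
    print(record)
```

Under the replication layout, the number of training records equals the total number of labels across all texts, so an average of 10 labels per text gives roughly 10x the original record count.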

Closing for now, reopen as needed.