salesforce / ctrl

Conditional Transformer Language Model for Controllable Generation

Home Page: https://arxiv.org/abs/1909.05858

Running the model on TPUs?

vessenes opened this issue

Hi,

I have the 256 and 512 models working on GCP with a Tesla V100. Text generates, but slowly, and I'd like faster generation out of the system. I thought running CTRL on TPUs might get me faster text, but I have no idea how to do that.

Do you have an incantation or pointer that would let me point CTRL at a TPU?

Second this!

I haven't quite figured out how to get TPUs to be faster than GPUs for inference. I'll probably look into this soon. It gets especially complicated with top-k/nucleus sampling and other add-ons. It seems others have seen the same behavior.
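
For readers unfamiliar with the sampling add-ons mentioned above, here is a minimal sketch of top-k and nucleus (top-p) filtering in plain Python. It is not CTRL's code: the function names (`softmax`, `top_k_filter`, `nucleus_filter`, `sample`) are hypothetical and chosen for illustration. Both filters sort the vocabulary on every step, and nucleus filtering keeps a number of tokens that depends on the data. That may be part of why these steps are awkward to run efficiently on an accelerator, though this is an assumption and not something the thread confirms.

```python
import math
import random


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def _keep_and_renormalize(probs, kept):
    """Zero out every token not in `kept`, then rescale to sum to 1."""
    filtered = [p if i in kept else 0.0 for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]


def top_k_filter(probs, k):
    """Keep only the k most probable tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return _keep_and_renormalize(probs, set(order[:k]))


def nucleus_filter(probs, top_p):
    """Keep the smallest set of top tokens whose cumulative mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return _keep_and_renormalize(probs, kept)


def sample(probs, rng=random):
    """Draw one token index according to the (filtered) distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Toy 4-token vocabulary; a real model would supply the logits.
    probs = softmax([2.0, 1.0, 0.5, -1.0])
    next_token = sample(nucleus_filter(top_k_filter(probs, 3), 0.9))
    print(next_token)
```

In a real generation loop this filtering runs once per generated token, on the logits for the last position.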