yitu-opensource / ConvBert

Train on GPU instead of TPU - different distribution strategies

PhilipMay opened this issue

Hi,
many thanks for this nice new model and your research.
We would like to train ConvBERT on GPUs rather than TPUs.
Do you have any experience or tips on how to do this?
We have concerns regarding the different distribution strategies
between GPUs and TPUs.
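
To make the concern concrete, here is a rough TF2 sketch of what we mean (illustrative only, not your actual training code; the toy Keras model just stands in for ConvBERT):

```python
import tensorflow as tf

# On TPU, training typically runs under a TPUStrategy, e.g.:
#   resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="...")
#   tf.config.experimental_connect_to_cluster(resolver)
#   tf.tpu.experimental.initialize_tpu_system(resolver)
#   strategy = tf.distribute.TPUStrategy(resolver)

# On one machine with one or more GPUs, MirroredStrategy instead does
# synchronous data-parallel training across all visible GPUs:
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Variables created inside the scope are mirrored across replicas;
    # a toy model stands in for the actual ConvBERT network here.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(2),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-4),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )
```

Our question is essentially whether swapping the strategy like this is all that is needed, or whether the training code has TPU-specific pieces beyond it.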

Thanks
Philip

Well, in the README you write:

The code is tested on a V100 GPU.

This means that pretraining works on multiple GPUs, right?

Hi, thanks for your interest.
Our code has only been tested on a single V100 GPU. If you are looking for multi-GPU support rather than TPU training, you may refer to https://huggingface.co/transformers/model_doc/convbert.html, which implements our model in PyTorch.