Rayhane-mamah / Tacotron-2

Tensorflow implementation of Google's Tacotron-2


What params should I tweak in order to prevent the model crashing during training?

AliBharwani opened this issue · comments

I'm trying to train this model on an EC2 c5.xlarge instance (4 vCPUs, 8 GB of RAM). After setting up the data and preprocessing, I try to train the Tacotron-2 model. It gets as far as printing "Generated 20 test batches of size 32 in 23.134 sec" and then hangs. At this point I've tried sshing in from a different terminal, but that always freezes, and eventually the training terminal prints "Killed". I'm guessing it's probably because of the limited resources on the machine. Is there any way to get around this?
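An abrupt "Killed" with no Python traceback is the classic signature of the Linux kernel's OOM killer ending the process when the machine runs out of memory. A small sketch (assuming a Linux instance; the helper name is illustrative) to check available RAM before launching a run:

```python
def available_mb(meminfo_path="/proc/meminfo"):
    """Return available system memory in MB, parsed from /proc/meminfo (Linux only)."""
    with open(meminfo_path) as f:
        for line in f:
            # MemAvailable is the kernel's estimate of memory usable without swapping.
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) // 1024  # value is reported in kB
    return None  # field absent on very old kernels

if __name__ == "__main__":
    print("Available RAM: %s MB" % available_mb())
```

You can also confirm an OOM kill after the fact with `dmesg | grep -i kill` on most Linux systems.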

@AliBharwani Use a GPU instance with more RAM, for example a g2.2xlarge.

Also reduce the batch size for Tacotron training in hparams.py from 32 to 16.
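Concretely, this is a one-line change in the repo's hparams.py (the parameter name `tacotron_batch_size` is assumed from this repo's hyperparameter file; verify against your copy). A minimal sketch:

```python
# hparams.py (excerpt) -- assumed parameter name; check your copy of the repo.
# Halving the batch size roughly halves the number of utterances (and their
# spectrograms) held in memory per training step, at the cost of noisier gradients.
tacotron_batch_size = 16  # was 32
```

If memory is still tight after this, shortening `max_mel_frames` (which caps utterance length) is another common lever in this implementation, since memory use scales with batch size times sequence length.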