wengong-jin / hgraph2graph

Hierarchical Generation of Molecular Graphs using Structural Motifs

CUDA memory usage high for large --num_decode

JamesALumley opened this issue · comments

Memory usage is high when generating a large number of output SMILES from a single input. Perhaps the intent is to generate only a few output SMILES per input, but when generating many SMILES from one input the code has a significant usability issue, shown in the error below (it relies on a memory-bound data structure). In this case the model was trained on ~20K molecules -> 100K pairs, so it should in theory have enough diversity to generate a large number of variations of the input SMILES:

python decode.py --test single_smiles.smi --vocab training.vocab --model ./models/model.10 --num_decode 10000 --batch_size 1

Traceback (most recent call last):
File "../hgraph2graph/decode.py", line 69, in
new_mols = model.translate(batch[1], args.num_decode, args.enum_root, args.greedy)
File "/hpc/scratch/nvme1/HeirVAE/hgraph2graph/hgraph/hgnn.py", line 96, in translate
return self.decoder.decode( (root_vecs, z_tree_vecs, z_graph_vecs), greedy=greedy)
File "/hpc/scratch/nvme1/HeirVAE/hgraph2graph/hgraph/decoder.py", line 322, in decode
hinter = HTuple( mess = self.rnn_cell.get_init_state(tree_tensors[1]) )
File "/hpc/scratch/nvme1/HeirVAE/hgraph2graph/hgraph/rnn.py", line 76, in get_init_state
c = torch.zeros(len(fmess), self.hidden_size, device=fmess.device)
RuntimeError: CUDA out of memory. Tried to allocate 2.01 GiB (GPU 0; 10.92 GiB total capacity; 8.94 GiB already allocated; 1.29 GiB free; 9.21 GiB reserved in total by PyTorch)

Hi,

With --num_decode 10000, the code creates a single batch of size 10000. It is hard for the model to decode 10000 molecules in one batch, so you need to split the 10000 decoding attempts into several smaller batches.
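The splitting suggested above can be sketched as below. The call to `model.translate` mirrors the one in the traceback from decode.py; the chunk size of 500 and the helper name `chunked` are illustrative, not part of the repository's API.

```python
def chunked(total, chunk_size):
    """Yield chunk sizes that sum to `total`, each at most `chunk_size`."""
    while total > 0:
        yield min(chunk_size, total)
        total -= chunk_size

# Sketch (assumes `model`, `batch`, and `args` as set up in decode.py):
# decode 10000 molecules 500 at a time instead of in one batch.
#
# all_mols = []
# for n in chunked(10000, 500):
#     all_mols.extend(model.translate(batch[1], n, args.enum_root, args.greedy))
```

Each `translate` call then only allocates hidden-state tensors for 500 decoding attempts at a time, keeping peak CUDA memory roughly constant regardless of the total requested.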