Saving memory to avoid CUDA OOM with GTX-1080Ti
daisukelab opened this issue · comments
Hi,
Thank you very much for sharing this repository; it makes it easy to try things out quickly.
So far, though, I'm struggling with the OOM error below.
Is there any way to reduce memory usage?
- Using Omniglot dataset.
- Tried DataLoader num_workers=1, but the error still occurs.
$ python proto_nets.py --dataset omniglot
omniglot_nt=1_kt=60_qt=5_nv=1_kv=5_qv=1
Indexing background...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19280/19280 [00:00<00:00, 281958.22it/s]
Indexing evaluation...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 13180/13180 [00:00<00:00, 302261.60it/s]
Training Prototypical network on omniglot...
Begin training...
Epoch 1: 2%|██▎ | 2/100 [00:06<06:29, 3.98s/it, loss=57.9, categorical_accuracy=0.35]Traceback (most recent call last):
File "proto_nets.py", line 129, in <module>
'distance': args.distance},
File "/home/me/lab/few-shot/few_shot/train.py", line 113, in fit
loss, y_pred = fit_function(model, optimiser, loss_fn, x, y, **fit_function_kwargs)
File "/home/me/lab/few-shot/few_shot/proto.py", line 67, in proto_net_episode
loss.backward()
File "/home/me/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 102, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/home/me/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: CUDA out of memory. Tried to allocate 316.50 MiB (GPU 0; 10.91 GiB total capacity; 9.70 GiB already allocated; 95.38 MiB free; 245.36 MiB cached)
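For reference, my understanding is that DataLoader num_workers only controls host-side worker processes, so it doesn't reduce GPU memory. A back-of-the-envelope sketch of the input batch for one episode (assuming 1x28x28 float32 Omniglot images and the default n=1 support / q=5 query shots; the function name and defaults here are illustrative, not from the repo):

```python
# Back-of-the-envelope for the raw input batch of one training episode.
# Assumptions: Omniglot images resized to 1 x 28 x 28, stored as
# float32 (4 bytes), with n support and q query shots per class.
def episode_bytes(k, n=1, q=5, channels=1, height=28, width=28, bytes_per_float=4):
    """Bytes for the float32 input tensor of a single episode."""
    images = k * (n + q)  # total images fed through the encoder
    return images * channels * height * width * bytes_per_float

print(episode_bytes(60))  # default --k-train 60: 360 images, 1,128,960 bytes
print(episode_bytes(5))   # --k-train 5: 30 images, 94,080 bytes
```

The inputs themselves are tiny, so the OOM presumably comes from the encoder's activations and gradients, which also scale with k * (n + q).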
Hi,
Sorry to bother you once again.
It's always like Murphy's law: I found what was wrong right after posting this issue...
--k-train was too large; with a smaller value it now runs successfully.
$ python proto_nets.py --dataset omniglot --k-train 5
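That makes sense in hindsight: in a prototypical-network episode, memory scales with --k-train both through the number of embedded images, k * (n + q), and through the query-to-prototype distance matrix, which grows roughly quadratically in k. A rough sketch of the arithmetic (shapes follow the usual proto-net recipe and are assumed, not read from this repo's code):

```python
# Sketch of how episode tensor sizes scale with k (the number of
# classes per training episode, i.e. --k-train). Shapes are assumed
# from the standard prototypical-networks setup.
def episode_tensor_sizes(k, n=1, q=5):
    images = k * (n + q)            # images embedded per episode
    distance_entries = (q * k) * k  # query-to-prototype distance matrix
    return images, distance_entries

print(episode_tensor_sizes(60))  # (360, 18000)
print(episode_tensor_sizes(5))   # (30, 125)
```

Dropping --k-train from 60 to 5 cuts the embedded batch by 12x and the distance matrix by 144x, which would explain why the OOM disappears.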
BTW, I like the Keras-like API implementation; it's great. :)
Thanks again.
No worries, let me know if you have any more issues.
I recently turned the Keras-like API into the pip package olympic. Docs are here: https://olympic-pytorch.readthedocs.io/en/latest/