yunjey / show-attend-and-tell

TensorFlow Implementation of "Show, Attend and Tell"


Training problem.....

matiqul opened this issue · comments

When I train the system I get the following error. I am using a single GPU. How can I solve this problem?
Traceback (most recent call last):
File "/home/kunolab/Matiqul/Show-Attend-Tell/train.py", line 29, in
main()
File "/home/kunolab/Matiqul/Show-Attend-Tell/train.py", line 10, in main
data = load_coco_data(data_path='./data', split='train')
File "/home/kunolab/Matiqul/Show-Attend-Tell/core/utils.py", line 13, in load_coco_data
data['features'] = hickle.load(os.path.join(data_path, '%s.features.hkl' %split))
File "/usr/local/lib/python2.7/dist-packages/hickle.py", line 561, in load
py_container = _load(py_container, h_root_group)
File "/usr/local/lib/python2.7/dist-packages/hickle.py", line 674, in _load
py_subcontainer = _load(py_subcontainer, h_node)
File "/usr/local/lib/python2.7/dist-packages/hickle.py", line 681, in _load
subdata = load_dataset(h_group)
File "/usr/local/lib/python2.7/dist-packages/hickle.py", line 599, in load_dataset
return np.array(data)
MemoryError


I had the same problem earlier but couldn't resolve it and moved ahead. I think it needs more memory (48 GB+). Do you have enough?
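Note that the traceback above ends in hickle's `np.array(data)` call, so the MemoryError is about system RAM, not GPU memory. A rough back-of-the-envelope check (assuming the repo's usual setup: ~82,783 MS COCO training images, VGG-19 conv5_3 features of shape (196, 512) per image, stored as float32) shows why 48 GB+ is plausible:

```python
# Hypothetical size estimate for the training feature array.
# Assumptions (not confirmed in this thread): ~82,783 COCO train images,
# per-image feature shape (196, 512), dtype float32 (4 bytes/element).
num_images = 82783
feature_size = 196 * 512          # 14x14 spatial locations x 512 channels
bytes_per_element = 4             # float32

bytes_needed = num_images * feature_size * bytes_per_element
print(bytes_needed / 1024**3)     # about 31 GiB for the array alone
```

Loading can temporarily need even more than this, since `np.array(data)` may copy the data while hickle deserializes it, which is consistent with the 48 GB+ suggestion.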

This is my GPU configuration. I think there is enough memory, but I still get this error:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.90 Driver Version: 384.90 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX TIT... Off | 00000000:01:00.0 On | N/A |
| 22% 37C P8 16W / 250W | 7760MiB / 12205MiB | 0% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1150 G /usr/bin/X 167MiB |
| 0 2212 G compiz 139MiB |
| 0 6050 C python 7440MiB |
+-----------------------------------------------------------------------------+

Upgrade your system memory (RAM) to 64 GB and you will be fine.

@HongyuanL Are there other ways to solve the memory error?


@SinDongHwan If you work in Linux, you can also increase the swap space.
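For reference, a minimal sketch of adding swap on Linux (the 8 GB size is illustrative; `swapon` requires root, and the change does not persist across reboots unless you also add an `/etc/fstab` entry):

```shell
# Create and enable an 8 GB swap file (run as root); size is an example
fallocate -l 8G /swapfile
chmod 600 /swapfile   # restrict access, as required by swapon
mkswap /swapfile      # write the swap signature
swapon /swapfile      # enable the swap file
free -h               # verify the new swap space appears
```

Swap is much slower than RAM, so this only works around the MemoryError; loading the ~31 GB feature array will still be slow.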

I have a 64 GB machine with a GeForce GTX 1070 and I still see the same issue. I have increased the swap space using the instructions on this page (https://askubuntu.com/questions/178712/how-to-increase-swap-space) and still see the same issue.
Has anyone been able to run this on a GPU with similar configuration?

Has anyone had any luck solving the memory error? On which line are you getting it?

Change the type of the features from float32 to float16; it can save a lot of memory.
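A minimal sketch of that idea: casting the feature array to float16 halves its memory footprint (the shape below is illustrative, following the repo's per-image (196, 512) features; whether the precision loss matters for training is not verified here):

```python
import numpy as np

# Illustrative feature array: 10 images x 196 locations x 512 channels
features = np.random.rand(10, 196, 512).astype(np.float32)
features_fp16 = features.astype(np.float16)  # 2 bytes/element vs 4

print(features.nbytes)       # float32 size in bytes
print(features_fp16.nbytes)  # exactly half the float32 size
```

The cast would go in prepro.py before the features are saved with hickle, so the smaller dtype also applies at load time.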

TangZwei Can you explain the prepro.py file to me? I mean, why are we not simply extracting the features only?