TengdaHan / DPC

Video Representation Learning by Dense Predictive Coding. Tengda Han, Weidi Xie, Andrew Zisserman.

Optimizing dataloading

ruoshiliu opened this issue:

Hi there,

Thanks for sharing the code. When training DPC, I found that data loading takes a very long time, especially for large datasets. Do you have any suggestions for optimizing the dataloader in DPC?

Thanks!

Hi, in our setting we store the video frames on an SSD, and data loading keeps up with the GPU computation (GPU utilization is close to 100%).
Alternatively, you can try storing the video frames in LMDB, which will be faster on an HDD as well.
Where do you store the frames? SSD or HDD?
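
For anyone checking whether the dataloader is really the bottleneck, here is a rough sketch that measures how much of the wall-clock time a training loop spends waiting for the next batch. It is not part of the DPC code: `data_wait_fraction` and `step_fn` are hypothetical names, and `batches` can be any iterable (for example a PyTorch `DataLoader`). A fraction close to 1 suggests the loop is I/O-bound; a fraction close to 0 suggests the loader keeps up.

```python
import time


def data_wait_fraction(batches, step_fn):
    """Return the fraction of wall-clock time spent waiting for batches.

    `batches` is any iterable of batches; `step_fn` is a hypothetical
    callable that runs one training step on a batch. On a GPU, `step_fn`
    should synchronize (e.g. call torch.cuda.synchronize()) before it
    returns; otherwise asynchronous kernel time is counted as data wait.
    """
    wait = 0.0
    start = time.perf_counter()
    it = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent here is time waiting for data
        except StopIteration:
            break
        wait += time.perf_counter() - t0
        step_fn(batch)
    total = time.perf_counter() - start
    return wait / total if total > 0 else 0.0
```

Running this for a few hundred iterations on the HDD and on the SSD should show the difference directly, without relying only on the GPU utilization readout.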

I just moved the data from HDD to SSD, and it indeed boosted GPU utilization by a big margin. The lesson learned is that an alternative data format such as LMDB would be necessary for HDD disk I/O. In my case, using the SSD shortens an epoch by ~2/3.

Thank you for your prompt response!
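
For readers stuck on an HDD, the idea behind the LMDB suggestion above is to replace many small frame files with one large store that is opened once. The sketch below is not LMDB and not part of the DPC code; it is a standard-library-only illustration of that idea, with hypothetical helpers `pack_frames` and `read_frame`. It assumes the frames are already encoded image files (e.g. JPEG) and keeps the index in memory. Whether it helps on a given disk is something to measure, not assume.

```python
def pack_frames(frame_paths, pack_path):
    """Concatenate frame files into one pack file.

    Returns an index mapping each original path to (offset, length)
    inside the pack. Hypothetical helper, for illustration only.
    """
    index = {}
    with open(pack_path, "wb") as out:
        for path in frame_paths:
            with open(path, "rb") as f:
                data = f.read()
            index[path] = (out.tell(), len(data))  # where this frame starts
            out.write(data)
    return index


def read_frame(pack_file, index, key):
    """Read one frame's raw bytes from an already opened pack file."""
    offset, length = index[key]
    pack_file.seek(offset)
    return pack_file.read(length)
```

In a dataset class, the pack file would be opened once per worker (in binary read mode) and `read_frame` called for each requested frame, with the returned bytes decoded by whatever image library the loader already uses.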