lmb-freiburg / flownet2

FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks

Home Page: https://lmb.informatik.uni-freiburg.de/Publications/2017/IMKDB17/

Can't train network on multiple GPUs

adhara123007 opened this issue

Hi,

I am trying to train the network on multiple GPUs, but I get the following error:
F0321 13:50:58.896466 271 parallel.cpp:55] Check failed: total_size == (ptr == buffer ? 1 : ptr - buffer) (118335438 vs. 117426126)
*** Check failure stack trace: ***
@ 0x7fb6d90315cd google::LogMessage::Fail()
@ 0x7fb6d9033433 google::LogMessage::SendToLog()
@ 0x7fb6d903115b google::LogMessage::Flush()
@ 0x7fb6d9033e1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7fb6d96cf8fd caffe::GPUParams<>::configure()
@ 0x7fb6d96cfd4b caffe::P2PSync<>::P2PSync()
@ 0x7fb6d96d1672 caffe::P2PSync<>::Prepare()
@ 0x7fb6d96d1cde caffe::P2PSync<>::Run()
@ 0x40a80f train()
@ 0x4075b8 main
@ 0x7fb6d7ae5830 __libc_start_main
@ 0x407d29 _start
@ (nil) (unknown)

If I use only one GPU (either of the two GPUs in the system), everything seems to run okay.
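
For anyone reading the log: the failed check compares two counts, and the sketch below works out how far apart they are. It is a simplified model built only from the expression printed in the log line, not Caffe's actual code, and the function name `check_param_buffer` is hypothetical. It does not say why the counts differ.

```python
def check_param_buffer(total_size, walked):
    """Mimic the comparison from the failed check in the log.

    total_size: the count the check expected (left-hand value in the log).
    walked:     the count actually covered, i.e. `ptr - buffer` in the log.
    Returns (passed, difference).
    """
    # The logged expression treats an empty walk (ptr == buffer) as size 1.
    expected = 1 if walked == 0 else walked
    return total_size == expected, total_size - expected


# Values copied from the error message: (118335438 vs. 117426126)
passed, diff = check_param_buffer(118335438, 117426126)
print(passed, diff)  # False 909312
```

So, under that reading, the expected size is 909312 larger than what was actually covered when the multi-GPU setup was configured.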

Sorry, we never used Caffe on multiple GPUs and I have no experience with that.

Thanks