bharatsingh430 / py-R-FCN-multiGPU

Code for training py-faster-rcnn and py-R-FCN on multiple GPUs in caffe

Problem of multi-GPU "'NCCL' has no attribute 'new_uid'"

Xiangyu-CAS opened this issue

Hi, I cloned your code and installed NCCL. However, when I run train_multi_gpu.py, I get the following error:

File "/home/bo718.wang/xiangyu.zhu/py-R-FCN-multiGPU-soft-nms/tools/../lib/fast_rcnn/train_multi_gpu.py", line 205, in train_net_multi_gpu
uid = caffe.NCCL.New_Uid()
AttributeError: type object 'NCCL' has no attribute 'New_Uid'

To make sure I had installed NCCL successfully, I typed the following commands in the Python interpreter:

>>> caffe.NCCL
<class 'caffe._caffe.NCCL'>
>>> caffe.NCCL.new_uid
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: type object 'NCCL' has no attribute 'new_uid'

So caffe.NCCL itself is available, but new_uid cannot be found in it.
Could you help me figure this out? Thanks very much.
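
The interpreter check above can be wrapped in a small helper that reports which expected attributes are missing and where pycaffe was loaded from. This is only a sketch: the helper names `missing_attributes` and `public_attributes` are hypothetical, and `new_uid` is the attribute name taken from the error message. Printing `caffe.__file__` may help spot the case where a different Caffe build is being imported than the one compiled from this repo, which is a guess and not something confirmed in this thread.

```python
def missing_attributes(obj, required):
    """Return the names in `required` that `obj` does not expose."""
    return [name for name in required if not hasattr(obj, name)]


def public_attributes(obj):
    """List the non-underscore attribute names of `obj`, sorted."""
    return sorted(name for name in dir(obj) if not name.startswith("_"))


if __name__ == "__main__":
    try:
        import caffe  # only importable where pycaffe is on PYTHONPATH
    except ImportError:
        print("pycaffe is not importable in this environment")
    else:
        print("caffe loaded from: {}".format(caffe.__file__))
        print("NCCL attributes: {}".format(public_attributes(caffe.NCCL)))
        print("missing: {}".format(missing_attributes(caffe.NCCL, ["new_uid"])))
```

An empty "missing" list would mean the call in train_multi_gpu.py should resolve; a non-empty one reproduces the AttributeError without starting a training run.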

Change the number of GPUs in the file rfcn_end2end_ohem_multi_gpu.sh; the script in the source is written for 8 GPUs.
time ./tools/train_net_multi_gpu.py --gpu ${GPU_ID} \
--solver models/${PT_DIR}/${NET}/rfcn_end2end/solver_ohem.prototxt \
--weights data/imagenet_models/${NET}-model.caffemodel \
--imdb ${TRAIN_IMDB} \
--iters ${ITERS} \
--cfg experiments/cfgs/rfcn_end2end_ohem_${PT_DIR}.yml \
${EXTRA_ARGS}
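
As a rough illustration of what the `--gpu ${GPU_ID}` value has to describe, the sketch below turns a comma-separated device list into integers. It assumes, without confirmation from this thread, that the script accepts a string such as "0,1,2,3"; `parse_gpu_ids` is a hypothetical helper, not a function from the repo.

```python
def parse_gpu_ids(spec):
    """Turn a string such as "0,1,2,3" into [0, 1, 2, 3].

    Assumption: device ids are given as a comma-separated list.
    """
    ids = [int(part) for part in spec.split(",") if part.strip()]
    if not ids:
        raise ValueError("no GPU ids given")
    if len(set(ids)) != len(ids):
        raise ValueError("duplicate GPU id in {!r}".format(spec))
    return ids


if __name__ == "__main__":
    # On a machine with two GPUs the list would have two entries, not eight.
    print(parse_gpu_ids("0,1"))
```

The length of the returned list should match the number of GPUs actually present on the machine.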

Hi Xiangyu, have you solved the problem? I have the same problem as you.

@Xiangyu-CAS Have you solved the problem? I have the same problem as you.

Has anyone installed the Caffe in this repo successfully?

You have to export the path to the NCCL build. I made a local install, so I exported the path under my home directory:
export LD_LIBRARY_PATH=/home/[username]/nccl/build/lib:$LD_LIBRARY_PATH
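
To confirm that the export took effect, one option is to list which directories on `LD_LIBRARY_PATH` actually contain an NCCL library. The sketch below is a minimal check under two assumptions: the shared library file name starts with `libnccl`, and `dirs_with_library` is a hypothetical helper written for this example.

```python
import os


def dirs_with_library(path_value, lib_prefix="libnccl"):
    """Return the directories in a search-path string that contain
    at least one file whose name starts with `lib_prefix`."""
    hits = []
    for directory in path_value.split(os.pathsep):
        # Skip empty entries and paths that do not exist.
        if not directory or not os.path.isdir(directory):
            continue
        if any(name.startswith(lib_prefix) for name in os.listdir(directory)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    found = dirs_with_library(os.environ.get("LD_LIBRARY_PATH", ""))
    print("directories providing libnccl: {}".format(found))
```

The variable has to be exported in the shell before Python starts, because the dynamic loader reads it when the process is launched; changing `os.environ` inside an already running interpreter does not affect how its libraries are resolved.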