WXinlong / ASIS

Associatively Segmenting Instances and Semantics in Point Clouds, CVPR 2019

about test.py

lapetite123 opened this issue

When I run test.py, it raises an error. I don't know exactly what the problem is. Can you please help me?
Model restored.
0 / 68 ...
Loading train file /home/ASIS/data/stanford_indoor3d_ins.sem/Area_5_conferenceRoom_1.npy
Processsing: Shape [0] Block[0]
2019-09-14 18:39:28.811094: E tensorflow/stream_executor/cuda/cuda_blas.cc:636] failed to run cuBLAS routine cublasSgemm_v2: CUBLAS_STATUS_EXECUTION_FAILED
Traceback (most recent call last):
File "test.py", line 262, in <module>
test()
File "test.py", line 183, in test
feed_dict=feed_dict)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1128, in _run
feed_dict_tensor, options, run_metadata)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1344, in _do_run
options, run_metadata)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1363, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Blas SGEMM launch failed : m=32768, n=32, k=9
[[Node: layer1/conv0/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](layer1/concat, layer1/conv0/weights/read/_293)]]
[[Node: ins_fa_layer3/Squeeze/_585 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2283_ins_fa_layer3/Squeeze", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op u'layer1/conv0/Conv2D', defined at:
File "test.py", line 262, in <module>
test()
File "test.py", line 79, in test
pred_sem, pred_ins = get_model(pointclouds_pl, is_training_pl, NUM_CLASSES)
File "/home/ASIS/models/ASIS/model.py", line 29, in get_model
l1_xyz, l1_points, l1_indices = pointnet_sa_module(l0_xyz, l0_points, npoint=1024, radius=0.1, nsample=32, mlp=[32,32,64], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer1')
File "/home/ASIS/utils/pointnet_util.py", line 187, in pointnet_sa_module
data_format=data_format)
File "/home/ASIS/utils/tf_util.py", line 165, in conv2d
data_format=data_format)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 639, in conv2d
data_format=data_format, dilations=dilations, name=name)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
op_def=op_def)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InternalError (see above for traceback): Blas SGEMM launch failed : m=32768, n=32, k=9
[[Node: layer1/conv0/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](layer1/concat, layer1/conv0/weights/read/_293)]]
[[Node: ins_fa_layer3/Squeeze/_585 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2283_ins_fa_layer3/Squeeze", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Another question: I can run train.py and estimate_mean_ins_size.py successfully, but I was wondering why train.py takes only a few minutes to finish training.
Looking forward to your kind response. Thanks very much!

May I ask how your problem was resolved? I have run into the same situation and look forward to your reply.
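
For anyone reading the traceback above, the numbers in "Blas SGEMM launch failed : m=32768, n=32, k=9" can be related to the layer1 arguments shown in the pointnet_sa_module call (npoint=1024, nsample=32, mlp=[32,32,64]). The sketch below is only arithmetic on values from the log. It assumes the failing Conv2D uses a 1x1 kernel, so that it reduces to a single matrix product; that kernel size is not stated in the thread, and the helper names are hypothetical. It does not diagnose why the cuBLAS call failed.

```python
# Relate the failing SGEMM shape (m=32768, n=32, k=9) to the layer1 arguments
# printed in the traceback. The 1x1 kernel is an ASSUMPTION, not stated in the thread.

NPOINT = 1024      # npoint=1024 in the pointnet_sa_module call
NSAMPLE = 32       # nsample=32 in the same call
MLP_FIRST = 32     # first entry of mlp=[32,32,64]


def conv1x1_gemm_shape(batch_size, npoint, nsample, in_channels, out_channels):
    """Return (m, n, k) for a 1x1 NHWC convolution viewed as a matrix product.

    The input [batch, npoint, nsample, in_channels] flattens to an (m x k)
    matrix, the kernel is (k x n), and the output is (m x n).
    """
    m = batch_size * npoint * nsample
    return m, out_channels, in_channels


def infer_batch_size(m, npoint, nsample):
    """Solve m = batch * npoint * nsample for batch; fail if it does not divide."""
    per_sample = npoint * nsample
    if m % per_sample != 0:
        raise ValueError("m is not a multiple of npoint * nsample")
    return m // per_sample


# m and k are copied from the error message; n should come out as MLP_FIRST.
batch = infer_batch_size(32768, NPOINT, NSAMPLE)
shape = conv1x1_gemm_shape(batch, NPOINT, NSAMPLE, in_channels=9,
                           out_channels=MLP_FIRST)
input_bytes = shape[0] * shape[2] * 4  # DT_FLOAT is 4 bytes per value

print(batch, shape, input_bytes)
```

Under that assumption the reported shape corresponds to one block per forward pass with 9 input channels, and the flattened input to this product is a little over 1 MB of float32 data.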