rusty1s / pytorch_cluster

PyTorch Extension Library of Optimized Graph Cluster Algorithms


gpu execution error

twangnh opened this issue · comments

commented

I used the nearest function (i.e. from torch_cluster import nearest). When the input is on the CPU the result is correct, but when the input is on the GPU the result is completely wrong, with very large returned indices. Could you please help? Thanks in advance!

Do you have a minimal example to reproduce?

commented
import torch
from torch_cluster import nearest

torch.manual_seed(12345)

A = torch.randn(100, 3).cuda()
B = torch.randn(100, 3).cuda()
inds = nearest(B, A)

inds will contain very large, out-of-range indices.
I'm also interested in how much faster CUDA should be than the CPU.
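
One way to confirm the symptom before digging further is to check that every returned index points at a valid row of the second argument. The helper below is a hypothetical sketch (out_of_range is not part of torch_cluster) and works on plain Python integers, so it runs without a GPU:

```python
from typing import Iterable, List, Tuple


def out_of_range(indices: Iterable[int], num_candidates: int) -> List[Tuple[int, int]]:
    """Return (position, value) pairs for indices outside [0, num_candidates)."""
    return [
        (pos, idx)
        for pos, idx in enumerate(indices)
        if not 0 <= idx < num_candidates
    ]


# Illustrative values only, not output from the report above.
print(out_of_range([0, 99, 100, -1], 100))  # [(2, 100), (3, -1)]
```

With the tensors from the snippet above, this could be called as out_of_range(inds.cpu().tolist(), A.size(0)); an empty list would mean every index is valid.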

That's interesting. The following works for me:

import torch
from torch_cluster import nearest

torch.manual_seed(12345)

A = torch.randn(100, 3)
B = torch.randn(100, 3)
inds1 = nearest(B, A)
inds2 = nearest(B.cuda(), A.cuda())
assert torch.equal(inds1, inds2.cpu())

May I ask how you installed torch-cluster on your system?

commented

Strangely, the code works on the machine where I installed torch_cluster, but not on other machines. We have a small cluster of machines that share the same home directory, where I installed the Anaconda environment, so Python packages installed from one machine are available on all of them. All the other Python packages I installed before work this way.

I installed it with pip install torch-cluster -f https://data.pyg.org/whl/torch-1.8.1+cu101.html

commented

I found that it also works on other machines with the same GPU (TITAN RTX), but not on older ones such as the GeForce GTX TITAN X. Could it be due to GPU architecture support settings during installation?
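
The architecture question can be reasoned about with a small sketch. It assumes a simplified rule: a binary compiled for sm_XY runs on a device whose compute capability has the same major version X and a minor version of at least Y (PTX forward compatibility is ignored here). The function names and the built_for list are hypothetical; which targets a given torch-cluster build actually contains is not stated in this thread. The capabilities used are 7.5 for the TITAN RTX and 5.2 for the GeForce GTX TITAN X.

```python
from typing import Iterable, Tuple


def parse_sm(target: str) -> Tuple[int, int]:
    """Turn a target such as 'sm_75' into (7, 5)."""
    digits = target.split("_", 1)[1]
    return int(digits[:-1]), int(digits[-1])


def binary_covers(capability: Tuple[int, int], targets: Iterable[str]) -> bool:
    """True if any target shares the device's major version with minor <= device minor."""
    major, minor = capability
    for target in targets:
        t_major, t_minor = parse_sm(target)
        if t_major == major and t_minor <= minor:
            return True
    return False


# Hypothetical build targets, chosen for illustration only.
built_for = ["sm_60", "sm_61", "sm_70", "sm_75"]

print(binary_covers((7, 5), built_for))  # TITAN RTX: True
print(binary_covers((5, 2), built_for))  # GeForce GTX TITAN X: False
```

On a real machine, torch.cuda.get_device_capability() returns the (major, minor) tuple for the current device, which could be passed in as capability.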

commented

To try to support more GPU models, I ran git clone https://github.com/rusty1s/pytorch_cluster and added the following to setup.py:

        if suffix == 'cuda':
            define_macros += [('WITH_CUDA', None)]
            nvcc_flags = os.getenv('NVCC_FLAGS', '')
            nvcc_flags = [] if nvcc_flags == '' else nvcc_flags.split(' ')
            nvcc_flags += ['--expt-relaxed-constexpr', '-O2']
            nvcc_flags += ["-arch=sm_60",
                "-gencode=arch=compute_60,code=sm_60",
                "-gencode=arch=compute_61,code=sm_61",
                "-gencode=arch=compute_70,code=sm_70",
                "-gencode=arch=compute_75,code=sm_75",]
            extra_compile_args['nvcc'] = nvcc_flags

Then I installed from source with:
pip install -e pytorch_cluster
It installed successfully; however, importing torch_cluster raises:

>>> from torch_cluster import nearest
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/wangtao/prj/pytorch_cluster/torch_cluster/__init__.py", line 45, in <module>
    from .rw import random_walk  # noqa
  File "/home/wangtao/prj/pytorch_cluster/torch_cluster/rw.py", line 8, in <module>
    def random_walk(
  File "/home/wangtao/anaconda3_2/envs/deform_seg_env/lib/python3.8/site-packages/torch/jit/_script.py", line 989, in script
    fn = torch._C._jit_script_compile(
RuntimeError:
General Union types are not currently supported. Only Union[T, NoneType] (i.e. Optional[T]) is supported.:
  File "/home/wangtao/prj/pytorch_cluster/torch_cluster/rw.py", line 18
    num_nodes: Optional[int] = None,
    return_edge_indices: bool = False,
) -> Union[Tensor, Tuple[Tensor, Tensor]]:
     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    """Samples random walks of length :obj:`walk_length` from all node indices
    in :obj:`start` in the graph given by :obj:`(row, col)` as described in the

Can you try to install via pip install --no-index torch-cluster -f https://data.pyg.org/whl/torch-1.8.1+cu101.html (note the --no-index needed for older PyTorch versions)? I guess you are currently building from source due to the old PyTorch version. I am confident pre-built wheels should support a variety of architectures.
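
The install command above can be assembled from the PyTorch version and CUDA tag. This is a minimal sketch: wheel_install_command is a hypothetical helper, and it assumes the wheel index follows the same URL pattern as the link quoted in this thread, which may not hold for every PyTorch release.

```python
def wheel_install_command(
    torch_version: str,
    cuda_tag: str,
    package: str = "torch-cluster",
) -> str:
    """Build the pip command for a pre-built wheel (URL pattern is an assumption)."""
    index_url = f"https://data.pyg.org/whl/torch-{torch_version}+{cuda_tag}.html"
    # --no-index stops pip from falling back to PyPI and building from source.
    return f"pip install --no-index {package} -f {index_url}"


print(wheel_install_command("1.8.1", "cu101"))
```

On pip-installed CUDA builds, torch.__version__ usually reads like 1.8.1+cu101, which gives both values.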

commented

Sorry for the delayed response. It works with pip install --no-index torch-cluster -f https://data.pyg.org/whl/torch-1.8.1+cu101.html. Thanks for your help!

This issue had no activity for 6 months. It will be closed in 2 weeks unless there is some new activity. Is this issue already resolved?