SHI-Labs / Neighborhood-Attention-Transformer

Neighborhood Attention Transformer, arXiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arXiv 2022.


Build failure with PyTorch 1.11.0 / CUDA 11.3 / Win10 / VS2019

helonin opened this issue · comments

Thanks for your great work! Sadly, the build failed for me >_<
Ninja could not generate the file `nattenav_cuda.obj`. Please help.
Here is the error information:
[screenshots 1 and 2: build error output]
Thank you for your interest.
Could you run these and share their outputs?

python3 -c "import torch; print(torch.__version__); print(torch.cuda.is_available()); print(torch._C._cuda_getCompiledVersion(), torch.version.cuda)"
nvcc --version

It's basically failing to even start compiling, so it's likely either a torch or CUDA issue.
It's unlikely, but it could be ninja as well. Could you remove ninja and see if it builds?
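One quick way to check for the torch/CUDA mismatch suspected above is to compare `torch.version.cuda` against the release line that `nvcc --version` prints. A hypothetical helper (not part of natten, just a sketch of the check) might look like this:

```python
import re

def cuda_versions_match(torch_cuda: str, nvcc_output: str) -> bool:
    """Return True when torch's compiled CUDA major.minor matches the
    version nvcc reports. A mismatch between the two is a common cause
    of extension builds failing before compilation even starts.
    """
    m = re.search(r"release (\d+\.\d+)", nvcc_output)
    if m is None:
        return False
    return torch_cuda.split(".")[:2] == m.group(1).split(".")[:2]

# Example: torch.version.cuda == "11.3" vs a typical `nvcc --version` line
sample_nvcc = "Cuda compilation tools, release 11.3, V11.3.109"
print(cuda_versions_match("11.3", sample_nvcc))  # True when versions agree
```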

Thanks for your help! Here is the output:
[screenshot: Snipaste_2022-08-09_09-19-11]
And how can I remove ninja? I am a beginner at Python >_<

Something seems wrong with the paths here. The first post mentions D:\natten\nattencuda.py, and the build is failing to find files. I suspect something is going on with that, but I'm not very familiar with Windows path environments. Is this intended?

I don't think it is ninja. I'm pretty certain this is either a CUDA issue or an environment issue (probably some intersection of the two), given the RuntimeError in the first post. The big issue here is that I don't know where Windows caches builds. According to StyleGAN3's troubleshooting guide, it should be located at :\Users\<username>\AppData\Local\torch_extensions\torch_extensions\Cache, so you should clear any reference to natten there (it should be safe to clear everything).
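If you prefer to do this programmatically, here is a minimal sketch. It assumes the `%LOCALAPPDATA%` path expands to the cache location mentioned above; adjust it if your setup differs. Deleting the cache is safe because PyTorch rebuilds extensions on demand at the next import.

```python
import os
import shutil

def clear_extension_cache(cache_dir: str) -> bool:
    """Delete a torch extension build cache directory if it exists.

    Returns True if something was removed. Safe to run: PyTorch
    rebuilds cached extensions the next time they are imported.
    """
    if os.path.isdir(cache_dir):
        shutil.rmtree(cache_dir)
        return True
    return False

# Assumed Windows cache path (per StyleGAN3's troubleshooting guide).
win_cache = os.path.expandvars(
    r"%LOCALAPPDATA%\torch_extensions\torch_extensions\Cache"
)
print("removed:", clear_extension_cache(win_cache))
```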

@helonin can you edit gradcheck.py, place the import torch line above the natten import (@alihassanijr we should change this too, btw; our imports should come last), and print your Python info directly below it? Like this:

import torch
print(f"torch {torch.__version__} and cuda {torch.version.cuda}")
from nattencuda import NATTENAVFunction, NATTENQKRPBFunction

This should verify that the file sees the correct torch and CUDA versions (I suspect it doesn't). Let's see the output of that.

But if you want to uninstall ninja, you can just do so through pip.

I did everything following your guide, but another error occurred.
[screenshot: Snipaste_2022-08-09_09-19-11]

Could you remove the cache directory that @stevenwalton mentioned (alternatively, you could set your TORCH_EXTENSIONS_DIR env variable to somewhere else), remove ninja (pip uninstall ninja), and try again?

Is it possible that this is a Windows issue? I'm seeing that Python 3.8 only loads DLLs from trusted locations. @helonin, what version of Python are you using? Does this Stack Overflow link help?
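For context, the workaround that Stack Overflow thread points to is registering the CUDA bin directory explicitly: from Python 3.8 on, Windows no longer searches PATH for a module's dependent DLLs. A sketch of that workaround, assuming a default CUDA 11.3 install path (adjust to your machine):

```python
import os
import sys

# Assumed default install location for CUDA 11.3 on Windows; change
# this to wherever your toolkit actually lives.
cuda_bin = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\bin"

# os.add_dll_directory exists on Windows for Python 3.8+; it must be
# called before importing the compiled extension.
if sys.platform == "win32" and sys.version_info >= (3, 8) and os.path.isdir(cuda_bin):
    os.add_dll_directory(cuda_bin)
```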

My Python version is 3.7.10. I set the TORCH_EXTENSIONS_DIR env variable, but the same problem occurred. >_<
I have given up trying on Windows and will try to install NAT on Ubuntu soon. Thank you all the same!

[screenshot: Snipaste_2022-08-09_09-19-11]

Still looks like an environment variable issue. I think you should track down where TORCH_EXTENSIONS_DIR points, as well as where you're allowed to read files from (per the Stack Overflow link).

For Ubuntu, note that TORCH_EXTENSIONS_DIR defaults to ~/.cache/torch_extensions. The path won't exist until you build something.
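On Ubuntu, clearing a stale build comes down to removing that directory. A sketch (it falls back to the default path noted above when TORCH_EXTENSIONS_DIR is unset; deleting the cache is safe since it is rebuilt on the next import):

```shell
# Resolve the extensions cache, preferring the env var if it is set.
CACHE="${TORCH_EXTENSIONS_DIR:-$HOME/.cache/torch_extensions}"
rm -rf "$CACHE"
echo "cleared $CACHE"
```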

The build finished successfully on Ubuntu!
Thank you all the same!

I'll close this issue for now, but feel free to reopen it. We do need to test more on Windows.