Can CUDA 12.1.1 really be used for compilation?
leizhao1234 opened this issue · comments
I used CUDA 12.1.1 to build TE from source (stable, main, and v1.3 branches). All of them install successfully, but the flash-attention that TE installs doesn't work at all:
```
import flash_attn_2_cuda as flash_attn_cuda
ImportError: /share/home/zl/miniconda/envs/mega/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
```
The undefined symbol actually comes from PyTorch; the CUDA version seems unrelated.
@ptrblck Do you have some recommendations on debugging this symbol issue?
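One way to narrow this down is to demangle the symbol. A quick sketch, assuming `c++filt` from binutils is available on the system:

```shell
# Demangle the missing symbol to see what it refers to:
c++filt _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
# -> at::_ops::zeros::call(...), i.e. a PyTorch operator. A prebuilt
#    extension missing this symbol was typically compiled against a
#    different PyTorch version/ABI than the one loaded at runtime.
```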
It shouldn't be a PyTorch problem. If I compile FlashAttention directly from source there's no issue; it's only when FlashAttention is installed through TE that I get undefined symbols.
This is related to the flash-attn version. TE currently forces flash-attn<=2.4.2, and there seems to be some issue with v2.4.2. Installing from source or installing v2.5.5 can help.
See #689
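A sketch of that workaround (the version number comes from this thread; `--no-build-isolation` is the install flag flash-attn's own docs recommend so the build sees your installed torch; adjust for your environment):

```shell
# Replace the flash-attn build pulled in by TE with v2.5.5
pip uninstall -y flash-attn
pip install flash-attn==2.5.5 --no-build-isolation

# Verify the extension now imports without the undefined-symbol error
python -c "import flash_attn_2_cuda"
```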