How to build from source if I have cards with different architectures?
laoda513 opened this issue · comments
❓ Questions and Help
I originally had four 2080s and built xformers for them, and now I've added four 3090s. In practice I might keep them separate, with the 2080s handling tasks suited to them and the 3090s handling tasks suited to them.
However, when I run on the 3090s, I get an error:
FATAL: kernel fmha_cutlassF_f16_aligned_64x128_rf_sm80 is for sm80-sm100, but was built for sm75.
So, should I just rebuild xformers directly? Would that break the ability of the 2080s to run?
By the way, I noticed that this issue only occurs when I use the format q.shape=(B, M, num_key_value_groups, num_key_value_heads, K) and call memory_efficient_attention. If I use q.shape=(B, M, num_key_value_groups * num_key_value_heads, K), no error occurs.
Found the solution in the docs: set TORCH_CUDA_ARCH_LIST="7.5 8.6" before building. Will try that first, sorry for the noise.
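For reference, a minimal sketch of the rebuild, assuming a local xformers source checkout (the `pip install` line is the usual from-source install and is left commented here). `TORCH_CUDA_ARCH_LIST` tells the build which SM architectures to compile kernels for, so listing both 7.5 (2080, sm75) and 8.6 (3090, sm86) produces a binary that covers both GPU generations:

```shell
# Compile kernels for both generations: sm75 (2080) and sm86 (3090).
export TORCH_CUDA_ARCH_LIST="7.5 8.6"

# Then rebuild from inside the xformers checkout, e.g.:
# pip install -v --no-build-isolation -e .

echo "$TORCH_CUDA_ARCH_LIST"
```

With both architectures in the list, the same install should keep working on the 2080s instead of being limited to one card type.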