How to build from source if I have cards with different architectures?
laoda513 opened this issue · comments
❓ Questions and Help
I originally had four 2080s and built xformers for them, and now I've added four 3090s. In practice I might keep them separate, with the 2080s handling tasks suited to them and the 3090s handling tasks suited to them.
However, when I run on the 3090s, I get an error:
FATAL: kernel fmha_cutlassF_f16_aligned_64x128_rf_sm80 is for sm80-sm100, but was built for sm75.
So, should I just rebuild xformers directly? Would that break the ability of the 2080s to run?
By the way, I noticed that this issue only occurs when I use the format q.shape=(B, M, num_key_value_groups, num_key_value_heads, K) and call memory_efficient_attention. If I use q.shape=(B, M, num_key_value_groups * num_key_value_heads, K), no error occurs.
Found the solution in the docs: set TORCH_CUDA_ARCH_LIST="7.5 8.6" before building. Will try that first, sorry for the noise.
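For reference, a minimal sketch of the rebuild, assuming a local xformers source checkout (the `pip install` line is the usual from-source install and is left commented here). `TORCH_CUDA_ARCH_LIST` tells the build which SM architectures to compile kernels for, so listing both 7.5 (2080, sm75) and 8.6 (3090, sm86) produces a binary that covers both GPU generations:

```shell
# Compile kernels for both generations: sm75 (2080) and sm86 (3090).
export TORCH_CUDA_ARCH_LIST="7.5 8.6"

# Then rebuild from inside the xformers checkout, e.g.:
# pip install -v --no-build-isolation -e .

echo "$TORCH_CUDA_ARCH_LIST"
```

With both architectures in the list, the same install should keep working on the 2080s instead of being limited to one card type.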