NVIDIA / TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.

Home Page: https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/index.html


main branch cannot compile due to incompatibility with the main branch of cudnn-frontend

lucifer1004 opened this issue

It seems that there are some breaking API changes in the main branch of cudnn-frontend. These cause the compilation of TE's main branch to fail.

Some of the error messages:

```
[8/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/fused_attn/utils.cu.o
FAILED: common/CMakeFiles/transformer_engine.dir/fused_attn/utils.cu.o
/usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -Dtransformer_engine_EXPORTS -I/home/xxx/com.github/NVIDIA/TransformerEngine/transformer_engine -I/home/xxx/com.github/NVIDIA/TransformerEngine/transformer_engine/common/include -I/usr/local/cuda/targets/x86_64-linux/include -I/home/xxx/com.github/NVIDIA/TransformerEngine/transformer_engine/common/../../3rdparty/cudnn-frontend/include -I/tmp/tmp8ddwux9l/common/string_headers -isystem=/usr/local/cuda/include --threads 4 --expt-relaxed-constexpr -O3 -O3 -DNDEBUG --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_80,code=[compute_80,sm_80] --generate-code=arch=compute_89,code=[compute_89,sm_89] --generate-code=arch=compute_90,code=[compute_90,sm_90] -Xcompiler=-fPIC -std=c++17 -MD -MT common/CMakeFiles/transformer_engine.dir/fused_attn/utils.cu.o -MF common/CMakeFiles/transformer_engine.dir/fused_attn/utils.cu.o.d -x cu -c /home/xxx/com.github/NVIDIA/TransformerEngine/transformer_engine/common/fused_attn/utils.cu -o common/CMakeFiles/transformer_engine.dir/fused_attn/utils.cu.o
/home/xxx/com.github/NVIDIA/TransformerEngine/transformer_engine/common/fused_attn/utils.h(114): error: namespace "cudnn_frontend" has no member "DataType_t"
    cudnn_frontend::DataType_t tensor_type;
                    ^

/home/xxx/com.github/NVIDIA/TransformerEngine/transformer_engine/common/fused_attn/utils.h(142): error: namespace "cudnn_frontend" has no member "DataType_t"
  cudnn_frontend::DataType_t get_cudnn_fe_dtype(const transformer_engine::DType t);
```

We currently pin the cuDNN front-end to the 1.0.3 release. I don't expect to see much benefit from updating to the bleeding edge since it is mostly just a wrapper around the main cuDNN library and doesn't affect functionality or performance.
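For anyone who ends up with a newer front-end checked out under `3rdparty/cudnn-frontend` (the submodule path visible in the compile command above), here is a minimal, untested sketch of syncing it back to the pinned commit:

```bash
# Untested sketch: restore 3rdparty/cudnn-frontend to the commit pinned by TE.
cd TransformerEngine

# Show the commit TE expects vs. what is currently checked out
git submodule status 3rdparty/cudnn-frontend

# Re-check out the pinned commit, discarding any manually switched branch
git submodule update --init --recursive --force 3rdparty/cudnn-frontend

# Should report a 1.0.x tag after syncing
git -C 3rdparty/cudnn-frontend describe --tags
```

A clean rebuild of TE after syncing should then compile against the pinned front-end.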

I do reproduce build errors starting with the 1.1.0 release, but the error messages are related to cudnn_frontend::throw_if instead of cudnn_frontend::DataType_t. If you need the latest cuDNN front-end, can you try building with #696?
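For reference, a rough way to try that PR locally (the `pr-696` branch name below is just illustrative, and the final build step is whatever flow you normally use):

```bash
# Sketch: check out PR #696 into a local branch and rebuild against it
cd TransformerEngine
git fetch origin pull/696/head:pr-696
git checkout pr-696
git submodule update --init --recursive
pip install -v .   # or your usual build command
```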

Hi @timmoon10, I do not need the latest cudnn-frontend; I just tried to build the main branch and it failed.