triton-lang / triton

Development repository for the Triton language and compiler

Home Page: https://triton-lang.org/


fp8 tensor core support on h100

yuukidach opened this issue

Triton version: 2.3.0

It was mentioned in issue #3156 that the FP8 tensor core can already be used on H100. But when I ran the same example and used ncu to check the mma instructions, I found the mma layout was 16816, which does not support fp8 operands. A minimal fp8 matmul sketch for checking the same thing is given after the list below.

  • ncu command: ncu --launch-count 1 --print-source=sass --page source --kernel-name regex:"_attn_fwd_0d1d2d34d5d6de7de8d*" "python3" triton_fused_attention.py
  • output screenshot: (image attachment showing the 16816 mma instructions in the SASS; not reproduced here)
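
For anyone who wants a smaller test case than the fused-attention tutorial, below is a minimal sketch of an fp8 x fp8 matmul in Triton. It is not the script from this issue; the kernel name, block sizes, problem sizes, and the use of torch.float8_e4m3fn with a plain tl.dot are assumptions for illustration. Whether tl.dot on fp8 operands actually lowers to fp8 tensor-core (wgmma) instructions on H100, rather than a 16816-style fp16 fallback, depends on the Triton build, which is exactly what this issue is about.

```python
# Minimal sketch (not the script from this issue) of an fp8 x fp8 matmul in Triton.
# On an H100 with a Triton build that supports fp8 tensor cores, tl.dot on
# float8_e4m3 operands should lower to fp8 MMA (wgmma); on builds without
# support it may fail or fall back to the 16816 fp16-style mma seen in ncu above.
import torch
import triton
import triton.language as tl


@triton.jit
def fp8_matmul_kernel(a_ptr, b_ptr, c_ptr, M, N, K,
                      stride_am, stride_ak,
                      stride_bk, stride_bn,
                      stride_cm, stride_cn,
                      BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr,
                      BLOCK_K: tl.constexpr):
    pid_m = tl.program_id(0)
    pid_n = tl.program_id(1)
    offs_m = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)
    offs_n = pid_n * BLOCK_N + tl.arange(0, BLOCK_N)
    offs_k = tl.arange(0, BLOCK_K)
    a_ptrs = a_ptr + offs_m[:, None] * stride_am + offs_k[None, :] * stride_ak
    b_ptrs = b_ptr + offs_k[:, None] * stride_bk + offs_n[None, :] * stride_bn
    acc = tl.zeros((BLOCK_M, BLOCK_N), dtype=tl.float32)
    # No masking: the sizes below are multiples of the block sizes.
    for _ in range(0, K, BLOCK_K):
        a = tl.load(a_ptrs)      # fp8e4m3 tile of A
        b = tl.load(b_ptrs)      # fp8e4m3 tile of B
        acc = tl.dot(a, b, acc)  # should map to an fp8 MMA on H100 if supported
        a_ptrs += BLOCK_K * stride_ak
        b_ptrs += BLOCK_K * stride_bk
    c_ptrs = c_ptr + offs_m[:, None] * stride_cm + offs_n[None, :] * stride_cn
    tl.store(c_ptrs, acc)


M = N = K = 1024
a = torch.randn((M, K), device="cuda", dtype=torch.float16).to(torch.float8_e4m3fn)
b = torch.randn((K, N), device="cuda", dtype=torch.float16).to(torch.float8_e4m3fn)
c = torch.empty((M, N), device="cuda", dtype=torch.float32)
grid = (triton.cdiv(M, 128), triton.cdiv(N, 128))
handle = fp8_matmul_kernel[grid](a, b, c, M, N, K,
                                 a.stride(0), a.stride(1),
                                 b.stride(0), b.stride(1),
                                 c.stride(0), c.stride(1),
                                 BLOCK_M=128, BLOCK_N=128, BLOCK_K=64)
```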

You need to use the nightly wheel or build top-of-tree Triton from source.

It seems both the nightly wheel and v2.2.0 can run the fp8 tensor core correctly.
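
A quick way to double-check which MMA flavor a given wheel emits, without profiling SASS in ncu, is to inspect the PTX of the compiled kernel. This continues from the sketch above (reusing the handle returned by the launch); the .asm dict with a "ptx" key is what recent Triton builds expose, though the exact interface may vary by version.

```python
# Continuing from the sketch above: look at the generated PTX instead of SASS.
# On an H100 build with fp8 tensor-core support you would expect wgmma
# instructions; an m16n8k16 mma suggests the fp16-style fallback reported here.
ptx = handle.asm["ptx"]
print("wgmma present:", "wgmma" in ptx)                             # Hopper tensor-core path
print("m16n8k16 mma present:", "mma.sync.aligned.m16n8k16" in ptx)  # 16816 fallback
```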