iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.

Home Page: http://iree.dev/

Missing f32->bf16 demotion support for the targets of data-tiling ops

hanhanW opened this issue

The pass only matches a few named ops. Target operations can also appear in generic form or as other named ops, so we should generalize the pass.

https://github.com/iree-org/iree/blob/main/compiler/src/iree/compiler/GlobalOptimization/DemoteContractionInputsToBF16.cpp
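To illustrate the gap, here is a toy Python model, not IREE or MLIR code. `ToyOp`, `NAMED_CONTRACTIONS`, and both matcher functions are hypothetical names made up for this sketch, and the set of named ops is illustrative rather than the list the real pass handles. It only shows why a name-based match misses a matmul written in generic form, while a structural check on iterator types and indexing still finds it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyOp:
    """Hypothetical stand-in for a linalg-style op (not a real MLIR class)."""
    name: str              # e.g. "linalg.matmul" or "linalg.generic"
    iterator_types: tuple  # "parallel" or "reduction", one per loop dim
    input_maps: tuple      # for each input, the loop dims it indexes
    output_map: tuple      # the loop dims the output indexes
    body: str = "mul_add"  # toy tag describing the scalar computation


# Illustrative only: not the actual list used by the pass.
NAMED_CONTRACTIONS = {"linalg.matmul", "linalg.batch_matmul"}


def matches_by_name(op: ToyOp) -> bool:
    """Name-based matching: only sees ops that are in the fixed set."""
    return op.name in NAMED_CONTRACTIONS


def matches_by_structure(op: ToyOp) -> bool:
    """Structural matching: two inputs, a multiply-accumulate body, and
    every reduction dim indexed by both inputs but not by the output."""
    if len(op.input_maps) != 2 or op.body != "mul_add":
        return False
    reduction = {d for d, t in enumerate(op.iterator_types) if t == "reduction"}
    if not reduction:
        return False
    lhs, rhs = (set(m) for m in op.input_maps)
    out = set(op.output_map)
    return all(d in lhs and d in rhs and d not in out for d in reduction)


# Loop dims: 0 = m, 1 = n, 2 = k.
named_matmul = ToyOp("linalg.matmul", ("parallel", "parallel", "reduction"),
                     ((0, 2), (2, 1)), (0, 1))
generic_matmul = ToyOp("linalg.generic", ("parallel", "parallel", "reduction"),
                       ((0, 2), (2, 1)), (0, 1))
elementwise_add = ToyOp("linalg.generic", ("parallel", "parallel"),
                        ((0, 1), (0, 1)), (0, 1), body="add")
```

In this toy model, `matches_by_name(generic_matmul)` is false while `matches_by_structure(generic_matmul)` is true, which is the kind of case the issue asks the pass to cover.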

It may do different things (as it also changes the public ABI), but maybe adding an f32->bf16 conversion to https://github.com/iree-org/iree/blob/main/compiler/src/iree/compiler/InputConversion/Common/ConvertPrimitiveType.cpp#L308 could help?
Or, if it works but does more than you want, you could at least take some of the code from it that handles ops more generically.

Oh, this is mostly a quick experimental flag. Sometimes we want to conditionally select some ops (e.g., contraction ops) and demote their input operands from fp32 to bf16. It helps unblock work when bf16 models are not ready: we can start some work and make estimates using fp32 models with the flag.
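For anyone who wants a feel for what demoting contraction inputs does numerically, here is a minimal pure-Python sketch. It is not the pass itself: `demote_to_bf16` and `matmul` are hypothetical helpers, the rounding mode (round to nearest, ties to even) is an assumption of this sketch, inputs are assumed finite, and the accumulation stays in Python floats as a stand-in for a wider accumulator.

```python
import struct


def demote_to_bf16(x: float) -> float:
    """Round a finite f32 value to the nearest bf16 value (ties to even)
    and return it as a Python float. NaN/inf are not handled here."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bf16 keeps the top 16 bits of the f32 encoding. Adding this bias
    # before truncating implements round-to-nearest-even.
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def matmul(a, b, demote_inputs=False):
    """Plain list-of-lists matmul. With demote_inputs=True, both operands
    are rounded to bf16 first; the products are still accumulated in
    Python floats (an assumption of this sketch)."""
    if demote_inputs:
        a = [[demote_to_bf16(v) for v in row] for row in a]
        b = [[demote_to_bf16(v) for v in row] for row in b]
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(inner))
             for j in range(cols)]
            for i in range(rows)]


# 1 + 2**-8 needs 8 mantissa bits, but bf16 has only 7, so it is rounded.
full = matmul([[1.00390625]], [[1.0]])
demoted = matmul([[1.00390625]], [[1.0]], demote_inputs=True)
```

Here `full` is `[[1.00390625]]` and `demoted` is `[[1.0]]`, while inputs that are already representable in bf16 (such as 0.5 or 3.0) pass through unchanged.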