iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.

Home Page: http://iree.dev/

assertion failure (cast<>) in LLVMCPULowerExecutableTargetPass

silvasean opened this issue

Describe the bug

Full error log: https://gist.github.com/silvasean/e96175f74a8c7833299b8d35e33ebfd2

To Reproduce

iree-compile --iree-hal-target-backends=dylib core-input.mlir

#map = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>
module attributes {torch.debug_module_name = "AvgPool2dIntModule"} {
  func @forward(%arg0: tensor<?x?x?x?xi64>) -> tensor<?x?x?x?xi64> {
    %c48_i64 = arith.constant 48 : i64
    %c1_i64 = arith.constant 1 : i64
    %c2_i64 = arith.constant 2 : i64
    %c3 = arith.constant 3 : index
    %c2 = arith.constant 2 : index
    %c1 = arith.constant 1 : index
    %c0 = arith.constant 0 : index
    %c0_i64 = arith.constant 0 : i64
    %0 = tensor.pad %arg0 low[0, 0, 3, 4] high[0, 0, 3, 4] {
    ^bb0(%arg1: index, %arg2: index, %arg3: index, %arg4: index):
      tensor.yield %c0_i64 : i64
    } : tensor<?x?x?x?xi64> to tensor<?x?x?x?xi64>
    %1 = tensor.dim %arg0, %c0 : tensor<?x?x?x?xi64>
    %2 = tensor.dim %arg0, %c1 : tensor<?x?x?x?xi64>
    %3 = tensor.dim %arg0, %c2 : tensor<?x?x?x?xi64>
    %4 = tensor.dim %arg0, %c3 : tensor<?x?x?x?xi64>
    %5 = arith.index_cast %3 : index to i64
    %6 = arith.floordivsi %5, %c2_i64 : i64
    %7 = arith.addi %6, %c1_i64 : i64
    %8 = arith.index_cast %7 : i64 to index
    %9 = arith.index_cast %4 : index to i64
    %10 = arith.floordivsi %9, %c2_i64 : i64
    %11 = arith.addi %10, %c1_i64 : i64
    %12 = arith.index_cast %11 : i64 to index
    %13 = linalg.init_tensor [%1, %2, %8, %12] : tensor<?x?x?x?xi64>
    %14 = linalg.fill ins(%c0_i64 : i64) outs(%13 : tensor<?x?x?x?xi64>) -> tensor<?x?x?x?xi64>
    %15 = linalg.init_tensor [6, 8] : tensor<6x8xi64>
    %16 = linalg.pooling_nchw_sum {dilations = dense<1> : vector<2xi64>, strides = dense<2> : vector<2xi64>} ins(%0, %15 : tensor<?x?x?x?xi64>, tensor<6x8xi64>) outs(%14 : tensor<?x?x?x?xi64>) -> tensor<?x?x?x?xi64>
    %17 = linalg.generic {indexing_maps = [#map, #map], iterator_types = ["parallel", "parallel", "parallel", "parallel"]} ins(%16 : tensor<?x?x?x?xi64>) outs(%13 : tensor<?x?x?x?xi64>) {
    ^bb0(%arg1: i64, %arg2: i64):
      %18 = arith.divsi %arg1, %c48_i64 : i64
      linalg.yield %18 : i64
    } -> tensor<?x?x?x?xi64>
    return %17 : tensor<?x?x?x?xi64>
  }
}

It looks like some operations can't be cast to PartitionableLoopsInterface. I'll take a look at it.

Looks like the PartitionableLoopsInterface registration list doesn't include linalg.pooling_nchw_sum. #9055 should fix it.
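
For context, the interface is attached to individual Linalg ops through external models, so an op missing from the registration list fails the later cast<PartitionableLoopsInterface>. A rough sketch of what such a registration looks like (the LinalgOpPartitionableLoops model name is assumed here for illustration, not the exact IREE code):

#include "mlir/Dialect/Linalg/IR/Linalg.h"
#include "mlir/IR/DialectRegistry.h"

// Illustrative sketch only: attach the (hypothetical) external model to each
// Linalg op explicitly. Any op left off this list cannot be cast to
// PartitionableLoopsInterface later in the pipeline.
void registerPartitionableLoopsExternalModels(mlir::DialectRegistry &registry) {
  registry.addExtension(+[](mlir::MLIRContext *ctx,
                            mlir::linalg::LinalgDialect *dialect) {
    mlir::linalg::MatmulOp::attachInterface<
        LinalgOpPartitionableLoops<mlir::linalg::MatmulOp>>(*ctx);
    // linalg.pooling_nchw_sum was missing from the list; adding it is what
    // #9055 does.
    mlir::linalg::PoolingNchwSumOp::attachInterface<
        LinalgOpPartitionableLoops<mlir::linalg::PoolingNchwSumOp>>(*ctx);
  });
}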

Hi @pzread -- #9055 doesn't seem to fix the core issue -- there is an unbounded set of linalg ops, and we cannot have the pass asserting every time a new one comes along. Can we change the code to emit a proper error instead of an assertion failure?

// This is copy-pasted from LinalgStructuredOps.cpp.inc. In theory you could
// just include that generated file here, but that causes errors with bazel.
// The required generated header is not exposed correctly.
// Copy paste is fine for now.

It looks like we can include LinalgStructuredOps.cpp.inc instead. However, the comment points out that there are issues with the Bazel build. Maybe we can revisit whether that approach can be applied?
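
For reference, a rough sketch of that approach (assumed here, not the actual IREE code): pull the op list from the tablegen-generated file via its GET_OP_LIST section, so the registration stays in sync with upstream Linalg instead of keeping a copy-pasted list. The LinalgOpPartitionableLoops model name is again hypothetical:

// Sketch only. GET_OP_LIST expands to the comma-separated list of generated
// Linalg op classes, which is fed to a variadic registration helper.
template <typename... Ops>
static void attachPartitionableLoopsModels(mlir::MLIRContext *ctx) {
  // C++17 fold expression: attach the model to every op in the pack.
  (Ops::template attachInterface<LinalgOpPartitionableLoops<Ops>>(*ctx), ...);
}

static void registerAllLinalgStructuredOps(mlir::MLIRContext *ctx) {
  attachPartitionableLoopsModels<
#define GET_OP_LIST
#include "mlir/Dialect/Linalg/IR/LinalgStructuredOps.cpp.inc"
      >(ctx);
}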

I mean, can we replace the cast with dyn_cast and get an error message about what needs to be done?

I think we can just replace cast with dyn_cast and return failure() if the cast fails.
https://github.com/google/iree/blob/7f9719876f77d50c3e56ba093b13483def97e705/compiler/src/iree/compiler/Codegen/LLVMCPU/KernelDispatch.cpp#L885
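
Something along these lines, in the context of that file, would turn the assertion into a proper diagnostic (a minimal sketch; the surrounding function signature and names are assumed, not the exact KernelDispatch.cpp code):

// Sketch of the suggested change: dyn_cast instead of cast, so an op that
// does not implement the interface produces an error with a hint about what
// is missing, rather than tripping an assertion.
static LogicalResult setRootConfig(func::FuncOp entryPointFn, Operation *op) {
  auto partitionableLoopsInterfaceOp =
      dyn_cast<PartitionableLoopsInterface>(op);
  if (!partitionableLoopsInterfaceOp) {
    return op->emitOpError(
        "does not implement PartitionableLoopsInterface; it needs to be "
        "registered with the interface before a root configuration can be "
        "set");
  }
  // ... continue using partitionableLoopsInterfaceOp as before ...
  return success();
}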

can we replace the cast with dyn_cast and get an error message about what needs to be done

I think that would be good if including LinalgStructuredOps.cpp.inc does not work.

If including LinalgStructuredOps.cpp.inc works, it's guaranteed that all the Linalg ops can be cast to PartitionableLoopsInterface. Then we don't need the check even when more Linalg ops are added in the future.

I think that we want to be defensive here -- IREE and its frontends might be built at slightly different LLVM versions -- there is always a chance that something slips in, even if we are building from LinalgStructuredOps.cpp.inc.

I see your point. I think all the root ops should be castable to PartitionableLoopsInterface; that's how we define a dispatch in IREE. In this context, all the backends have to query partitionable loops by casting the root op to PartitionableLoopsInterface. We might want to raise the error earlier or have a common check before going to each backend.

My concern now is a layering issue: this should not only be checked in the CPU backend, but also in all other backends. @MaheshRavishankar can we assume that all the compute ops can be cast to PartitionableLoopsInterface? If so, maybe we can add the check here:

https://github.com/google/iree/blob/7f9719876f77d50c3e56ba093b13483def97e705/compiler/src/iree/compiler/Codegen/Utils/Utils.cpp#L519-L531

This is the entry point where all the backends get the compute ops and try to set configurations.
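
A minimal sketch of what such a shared check might look like (the function name and where it would be called from are assumed, not existing IREE APIs):

// Hypothetical backend-independent check, run once over the compute ops
// gathered at the shared entry point, before any backend-specific
// configuration logic runs.
static LogicalResult
verifyComputeOpsArePartitionable(ArrayRef<Operation *> computeOps) {
  for (Operation *op : computeOps) {
    if (!isa<PartitionableLoopsInterface>(op)) {
      return op->emitOpError(
          "expected op to implement PartitionableLoopsInterface before a "
          "backend configuration is chosen");
    }
  }
  return success();
}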

I'll take a look to see if we can include LinalgStructuredOps.cpp.inc in IREE first, or add a test to make sure we catch this when compiling IREE.

I think the best case is that we can do the check when compiling IREE and in the tests, as a runtime check in IREE might always be missing somewhere.

I think that we want to be defensive here -- IREE and its frontends might be built at slightly different LLVM versions -- there is always a chance that something slips in, even if we are building from LinalgStructuredOps.cpp.inc.

@silvasean I'm not sure if I understand this correctly. I assume that if the frontend is built at a different LLVM version and produces some new Linalg ops, the MLIR parser in IREE shouldn't accept the IR because it contains unrecognized ops. In this case, we will see parsing failures when the frontend calls IREE, instead of unknown ops slipping into the compilation pipeline.

Good point! We probably don't need to be defensive about that.

With #9062, we no longer have a copy of the op list in IREE.