iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.

Home Page: http://iree.dev/

Unexpected failure to compile i32 matmul with vanilla flags for llvm-cpu

newling opened this issue · comments

What happened?

I expect iree-compile --iree-hal-target-backends=llvm-cpu my_func.mlir to compile basically any valid IR, so I was quite surprised that a basic matmul (i32, 128x128x128) failed to compile. I think the constraint is over-restrictive; is there perhaps some kind of tiling pass that is needed?

Compile the following IR (matmul_int32.mlir)

!lhs = tensor<128x128xi32>
!rhs = tensor<128x128xi32>
!res = tensor<128x128xi32>

// The function name must match the filename:
func.func @matmul_int32(%lhs : !lhs, %rhs : !rhs) -> !res {
  %empty = tensor.empty() : !res
  %cst = arith.constant 0 : i32
  %fill = linalg.fill ins(%cst : i32) outs(%empty : !res) -> !res
  %2 = linalg.matmul ins(%lhs, %rhs : !lhs, !rhs)
      outs(%fill : !res) -> !res
  return %2 : !res
}

with

iree-compile --iree-hal-target-backends=llvm-cpu matmul_int32.mlir

And observe

failed to translate executables
test_files/matmul_int32.mlir:15:8: error: One or more operations with large vector sizes (8192 bytes) were found:

  %2 = linalg.matmul ins(%lhs, %rhs : !lhs, !rhs)
       ^
test_files/matmul_int32.mlir:11:1: note: called from
func.func @matmul_int32(%lhs : !lhs, %rhs : !rhs) -> !res {
^
test_files/matmul_int32.mlir:11:1: note:   %16 = vector.contract {indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>, affine_map<(d0, d1, d2) -> (d2, d1)>, affine_map<(d0, d1, d2) -> (d0, d1)>], iterator_types = ["parallel", "parallel", "reduction"], kind = #vector.kind<add>} %14, %15, %arg7 : vector<8x16xi32>, vector<16x32xi32> into vector<8x32xi32>

func.func @matmul_int32(%lhs : !lhs, %rhs : !rhs) -> !res {
^
test_files/matmul_int32.mlir:15:8: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"llvm-cpu", "embedded-elf-x86_64", {cpu = "generic", cpu_features = "", data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128", native_vector_size = 16 : i64, target_triple = "x86_64-unknown-unknown-eabi-elf"}>
  %2 = linalg.matmul ins(%lhs, %rhs : !lhs, !rhs)

Error message introduced in #17620 (@hanhanW)

Steps to reproduce your issue

See above

What component(s) does this issue relate to?

Compiler

Version information

After June 12 2024.

Additional context

No response

Inlining some offline discussion for visibility:

You're missing target CPU features. However, it should still compile. I think there is a bug in the CPU backend, which picks the x86 config by default. We should be able to fix that, though it will be very slow.

I'll get back to it.
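For anyone hitting this in the meantime, passing an explicit target CPU appears to sidestep the generic-x86 default. A sketch of the workaround, assuming the --iree-llvmcpu-target-cpu flag with its host value (unverified against every IREE version; check iree-compile --help on your build):

```shell
# Workaround sketch: give the llvm-cpu backend a concrete CPU so it
# doesn't fall back to the generic x86 configuration.
iree-compile \
  --iree-hal-target-backends=llvm-cpu \
  --iree-llvmcpu-target-cpu=host \
  matmul_int32.mlir -o matmul_int32.vmfb
```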