iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.

Home Page: http://iree.dev/

Unexpected failure to compile i32 matmul with vanilla flags for llvm-cpu

newling opened this issue · comments

What happened?

I expect iree-compile --iree-hal-target-backends=llvm-cpu my_func.mlir to compile basically any valid IR, so I was quite surprised that a basic matmul (i32, 128x128x128) failed to compile. I think the constraint is over-restrictive; is there perhaps some kind of tiling pass that is needed?

Compile the following IR (matmul_int32.mlir)

!lhs = tensor<128x128xi32>
!rhs = tensor<128x128xi32>
!res = tensor<128x128xi32>

// The function name must match the filename:
func.func @matmul_int32(%lhs : !lhs, %rhs : !rhs) -> !res {
  %empty = tensor.empty() : !res
  %cst = arith.constant 0 : i32
  %fill = linalg.fill ins(%cst : i32) outs(%empty : !res) -> !res
  %2 = linalg.matmul ins(%lhs, %rhs : !lhs, !rhs)
      outs(%fill : !res) -> !res
  return %2 : !res
}

with

iree-compile --iree-hal-target-backends=llvm-cpu matmul_int32.mlir

And observe

failed to translate executables
test_files/matmul_int32.mlir:15:8: error: One or more operations with large vector sizes (8192 bytes) were found:

  %2 = linalg.matmul ins(%lhs, %rhs : !lhs, !rhs)
       ^
test_files/matmul_int32.mlir:11:1: note: called from
func.func @matmul_int32(%lhs : !lhs, %rhs : !rhs) -> !res {
^
test_files/matmul_int32.mlir:11:1: note:   %16 = vector.contract {indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>, affine_map<(d0, d1, d2) -> (d2, d1)>, affine_map<(d0, d1, d2) -> (d0, d1)>], iterator_types = ["parallel", "parallel", "reduction"], kind = #vector.kind<add>} %14, %15, %arg7 : vector<8x16xi32>, vector<16x32xi32> into vector<8x32xi32>

func.func @matmul_int32(%lhs : !lhs, %rhs : !rhs) -> !res {
^
test_files/matmul_int32.mlir:15:8: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"llvm-cpu", "embedded-elf-x86_64", {cpu = "generic", cpu_features = "", data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128", native_vector_size = 16 : i64, target_triple = "x86_64-unknown-unknown-eabi-elf"}>
  %2 = linalg.matmul ins(%lhs, %rhs : !lhs, !rhs)

Error message introduced in #17620 (@hanhanW)

Steps to reproduce your issue

See above

What component(s) does this issue relate to?

Compiler

Version information

After June 12 2024.

Additional context

No response

Inlining some offline discussion for visibility:

You're missing target CPU features. However, it should still compile. I think there is a bug in the CPU backend, which picks the x86 config by default. We should be able to fix that, though it will be very slow.

I'll get back to it.
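For anyone hitting this in the meantime, passing an explicit target CPU appears to sidestep the generic-x86 default. A sketch of the workaround, assuming the --iree-llvmcpu-target-cpu flag with its host value (unverified against every IREE version; check iree-compile --help on your build):

```shell
# Workaround sketch: give the llvm-cpu backend a concrete CPU so it
# doesn't fall back to the generic x86 configuration.
iree-compile \
  --iree-hal-target-backends=llvm-cpu \
  --iree-llvmcpu-target-cpu=host \
  matmul_int32.mlir -o matmul_int32.vmfb
```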