flame / blis

BLAS-like Library Instantiation Software Framework

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Upstream BLIS patches for ARM SVE?

jd-foster opened this issue · comments

We have a multi-platform, cross-compiling build system for binaries in the Julia ecosystem ( BinaryBuilder.jl ), with the Yggdrasil repo hosting the recipes.

While the latest recipe for the shiny BLIS v1.0 is here: JuliaPackaging/Yggdrasil#8626 ,
the original recipe was kindly written by BLIS contributor @xrq-phys who wrote patches for optimisations on ARM SVE architectures. The patches essentially reduce to

-#if !defined(BLIS_FAMILY_A64FX)
+#if 1

in a number of locations (see https://github.com/JuliaPackaging/Yggdrasil/pull/8626/files for the full picture).

So our question is: do we need to keep carrying these patches, or can these be upstreamed to BLIS in any form?

Thanks for any clarification or suggestions here.

IIUC these #if conditions only evaluate to 0 if compiling for A64fx. So, the net effect of the patch would be to essentially disable specific A64fx support? Maybe I'm missing something?

Thanks for the response. I think that's basically the idea.

OK, I think I maybe see where this started and what the intent was. It looks like @xrq-phys's patch added a64fx to the arm64 family, and that he might have been concerned that this would disable the packing kernels for 256-bit ARM SVE (e.g. Graviton3) due to BLIS_FAMILY_A64FX being defined. However, the BLIS_FAMILY_XXX macros for specific architectures are only enabled when that is the only architecture enabled (e.g. configure a64fx). When configuring for generic arm64 architectures then the only family macro defined is BLIS_FAMILY_ARM64. So, the part of the patch dealing with BLIS_FAMILY_A64FX will have no effect.

However, the part about adding a64fx to the arm64 family could be upstreamed.

This is, assuming run-time support for checking for A64fx exists which I'm not sure about. I think A64fx also has some other optimizations relying on specific Fujitsu compiler support which could make it difficult to co-exist with other arm64 configurations.

OK, looks like you understand the reasons (definitely more than I do.) Appreciate you looking into it more deeply.