Maratyszcza / NNPACK

Acceleration package for neural networks on multi-core CPUs

cache detection on arm/arm64

Danliran opened this issue · comments

Hi NNPACK team,

The function init_hwinfo detects the hardware cache info, but on ARM/ARM64 platforms the cache info is hard-coded in the function. I think we should detect this info from the system or from cpuinfo.

    #if !(CPUINFO_ARCH_X86 || CPUINFO_ARCH_X86_64) || defined(__ANDROID__)
    static void init_static_hwinfo(void) {
        nnp_hwinfo.cache.l1 = (struct cache_info) {
            .size = 16 * 1024,
            .associativity = 4,
            .threads = 1,
            .inclusive = true,
        };
        nnp_hwinfo.cache.l2 = (struct cache_info) {
            .size = 128 * 1024,
            .associativity = 4,
            .threads = 1,
            .inclusive = true,
        };
        nnp_hwinfo.cache.l3 = (struct cache_info) {
            .size = 2 * 1024 * 1024,
            .associativity = 8,
            .threads = 1,
            .inclusive = true,
        };
    }
    #endif

@Maratyszcza Should we implement a function that detects hwinfo on ARM Linux? Some vendors design their own CPUs based on ARM cores, so the hard-coded values may not match them. For example:

    static void init_hwinfo(void) {
    #if (CPUINFO_ARCH_X86 || CPUINFO_ARCH_X86_64) && !defined(__ANDROID__)
        init_x86_hwinfo();
    #elif !CPUINFO_ARCH_X86 && !CPUINFO_ARCH_X86_64 && defined(__APPLE__)
        init_static_ios_hwinfo();
    #elif CPUINFO_ARCH_ARM || CPUINFO_ARCH_ARM64
        init_arm_linux_hwinfo();
    #else
        init_static_hwinfo();
    #endif
        ............
    }
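
As a rough illustration of what the proposed `init_arm_linux_hwinfo()` might read, here is a minimal sketch that assumes the Linux kernel exposes cache topology under `/sys/devices/system/cpu/cpu0/cache/indexN/` (some ARM kernels do not populate it, in which case the helper returns 0). The names `parse_cache_size` and `read_sysfs_cache_size` are hypothetical and not part of NNPACK.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: convert a sysfs size string such as "32K" or "2M" to bytes. */
static size_t parse_cache_size(const char* text) {
    char* end = NULL;
    unsigned long value = strtoul(text, &end, 10);
    if (end == text) {
        return 0; /* no digits found */
    }
    switch (*end) {
        case 'K': return (size_t) value * 1024;
        case 'M': return (size_t) value * 1024 * 1024;
        default:  return (size_t) value; /* plain byte count */
    }
}

/*
 * Hypothetical helper: read the size of cache entry `index` for CPU 0.
 * Returns 0 if the sysfs file is missing or unreadable.
 * The `level` and `type` files in the same directory say which cache
 * (L1 data, L1 instruction, L2, ...) the entry describes.
 */
static size_t read_sysfs_cache_size(unsigned index) {
    char path[128];
    snprintf(path, sizeof(path),
        "/sys/devices/system/cpu/cpu0/cache/index%u/size", index);
    FILE* file = fopen(path, "r");
    if (file == NULL) {
        return 0;
    }
    char buffer[32] = { 0 };
    size_t size = 0;
    if (fgets(buffer, sizeof(buffer), file) != NULL) {
        size = parse_cache_size(buffer);
    }
    fclose(file);
    return size;
}
```

A caller could keep the values from `init_static_hwinfo()` whenever a helper like this returns 0, so platforms without sysfs cache entries behave as they do today.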

NNPACK assumes a 3-level cache hierarchy, but many ARM CPUs have only two levels of cache. Adapting NNPACK to use the actual cache parameters is therefore not straightforward, and since I no longer actively work on NNPACK, there are no plans to introduce two-level cache blocking.
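
To make the mismatch concrete, here is a minimal sketch of a guard that a run-time detection routine might apply before overriding the static values. `struct detected_caches` and `has_three_level_hierarchy` are hypothetical names used only for illustration. They are not NNPACK code, and the sketch does not attempt two-level blocking.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for detected cache sizes in bytes; 0 means "level not present". */
struct detected_caches {
    size_t l1_size;
    size_t l2_size;
    size_t l3_size;
};

/* True only when all three levels were found, matching the hierarchy NNPACK's blocking assumes. */
static bool has_three_level_hierarchy(const struct detected_caches* caches) {
    return caches->l1_size != 0 && caches->l2_size != 0 && caches->l3_size != 0;
}
```

On a two-level ARM core the check fails, and one conservative option would be to leave the hard-coded values from `init_static_hwinfo()` in place.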