plasma-umass / Mesh

A memory allocator that automatically reduces the memory footprint of C/C++ applications.

Illegal instruction

Zash opened this issue

I get an Illegal instruction crash on Debian testing:

mesh$ ./configure --no-optimize --no-debug
mesh$ make
  CXX   build/src/unit/bitmap_test.o
  CXX   build/src/unit/mesh_test.o
  CXX   build/src/unit/alignment.o
  CXX   build/src/unit/size_class_test.o
  CXX   build/src/unit/binned_tracker_test.o
  CXX   build/src/unit/triple_mesh_test.o
  CXX   build/src/unit/rng_test.o
  CXX   build/src/unit/concurrent_mesh_test.o
  CXX   build/src/vendor/googletest/googletest/src/gtest-all.o
  CXX   build/src/vendor/googletest/googletest/src/gtest_main.o
  CXX   build/src/thread_local_heap.o
  CXX   build/src/global_heap.o
  CXX   build/src/runtime.o
  CXX   build/src/real.o
  CXX   build/src/meshable_arena.o
  CXX   build/src/d_assert.o
  CXX   build/src/measure_rss.o
  LD    unit.test
Running main() from src/vendor/googletest/googletest/src/gtest_main.cc
[==========] Running 22 tests from 8 test cases.
[----------] Global test environment set-up.
[----------] 11 tests from BitmapTest
[ RUN      ] BitmapTest.RepresentationSize
[       OK ] BitmapTest.RepresentationSize (0 ms)
[ RUN      ] BitmapTest.LowestSetBitAt
[       OK ] BitmapTest.LowestSetBitAt (0 ms)
[ RUN      ] BitmapTest.HighestSetBitAt
[       OK ] BitmapTest.HighestSetBitAt (0 ms)
[ RUN      ] BitmapTest.SetAndExchangeAll
[       OK ] BitmapTest.SetAndExchangeAll (0 ms)
[ RUN      ] BitmapTest.SetAll
[       OK ] BitmapTest.SetAll (0 ms)
[ RUN      ] BitmapTest.SetGet
[       OK ] BitmapTest.SetGet (124 ms)
[ RUN      ] BitmapTest.SetGetRelaxed
[       OK ] BitmapTest.SetGetRelaxed (2202 ms)
[ RUN      ] BitmapTest.Builtins
[       OK ] BitmapTest.Builtins (0 ms)
[ RUN      ] BitmapTest.Iter
make: *** [GNUmakefile:135: test] Illegal instruction (core dumped)

@Zash neat! and sorry for the bug. can you provide additional info? in particular what architecture + kernel are you using?

uname -a would be a good start.

Yes, very neat. I was baffled and unsure what might be relevant to post, so I just posted what I had.

uname -a

Linux carcharodon 4.19.0-1-amd64 #1 SMP Debian 4.19.12-1 (2018-12-22) x86_64 GNU/Linux

/proc/cpuinfo
https://gist.github.com/Zash/6ae3bdeb4ce8acd20122712bb3b1bb61

gcc version 8.2.0 (Debian 8.2.0-20)

I tried running it under gdb, here's a "screenshot":
https://gist.github.com/Zash/b5273ec1d3136c6686aa50d948373bd0

From this I guess the illegal instruction in question is vmovss in std::_Hashtable. This makes me unsure whether this is a bug in Mesh at all, rather than in the compiler.

@Zash I bet this is because of these 3 lines in the configure: https://github.com/plasma-umass/Mesh/blob/master/configure#L44-L46

we assume at least the Nehalem microarchitecture, and your CPU appears to predate that (so GCC emits instructions in the hashtable code that your CPU can't execute).

We specifically opted into this because it was necessary to get the bit-counting instructions (like popcount and find-first-set).

I think there are 2 ways to fix this -- we could check in configure, or we could maybe use that Linux ld feature where it can choose between 2 function implementations at runtime (dynamic-link-time) depending on processor features.
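For illustration, here is a minimal sketch of the second idea: picking between two implementations at run time based on what the CPU reports. It does not use the linker-level mechanism mentioned above (presumably GNU indirect functions); it uses a plain function pointer and GCC's `__builtin_cpu_supports`, which is a simpler variant of the same technique. The names `popcountPortable`, `popcountHardware`, `selectPopcount`, and `bitCount` are hypothetical and are not taken from Mesh.

```cpp
#include <cstdint>

// Portable fallback: counts set bits without needing the POPCNT instruction.
static int popcountPortable(std::uint64_t x) {
  int n = 0;
  while (x != 0) {
    x &= x - 1;  // clear the lowest set bit
    ++n;
  }
  return n;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
// POPCNT is enabled for this one function only, so the rest of the
// translation unit can still be built for an older baseline CPU.
__attribute__((target("popcnt")))
static int popcountHardware(std::uint64_t x) {
  return __builtin_popcountll(x);
}

// Ask the running CPU whether it supports POPCNT.
static bool cpuHasPopcnt() {
  __builtin_cpu_init();
  return __builtin_cpu_supports("popcnt") != 0;
}
#else
// Non-x86 or non-GCC-compatible builds: always use the portable version.
static int popcountHardware(std::uint64_t x) { return popcountPortable(x); }
static bool cpuHasPopcnt() { return false; }
#endif

using PopcountFn = int (*)(std::uint64_t);

// Hypothetical helper (not part of Mesh): choose an implementation once.
static PopcountFn selectPopcount() {
  return cpuHasPopcnt() ? popcountHardware : popcountPortable;
}

// Hypothetical entry point: resolves on first call, then reuses the pointer.
int bitCount(std::uint64_t x) {
  static const PopcountFn impl = selectPopcount();
  return impl(x);
}
```

The trade-off is an indirect call on every use, which may matter in an allocator's hot path. Checking at configure time avoids that cost, but ties the binary to the machine it was built on.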

Aha! Dropping those fixes it!

[==========] 22 tests from 8 test cases ran. (7041 ms total)
[  PASSED  ] 22 tests.

I had tried removing the arch=westmere flag (or changing it to native) earlier, but that made no difference.

@Zash glad it is confirmed!

I think the best short-term fix is to check in configure with something like this:

cat /proc/cpuinfo | egrep '^flags' | grep avx | grep popcnt

and if the result is empty, disable those GCC flags. I will work on that later today.
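As a rough sketch of the same check in code: the version below reads the first `flags` line and tests for whole tokens, so `avx2` on its own would not count as `avx` (the `grep` pipeline matches substrings). The helper names `parseCpuFlags`, `hasAllFlags`, and `cpuSupportsAvxAndPopcnt` are hypothetical, and the actual check that landed in configure may look quite different.

```cpp
#include <fstream>
#include <initializer_list>
#include <istream>
#include <set>
#include <sstream>
#include <string>

// Collect the whitespace-separated tokens of the first line starting with
// "flags" (the format Linux uses in /proc/cpuinfo on x86).
std::set<std::string> parseCpuFlags(std::istream& in) {
  std::set<std::string> flags;
  std::string line;
  while (std::getline(in, line)) {
    if (line.compare(0, 5, "flags") != 0) continue;
    const auto colon = line.find(':');
    if (colon == std::string::npos) continue;
    std::istringstream tokens(line.substr(colon + 1));
    std::string flag;
    while (tokens >> flag) flags.insert(flag);
    break;  // this sketch only looks at the first core's entry
  }
  return flags;
}

// True only if every wanted flag appears as a whole token.
bool hasAllFlags(const std::set<std::string>& flags,
                 std::initializer_list<const char*> wanted) {
  for (const char* w : wanted) {
    if (flags.count(w) == 0) return false;
  }
  return true;
}

// Hypothetical convenience wrapper; returns false if the file is unreadable.
bool cpuSupportsAvxAndPopcnt(const std::string& path = "/proc/cpuinfo") {
  std::ifstream in(path);
  return hasAllFlags(parseCpuFlags(in), {"avx", "popcnt"});
}
```

A build script could call `cpuSupportsAvxAndPopcnt()` and drop the extra compiler flags when it returns false, which also covers systems where `/proc/cpuinfo` does not exist.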

@Zash thanks again for reporting this -- ./configure should do the right thing on master now

Thanks!