opencog / cogutil

Very low-level C++ programming utilities used by several components

LoggerUTest fails

mhatta opened this issue

Currently, LoggerUTest segfaults and fails on Debian sid. The 8/31 snapshot (efa8e3b) builds and passes the test in the same environment, so a recent commit must have broken something.

Did you run make; make install before running the unit tests? Can you post a stack trace? There's no problem on Debian stable.

This is what I got.

$ cd build/tests/util
$ ./LoggerUTest
Running cxxtest tests (6 tests)..[1]    6813 segmentation fault  ./LoggerUTest
$ gdb ./LoggerUTest
GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
(snip)
(gdb) run
Starting program: /home/mhatta/work/Debian/opencog/opencog-cogutils/opencog-cogutils-2.0.3~git20170905.7b0b6f9/build/tests/util/LoggerUTest 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Running cxxtest tests (6 tests)[New Thread 0x7f8808c3f700 (LWP 6846)]
[New Thread 0x7f880843e700 (LWP 6847)]
[Thread 0x7f880843e700 (LWP 6847) exited]
.[New Thread 0x7f8808c3f700 (LWP 6848)]
[Thread 0x7f8808c3f700 (LWP 6846) exited]
.[New Thread 0x7f880843e700 (LWP 6849)]
[Thread 0x7f880843e700 (LWP 6849) exited]
[New Thread 0x7f880843e700 (LWP 6850)]

Thread 1 "LoggerUTest" received signal SIGSEGV, Segmentation fault.
0x00007f880c626500 in _bfd_elf_find_function ()
   from /usr/lib/x86_64-linux-gnu/libbfd-2.29-system.so
(gdb) bt            
#0  0x00007f880c626500 in _bfd_elf_find_function ()                             
   from /usr/lib/x86_64-linux-gnu/libbfd-2.29-system.so                         
#1  0x00007f880c6025b8 in _bfd_elf_find_nearest_line ()                         
   from /usr/lib/x86_64-linux-gnu/libbfd-2.29-system.so                         
#2  0x00007f880d2b53d4 in find_address_in_section ()                            
   from /home/mhatta/work/Debian/opencog/opencog-cogutils/opencog-cogutils-2.0.3~git20170905.7b0b6f9/build/opencog/util/libcogutil.so
#3  0x00007f880c5df4dc in bfd_map_over_sections ()                              
   from /usr/lib/x86_64-linux-gnu/libbfd-2.29-system.so                         
#4  0x00007f880d2b54fe in translate_addresses_buf ()                            
   from /home/mhatta/work/Debian/opencog/opencog-cogutils/opencog-cogutils-2.0.3~git20170905.7b0b6f9/build/opencog/util/libcogutil.so
#5  0x00007f880d2b56d1 in process_file ()                                       
   from /home/mhatta/work/Debian/opencog/opencog-cogutils/opencog-cogutils-2.0.3~git20170905.7b0b6f9/build/opencog/util/libcogutil.so
#6  0x00007f880d2b58d2 in oc_backtrace_symbols ()                               
   from /home/mhatta/work/Debian/opencog/opencog-cogutils/opencog-cogutils-2.0.3~git20170905.7b0b6f9/build/opencog/util/libcogutil.so
#7  0x00007f880d2dee76 in prt_backtrace(std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >&) ()
   from /home/mhatta/work/Debian/opencog/opencog-cogutils/opencog-cogutils-2.0.3~git20170905.7b0b6f9/build/opencog/util/libcogutil.so
---Type <return> to continue, or q <return> to quit---
#8  0x00007f880d2e0020 in opencog::Logger::log(opencog::Logger::Level, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
   from /home/mhatta/work/Debian/opencog/opencog-cogutils/opencog-cogutils-2.0.3~git20170905.7b0b6f9/build/opencog/util/libcogutil.so
#9  0x00007f880d2e02d3 in opencog::Logger::logva(opencog::Logger::Level, char const*, __va_list_tag*) ()
   from /home/mhatta/work/Debian/opencog/opencog-cogutils/opencog-cogutils-2.0.3~git20170905.7b0b6f9/build/opencog/util/libcogutil.so
#10 0x00007f880d2e047d in opencog::Logger::Error::operator()(char const*, ...) ()
   from /home/mhatta/work/Debian/opencog/opencog-cogutils/opencog-cogutils-2.0.3~git20170905.7b0b6f9/build/opencog/util/libcogutil.so
#11 0x000055555556c53b in LoggerUTest::logAllLevels(char const*, opencog::Logger::Level, unsigned int) ()
#12 0x000055555556c7ac in LoggerUTest::testLevels() ()
#13 0x000055555556d526 in TestDescription_suite_LoggerUTest_testLevels::runTest() ()
#14 0x000055555556632c in CxxTest::RealTestDescription::run() ()
#15 0x0000555555569c5e in CxxTest::TestRunner::runTest(CxxTest::TestDescription&) ()
#16 0x0000555555569b72 in CxxTest::TestRunner::runSuite(CxxTest::SuiteDescription&) ()
#17 0x0000555555569a46 in CxxTest::TestRunner::runWorld() ()
#18 0x00005555555698e7 in CxxTest::TestRunner::runAllTests(CxxTest::TestListener&) ()
#19 0x000055555556a014 in CxxTest::ErrorFormatter::run() ()
#20 0x000055555556df28 in int CxxTest::Main<CxxTest::ErrorPrinter>(CxxTest::ErrorPrinter&, int, char**) ()
#21 0x00005555555649ac in main ()
(gdb) 

LoggerUTest built from git hash efa9e3b works, so I guess something was introduced between efa9e3b and 7b0b6f9. Sorry I can't help much.

$ ./LoggerUTest 
Running cxxtest tests (6 tests)....resline = [2017-09-05 17:29:51:035] [DEBUG] [LoggerUTest] message
.resline = [2017-09-05 17:29:51:039] [DEBUG] [LoggerUTest] message
.OK!

Are you sure you rebuilt the code? That is exactly the stack trace I was getting before the fix. In the old code, two different threads would enter find_address_in_section() at the same time, and because of the globals in that file, there would be inconsistent information sent to bfd_elf_find_function(), causing it to crash. The new code avoids the global variables.
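To make the failure mode concrete, here is a minimal sketch of the pattern described above: lookup state is carried in a per-call context rather than in file-scope globals, so two threads can never see each other's half-written data. All names here (`LookupContext`, `find_in_section`, `translate_address`, `count_mismatches`) are hypothetical and chosen for illustration; this is not the cogutil code and does not call libbfd.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for state that would otherwise live in
// file-scope globals: the address being looked up and its result.
struct LookupContext {
    std::uintptr_t pc = 0;   // address to translate
    bool found = false;      // set once a section claims the address
    std::string function;    // name found for the address
};

// Hypothetical per-section callback. All state arrives through `ctx`,
// so concurrent callers cannot overwrite each other's data.
void find_in_section(std::uintptr_t start, std::uintptr_t size,
                     const std::string& name, LookupContext& ctx)
{
    if (ctx.found) return;
    if (ctx.pc < start || ctx.pc >= start + size) return;
    ctx.function = name;
    ctx.found = true;
}

// Each call builds its own context on the stack.
std::string translate_address(std::uintptr_t pc)
{
    LookupContext ctx;
    ctx.pc = pc;
    find_in_section(0x1000, 0x100, "alpha", ctx);  // made-up section table
    find_in_section(0x2000, 0x100, "beta", ctx);
    return ctx.found ? ctx.function : "??";
}

// Run lookups from several threads at once and count wrong answers.
int count_mismatches(int nthreads, int iterations)
{
    assert(nthreads > 0);
    std::vector<int> bad(nthreads, 0);
    std::vector<std::thread> pool;
    for (int t = 0; t < nthreads; ++t) {
        pool.emplace_back([t, iterations, &bad] {
            const std::uintptr_t pc = (t % 2 == 0) ? 0x1010 : 0x2010;
            const std::string want = (t % 2 == 0) ? "alpha" : "beta";
            for (int i = 0; i < iterations; ++i)
                if (translate_address(pc) != want) ++bad[t];
        });
    }
    for (auto& th : pool) th.join();
    int total = 0;
    for (int b : bad) total += b;
    return total;
}
```

If `pc`, `found` and `function` were globals instead, one thread could reset them while another was still in the middle of its section walk, which is the kind of inconsistency described above. This sketch only removes shared state in the caller; it says nothing about whether the library being called is itself thread-safe.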

My code, which used to crash every half-hour before the fix, has now run fine for 24 hours.

It's possible that the bfd code itself is not thread-safe; adding a lock would fix that. Is it possible that the bfd code in Debian unstable is different from that in Debian stable?
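A rough sketch of what "adding a lock" could look like, assuming the library routine really is not thread-safe (that has not been established in this thread). `unsafe_lookup` is a made-up stand-in that keeps its working state in a static buffer; it is not a libbfd function, and `locked_lookup` and `count_bad_results` are likewise hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a routine that is NOT thread-safe:
// it builds its result in a shared static buffer.
static std::string g_scratch;

static std::string unsafe_lookup(int key)
{
    g_scratch = "sym";
    g_scratch += std::to_string(key);
    return g_scratch;
}

// One mutex serializes every call into the unsafe routine.
static std::mutex g_lookup_mutex;

std::string locked_lookup(int key)
{
    std::lock_guard<std::mutex> guard(g_lookup_mutex);
    // The returned copy is made before the guard is released.
    return unsafe_lookup(key);
}

// Call the wrapper from several threads and count wrong answers.
int count_bad_results(int nthreads, int iterations)
{
    assert(nthreads > 0);
    std::vector<int> bad(nthreads, 0);
    std::vector<std::thread> pool;
    for (int t = 0; t < nthreads; ++t) {
        pool.emplace_back([t, iterations, &bad] {
            const std::string want = "sym" + std::to_string(t);
            for (int i = 0; i < iterations; ++i)
                if (locked_lookup(t) != want) ++bad[t];
        });
    }
    for (auto& th : pool) th.join();
    int total = 0;
    for (int b : bad) total += b;
    return total;
}
```

The trade-off is that every lookup is serialized, which is usually acceptable for something as infrequent as printing a backtrace. The lock only helps if every caller in the process goes through the same wrapper.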

I'm pretty sure the old code is not used in any way. Just in case, I purged the installed libcogutil-dev package before the build (so no cogutils code exists in my environment), but the binary still segfaults. And as I said, a binary built now from source about one week old still works in the same environment, so I guess something was introduced fairly recently.

The version of binutils in Debian unstable is 2.29. Debian stretch seems to have 2.28.

https://tracker.debian.org/pkg/binutils

Konstantin claims that pull request #111 fixes this, and that seems like a plausible claim, so I'm closing this. If you hit this again, please open a new bug.

Sorry to have argued so much; when I can't reproduce a bug, it becomes really hard to fix.