CloudI / CloudI

A Cloud at the lowest level!

Home Page: https://cloudi.org/

nongnu-libunwind 1.2.1 can cause the C/C++ CloudI API to segfault under load

okeuday opened this issue

A simplified setup based on 2.0.6 testing with cloudi_service_request_rate can cause a segfault to occur in the libunwind 1.2.1 source code. The failure only occurs when gcc/g++ creates an optimized build (-O1 or higher). The failure was previously described as an Ubuntu 20.04 bug due to the libunwind Ubuntu package used for gcc/g++ 9.3.0 and 9.4.0 (through the libstdc++ / libgcc dependencies). The Ubuntu 20.04 bug links to the gcc information at https://gcc.gnu.org/pipermail/gcc-help/2022-February/141189.html .

To replicate:

wget https://osdn.net/dl/cloudi/cloudi-2.0.6.tar.gz
tar zxvf cloudi-2.0.6.tar.gz
cd cloudi-2.0.6/src
./configure
make
sudo make install
sudo mv /usr/local/etc/cloudi/cloudi.conf /usr/local/etc/cloudi/cloudi_tests.conf
sudo touch /usr/local/etc/cloudi/cloudi.conf
sudo sh -c 'printf "# default scheduler bind type\n+sbt db\n\n" >> /usr/local/etc/cloudi/cloudi.args'
sudo cloudi start
sudo cloudi attach

With 24 logical processors:

> TestConcurrency = 12.
> {ok, [CReceiverID]} = cloudi_service_api:services_add([[{prefix, "/tests/http_req/"}, {file_path, "/usr/local/lib/cloudi-2.0.6/tests/http_req/http_req_c"}, {dest_refresh, none}, {count_process, TestConcurrency}, {options, [{bind, true}]}]], infinity).
> {ok, [CSenderID]} = cloudi_service_api:services_add([[{prefix, "/tests/http_req/"}, {module, cloudi_service_request_rate}, {args, [{request_rate, dynamic}, {service_name, "/tests/http_req/c.xml/get"}]}, {dest_refresh, lazy_closest}, {count_process, TestConcurrency + 1}, {options, [{duo_mode, true}, {bind, true}]}]], infinity).
% wait for the segfault to occur in less than 2 minutes
> ok = cloudi_service_api:services_remove([CSenderID, CReceiverID], infinity).

An example stacktrace is below:

#0  0x00007f62c1c965f6 in ?? () from /usr/lib/x86_64-linux-gnu/libunwind.so.8
#1  0x00007f62c1c9a4ed in ?? () from /usr/lib/x86_64-linux-gnu/libunwind.so.8
#2  0x00007f62c1c9a9ac in ?? () from /usr/lib/x86_64-linux-gnu/libunwind.so.8
#3  0x00007f62c1c9ad3d in ?? () from /usr/lib/x86_64-linux-gnu/libunwind.so.8
#4  0x00007f62c1c971f4 in _ULx86_64_step ()
   from /usr/lib/x86_64-linux-gnu/libunwind.so.8
#5  0x00007f62c1c95cb1 in __libunwind_Unwind_RaiseException ()
   from /usr/lib/x86_64-linux-gnu/libunwind.so.8
#6  0x00007f62c1b5a69c in __cxa_throw ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007f62c201a019 in cloudi_return (api=<optimized out>, 
    request_type=<optimized out>, name=<optimized out>, 
    pattern=<optimized out>, response_info=<optimized out>, 
    response_info_size=<optimized out>, response=0x7ffee18edf50, 
    response_size=40, timeout=3357, 
    trans_id=0x56428ecb6338 "\036\341H\313K\242\031\353\361\021\273s\273\221\026\200\033", source=0x56428ecb634c "\203Xd", source_size=27)
    at /usr/include/c++/9/bits/exception.h:63
#8  0x000056428dc6e592 in request (request_type=1, 
    name=0x56428ecb62e8 "/tests/http_req/c.xml/get", 
    pattern=0x56428ecb6306 "/tests/http_req/c.xml/get", 
    request_info=<optimized out>, request_info_size=<optimized out>, 
    request=<optimized out>, request_size=9, timeout=3357, priority=0 '\000', 
    trans_id=0x56428ecb6338 "\036\341H\313K\242\031\353\361\021\273s\273\221\026\200\033", source=0x56428ecb634c "\203Xd", source_size=27, state=0x0, 
    api=0x7ffee20eb2d0) at main.c:86
#9  0x00007f62c201bcff in (anonymous namespace)::callback_function::callback_function_c::operator() (this=<optimized out>, request_type=<optimized out>, 
    name=<optimized out>, pattern=<optimized out>, 
    request_info=<optimized out>, request_info_size=<optimized out>, 
    request=0x56428ecb6329, request_size=9, timeout=3357, priority=0 '\000', 
    trans_id=0x56428ecb6338 "\036\341H\313K\242\031\353\361\021\273s\273\221\026\200\033", source=0x56428ecb634c "\203Xd", source_size=27) at cloudi.cpp:197
#10 0x00007f62c201ff4a in (anonymous namespace)::callback_function::operator()
    (source_size=<optimized out>, source=0x56428ecb634c "\203Xd", 
    trans_id=0x56428ecb6338 "\036\341H\313K\242\031\353\361\021\273s\273\221\026\200\033", priority=0 '\000', timeout=3357, request_size=9, 
    request=0x56428ecb6329, request_info_size=0, request_info=0x56428ecb6324, 
    pattern=0x56428ecb6306 "/tests/http_req/c.xml/get", 
    name=0x56428ecb62e8 "/tests/http_req/c.xml/get", request_type=1, 
    this=<synthetic pointer>)
    at /usr/include/boost/smart_ptr/shared_ptr.hpp:726
#11 callback (source_size=<optimized out>, source=0x56428ecb634c "\203Xd", 
    trans_id=0x56428ecb6338 "\036\341H\313K\242\031\353\361\021\273s\273\221\026\200\033", priority=0 '\000', timeout=3357, request_size=9, 
    request=0x56428ecb6329, request_info_size=0, request_info=0x56428ecb6324, 
    pattern=0x56428ecb6306 "/tests/http_req/c.xml/get", 
    name=0x56428ecb62e8 "/tests/http_req/c.xml/get", command=2, 
    api=0x7ffee20eb2d0) at cloudi.cpp:1425
#12 poll_request (api=0x7ffee20eb2d0, timeout=-1, external=<optimized out>)
    at cloudi.cpp:1788
#13 0x000056428dc6e6c8 in process_requests (p=<optimized out>) at main.c:108
#14 0x000056428dc6e29c in main (argc=<optimized out>, argv=<optimized out>)
    at main.c:125
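
The trace shows cloudi_return raising a C++ exception (frame #6, __cxa_throw) that the unwinder then has to walk back up the stack, once per request. A minimal sketch of that pattern is below, assuming an exception used purely for control flow. The names return_exception, handler_return and stress_unwinder are hypothetical and are not part of the CloudI API; the loop only imitates the rate of throws, not the service itself.

```cpp
#include <cstddef>
#include <exception>

// Hypothetical stand-in for an exception used only to unwind the stack
// back to the caller; this is not the CloudI API's real exception type.
struct return_exception : std::exception {
    const char* what() const noexcept override { return "return"; }
};

// Hypothetical handler that "returns" by throwing, so every call goes
// through __cxa_throw and the unwinder.
[[noreturn]] void handler_return() {
    throw return_exception();
}

// Performs `iterations` throw/catch round trips and reports how many
// exceptions were caught.
std::size_t stress_unwinder(std::size_t iterations) {
    std::size_t caught = 0;
    for (std::size_t i = 0; i < iterations; ++i) {
        try {
            handler_return();
        } catch (const return_exception&) {
            ++caught;
        }
    }
    return caught;
}
```

Calling stress_unwinder with a large count from several processes at once approximates "under load". Whether this alone reproduces the segfault is not established here, since the report only confirms the failure with the full http_req_c setup.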

Based on the Ubuntu 20.04 bug information, it should be possible to avoid the libunwind bug by using libunwind version 1.6.2 (currently used in Arch Linux). Three different unwinding libraries are typically used: libgcc, llvm-libunwind and nongnu-libunwind [1]. This problem relates to nongnu-libunwind.

Switching the libunwind used by the C++ stdlib is possible with clang's -unwindlib command-line argument. Ubuntu releases after 20.04 provide the llvm-libunwind library as a package. GCC doesn't appear to have an option for switching the unwinder used by libstdc++ (through libgcc), so making the switch with libstdc++ would require using clang as the compiler.

[1] Let Me Unwind That For You: Exceptions to Backward-Edge Protection, Duta, V.; Freyer, F.; Pagani, F.; Muench, M.; and Giuffrida, C. In NDSS, February 2023. Intel Bounty Reward, page 3

It is possible to avoid this segfault by using clang instead of gcc for the compilation. With either libunwind (LLVM unwinder) or libgcc (GCC unwinder) as the clang -unwindlib value at link time, the resulting http_req_c binary does not segfault when throwing C++ exceptions rapidly. The compilations used are shown below, from usage on Ubuntu 20.04 LTS (with its packages):

-unwindlib=libunwind usage below:

export CXX='clang++-10'
export CC='clang-10'
export CXXFLAGS='-I/usr/include/c++/9 -I/usr/include/x86_64-linux-gnu/c++/9'
export LDFLAGS='-rtlib=compiler-rt -unwindlib=libunwind -L/usr/lib/gcc/x86_64-linux-gnu/9'
./configure
make
# ... install and other commands as done previously

-unwindlib=libgcc usage below:

export CXX='clang++-10'
export CC='clang-10'
export CXXFLAGS='-I/usr/include/c++/9 -I/usr/include/x86_64-linux-gnu/c++/9'
export LDFLAGS='-unwindlib=libgcc -L/usr/lib/gcc/x86_64-linux-gnu/9'
./configure
make
# ... install and other commands as done previously

The same nongnu-libunwind library (version 1.2.1) is linked in both compilations for C++ backtrace support, though the execution is able to avoid using nongnu-libunwind for exception handling, which avoids the segfault under load.

The segfault also didn't occur with a gcc build on Ubuntu 22.04 LTS in a VirtualBox VM, which uses nongnu-libunwind version 1.3.2.

Hi, I am the bug submitter.

Note that compilers (e.g. gcc) provide their own internal unwinder. It’s like malloc - you can provide and link your own, and then it replaces the default one. Same with libunwind, which overrides the unwinder routines that __cxa_throw calls (frame #5 in the stack trace above); gcc provides an internal one if you do not link libunwind.

In my case, libunwind was linked in transitively because glog specified it as a dependency. However, it is entirely optional, and glog can also use gcc’s internal unwinder to read backtraces. I had to manually purge (in CMake) libunwind from the link list of my application and dynamic library targets. Then the bug is avoided because gcc’s internal unwinder is used.

@okeuday to summarize: your title is a bit wrong, gcc’s libstdc++ per se does not depend on libunwind, some other 3rd party is pulling it in (it was glog for me) and you should kick it out from being linked in. libunwind is not essential and this version just causes problems.

And my condolences for running into this :) it is truly awful
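
One way to check whether a dependency has pulled libunwind into a running process on Linux is to look at the process's own memory mappings. The sketch below assumes the usual /proc/self/maps line layout (address range, permissions, offset, device, inode, optional path). The function name mapped_files_matching is hypothetical.

```cpp
#include <istream>
#include <set>
#include <sstream>
#include <string>

// Returns the distinct mapped file paths that contain `needle`, reading
// text in the /proc/<pid>/maps format from `maps`.
std::set<std::string> mapped_files_matching(std::istream& maps,
                                            const std::string& needle) {
    std::set<std::string> matches;
    std::string line;
    while (std::getline(maps, line)) {
        std::istringstream fields(line);
        std::string range, perms, offset, dev, inode, path;
        // The first five columns are always present; skip malformed lines.
        if (!(fields >> range >> perms >> offset >> dev >> inode))
            continue;
        // The rest of the line (if any) is the path of the mapped file.
        std::getline(fields >> std::ws, path);
        if (path.find(needle) != std::string::npos)
            matches.insert(path);
    }
    return matches;
}
```

Opening "/proc/self/maps" with std::ifstream and passing it in with the needle "libunwind" lists any libunwind shared objects loaded into the current process. An empty result suggests gcc's internal unwinder is the only one present. Running ldd on the binary gives similar information without code changes.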

@ojura Removing libunwind 1.2.1 from the compilation (with gcc/g++) does avoid the segfault. I had been hoping to keep it in the compilation because it had improved the backtrace information in the past. However, it looks like it is better to keep libunwind out of the compilation, to keep the situation simpler and reduce the number of potential problems (it helps to ensure future reliability of CloudI source code).

Just to be clear, libstdc++ uses the libgcc unwinder. The segfault occurs only when (nongnu) libunwind 1.2.1 is part of the compilation (and is linked to the executable) and gcc/g++ builds with optimization level -O1 or higher. I updated the title to show that libunwind was the problem. Thank you for providing the additional information.
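
Backtraces remain possible without linking libunwind, because the compiler's own unwinder exposes _Unwind_Backtrace through <unwind.h>. The sketch below is an illustration of that interface, not the approach CloudI adopted. The names trace_state, record_frame and capture_backtrace are hypothetical, and which library actually supplies _Unwind_Backtrace still depends on what is linked.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>
#include <unwind.h>

namespace {

// State passed to the per-frame callback.
struct trace_state {
    std::vector<std::uintptr_t>* frames;
    std::size_t limit;
};

// Called once per stack frame; stores the frame's instruction pointer
// and stops the walk once `limit` frames have been recorded.
_Unwind_Reason_Code record_frame(struct _Unwind_Context* context, void* arg) {
    trace_state* state = static_cast<trace_state*>(arg);
    if (state->frames->size() >= state->limit)
        return _URC_END_OF_STACK;
    state->frames->push_back(
        static_cast<std::uintptr_t>(_Unwind_GetIP(context)));
    return _URC_NO_REASON;
}

} // namespace

// Captures at most `limit` return addresses from the current call stack.
std::vector<std::uintptr_t> capture_backtrace(std::size_t limit) {
    std::vector<std::uintptr_t> frames;
    trace_state state{&frames, limit};
    _Unwind_Backtrace(record_frame, &state);
    return frames;
}
```

The addresses are raw instruction pointers. Turning them into function names needs a separate symbolization step (for example addr2line on the binary), which is the part libunwind had been improving.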