mavam / libbf

Bloom filters for C++11

Home Page: http://mavam.github.io/libbf

Segmentation fault when running in thread

stmotw opened this issue

I'm trying to use libbf in a multithreaded environment, and it fails with a segmentation fault. I've managed to reproduce it with a minimal example in a Docker environment:

with_thread.cpp

#include "bf/bloom_filter/basic.hpp"
#include <iostream>
#include <thread>

void foo() {
  const bf::hasher h = bf::make_hasher(5, 0, true);
  const auto digests = h(bf::wrap(0));
  for (const auto &val : digests)
    std::cout << val << std::endl;
}

int main()
{
  std::cout << "foo() in main" << std::endl;
  foo();

  std::cout << "foo() in thread" << std::endl;
  std::thread first(foo);
  first.join();

  std::cout << "test completed" << std::endl;

  return 0;
}

Dockerfile

FROM alpine:3.11

COPY with_thread.cpp .

RUN apk update && apk add --no-cache nano mc g++ gdb valgrind git

RUN git clone https://github.com/mavam/libbf.git

RUN g++ with_thread.cpp -o with_thread -g libbf/src/*.cpp libbf/src/bloom_filter/*.cpp -I libbf

SHELL ["/bin/ash", "-c"]

Steps to reproduce

docker build --pull -f Dockerfile -t libbf_alpine .
docker run -it --rm libbf_alpine:latest /bin/ash
./with_thread

Result

foo() in main
0
0
0
0
0
foo() in thread
Segmentation fault

gdb backtrace

#0  0x00005586502aaac7 in bf::make_hasher (k=<error reading variable: Cannot access memory at address 0x7fad73d917e0>,
    seed=<error reading variable: Cannot access memory at address 0x7fad73d917d8>,
    double_hashing=<error reading variable: Cannot access memory at address 0x7fad73d917d4>) at libbf/src/hash.cpp:42
#1  0x00005586502a43ff in foo () at with_thread.cpp:6
#2  0x00005586502a5326 in std::__invoke_impl<void, void (*)()> (__f=@0x7fad73f62ae8: 0x5586502a43c9 <foo()>) at /usr/include/c++/9.2.0/bits/invoke.h:60
#3  0x00005586502a52ca in std::__invoke<void (*)()> (__fn=@0x7fad73f62ae8: 0x5586502a43c9 <foo()>) at /usr/include/c++/9.2.0/bits/invoke.h:95
#4  0x00005586502a5268 in std::thread::_Invoker<std::tuple<void (*)()> >::_M_invoke<0ul> (this=0x7fad73f62ae8) at /usr/include/c++/9.2.0/thread:244
#5  0x00005586502a5229 in std::thread::_Invoker<std::tuple<void (*)()> >::operator() (this=0x7fad73f62ae8) at /usr/include/c++/9.2.0/thread:251
#6  0x00005586502a51fe in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)()> > >::_M_run (this=0x7fad73f62ae0) at /usr/include/c++/9.2.0/thread:195
#7  0x00007fad73ea8045 in ?? () from /usr/lib/libstdc++.so.6
#8  0x00007fad73fb433e in ?? () from /lib/ld-musl-x86_64.so.1
#9  0x0000000000000000 in ?? ()

valgrind output

==45== Memcheck, a memory error detector
==45== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==45== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==45== Command: ./with_thread
==45==
foo() in main
0
0
0
0
0
foo() in thread
==45==
==45== Process terminating with default action of signal 11 (SIGSEGV)
==45==  Bad permissions for mapped region at address 0x4E567E8
==45==    at 0x110AC7: bf::make_hasher(unsigned long, unsigned long, bool) (hash.cpp:42)
==45==
==45== Process terminating with default action of signal 11 (SIGSEGV)
==45==  Bad permissions for mapped region at address 0x4E567A0
==45==    at 0x489613F: _vgnU_freeres (vg_preloaded.c:83)
==45==
==45== HEAP SUMMARY:
==45==     in use at exit: 74,027 bytes in 10 blocks
==45==   total heap usage: 15 allocs, 5 frees, 221,651 bytes allocated
==45==
==45== LEAK SUMMARY:
==45==    definitely lost: 0 bytes in 0 blocks
==45==    indirectly lost: 0 bytes in 0 blocks
==45==      possibly lost: 0 bytes in 0 blocks
==45==    still reachable: 74,027 bytes in 10 blocks
==45==         suppressed: 0 bytes in 0 blocks
==45== Rerun with --leak-check=full to see details of leaked memory
==45==
==45== For lists of detected and suppressed errors, rerun with: -s
==45== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Whoa. Soon after creating this issue, we found what caused it.

Alpine uses musl instead of glibc, and one of the differences between the two is the default thread stack size: https://wiki.musl-libc.org/functional-differences-from-glibc.html#Thread-stack-size

Apparently libbf needs more than the default 128 KiB of thread stack to run. In h3.hpp there is T bytes_[N][byte_range];, and the h3 instantiation uses T=size_t, N=36, byte_range=256, so bytes_ alone takes 8 * 36 * 256 = 73,728 bytes = 72 KiB of memory.

To resolve this issue we compiled with_thread.cpp with the thread stack size increased to 2 MiB (2097152 bytes), comparable to the much larger glibc defaults on Ubuntu-based systems:
g++ with_thread.cpp -o with_thread -g libbf/src/*.cpp libbf/src/bloom_filter/*.cpp -I libbf -Wl,-z,stack-size=2097152

Here -Wl,-z,stack-size=<size in bytes> sets the default thread stack size at link time.

I hope this helps someone. This issue can now be closed.