vcmap__concurrent_put_get_remove randomly fails
kilobyte opened this issue
Environment Information
| Name | Version |
|---|---|
| pmemkv version(s) | 1.5.0 |
| libpmemobj-cpp version(s) | 1.13.0 |
| PMDK (libpmem/libpmemobj) version(s) | 1.11.1 |
| OS(es) version(s) | Debian unstable |
| kernel version(s) | 5.10 |
| TBB version(s) | |
| memkind version(s) | 1.12.0 |
| ndctl version(s) | 71.1 |
Please provide a reproduction of the bug:
On the Reproducible Builds CI, this test fails fairly often. On my home box, it has never failed despite many, many tries.
Link: https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/pmemkv.html (the page is live, so it may show a passing run if the CI has rebuilt the package in the meantime).
-- Executing: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params vcmap;{"path":"/build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/test/vcmap__concurrent_put_get_remove_single_op_params__default_1000_0_none","size":125829120};1000
-- Test vcmap__concurrent_put_get_remove_single_op_params__default_1000_0_none:
-- Stdout:
Signal: Aborted, backtrace:
Signal: Aborted, backtrace:
Signal: Aborted, backtrace:
Signal: Aborted, backtrace:
Signal: Aborted, backtrace:
0: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (test_sighandler+0x23) [0x555c93c48633]
Signal: Aborted, backtrace:
0: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (test_sighandler+0x23) [0x555c93c48633]
0: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (test_sighandler+0x23) [0x555c93c48633]
0: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (test_sighandler+0x23) [0x555c93c48633]
0: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (test_sighandler+0x23) [0x555c93c48633]
0: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (test_sighandler+0x23) [0x555c93c48633]
Signal: Aborted, backtrace:
0: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (test_sighandler+0x23) [0x555c93c48633]
1: /lib/x86_64-linux-gnu/libc.so.6 (killpg+0x40) [0x7f18eed6c94f]
1: /lib/x86_64-linux-gnu/libc.so.6 (killpg+0x40) [0x7f18eed6c94f]
1: /lib/x86_64-linux-gnu/libc.so.6 (killpg+0x40) [0x7f18eed6c94f]
1: /lib/x86_64-linux-gnu/libc.so.6 (killpg+0x40) [0x7f18eed6c94f]
1: /lib/x86_64-linux-gnu/libc.so.6 (killpg+0x40) [0x7f18eed6c94f]
2: /lib/x86_64-linux-gnu/libc.so.6 (gsignal+0x141) [0x7f18eed6c891]
1: /lib/x86_64-linux-gnu/libc.so.6 (killpg+0x40) [0x7f18eed6c94f]
1: /lib/x86_64-linux-gnu/libc.so.6 (killpg+0x40) [0x7f18eed6c94f]
2: /lib/x86_64-linux-gnu/libc.so.6 (gsignal+0x141) [0x7f18eed6c891]
2: /lib/x86_64-linux-gnu/libc.so.6 (gsignal+0x141) [0x7f18eed6c891]
2: /lib/x86_64-linux-gnu/libc.so.6 (gsignal+0x141) [0x7f18eed6c891]
2: /lib/x86_64-linux-gnu/libc.so.6 (gsignal+0x141) [0x7f18eed6c891]
2: /lib/x86_64-linux-gnu/libc.so.6 (gsignal+0x141) [0x7f18eed6c891]
2: /lib/x86_64-linux-gnu/libc.so.6 (gsignal+0x141) [0x7f18eed6c891]
3: /lib/x86_64-linux-gnu/libc.so.6 (abort+0x112) [0x7f18eed56536]
4: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (UT_FATAL+0xc1) [0x555c93c44d81]
5: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (_ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZL25MultithreadedPutAndRemovemRN4pmem2kv2dbEEUlmE_mEEEEE6_M_runEv+0xbc) [0x555c93c44f4c]
6: /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (_ZNKSt10error_code23default_error_conditionEv+0x34) [0x7f18eefd3914]
7: /lib/x86_64-linux-gnu/libpthread.so.0 (start_thread+0xe0) [0x7f18ef1b8d80]
3: /lib/x86_64-linux-gnu/libc.so.6 (abort+0x112) [0x7f18eed56536]
4: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (UT_FATAL+0xc1) [0x555c93c44d81]
5: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (_ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZL25MultithreadedPutAndRemovemRN4pmem2kv2dbEEUlmE_mEEEEE6_M_runEv+0xbc) [0x555c93c44f4c]
Signal: Aborted, backtrace:
3: /lib/x86_64-linux-gnu/libc.so.6 (abort+0x112) [0x7f18eed56536]
Signal: Aborted, backtrace:
3: /lib/x86_64-linux-gnu/libc.so.6 (abort+0x112) [0x7f18eed56536]
0: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (test_sighandler+0x23) [0x555c93c48633]
0: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (test_sighandler+0x23) [0x555c93c48633]
4: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (UT_FATAL+0xc1) [0x555c93c44d81]
5: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (_ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZL25MultithreadedPutAndRemovemRN4pmem2kv2dbEEUlmE_mEEEEE6_M_runEv+0xbc) [0x555c93c44f4c]
1: /lib/x86_64-linux-gnu/libc.so.6 (killpg+0x40) [0x7f18eed6c94f]
4: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (UT_FATAL+0xc1) [0x555c93c44d81]
8: /lib/x86_64-linux-gnu/libc.so.6 (clone+0x3f) [0x7f18eee2cba8]
3: /lib/x86_64-linux-gnu/libc.so.6 (abort+0x112) [0x7f18eed56536]
3: /lib/x86_64-linux-gnu/libc.so.6 (abort+0x112) [0x7f18eed56536]
6: /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (_ZNKSt10error_code23default_error_conditionEv+0x34) [0x7f18eefd3914]
4: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (UT_FATAL+0xc1) [0x555c93c44d81]
5: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (_ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZL25MultithreadedPutAndRemovemRN4pmem2kv2dbEEUlmE_mEEEEE6_M_runEv+0xbc) [0x555c93c44f4c]
3: /lib/x86_64-linux-gnu/libc.so.6 (abort+0x112) [0x7f18eed56536]
4: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (UT_FATAL+0xc1) [0x555c93c44d81]
5: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (_ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZL25MultithreadedPutAndRemovemRN4pmem2kv2dbEEUlmE_mEEEEE6_M_runEv+0xbc) [0x555c93c44f4c]
5: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (_ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZL25MultithreadedPutAndRemovemRN4pmem2kv2dbEEUlmE_mEEEEE6_M_runEv+0xbc) [0x555c93c44f4c]
6: /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (_ZNKSt10error_code23default_error_conditionEv+0x34) [0x7f18eefd3914]
Signal: Aborted, backtrace:
0: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (test_sighandler+0x23) [0x555c93c48633]
4: /build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params (UT_FATAL+0xc1) [0x555c93c44d81]
7: /lib/x86_64-linux-gnu/libpthread.so.0 (start_thread+0xe0) [0x7f18ef1b8d80]
8: /lib/x86_64-linux-gnu/libc.so.6 (clone+0x3f) [0x7f18eee2cba8]
-- Stderr:
./tests/engine_scenarios/concurrent/put_get_remove_single_op_params.cc:69 operator() - assertion failure: kv.put(uint64_to_strv(keys[thread_id]), uint64_to_strv(keys[thread_id])) (0x8) == status::OK (0x0), errormsg: [pmemkv_put] std::bad_alloc
./tests/engine_scenarios/concurrent/put_get_remove_single_op_params.cc:69 operator() - assertion failure: kv.put(uint64_to_strv(keys[thread_id]), uint64_to_strv(keys[thread_id])) (0x8) == status::OK (0x0), errormsg: [pmemkv_put] std::bad_alloc
./tests/engine_scenarios/concurrent/put_get_remove_single_op_params.cc:69 operator() - assertion failure: kv.put(uint64_to_strv(keys[thread_id]), uint64_to_strv(keys[thread_id])) (0x8) == status::OK (0x0), errormsg: [pmemkv_put] std::bad_alloc
./tests/engine_scenarios/concurrent/put_get_remove_single_op_params.cc:69 operator() - assertion failure: kv.put(uint64_to_strv(keys[thread_id]), uint64_to_strv(keys[thread_id])) (0x8) == status::OK (0x0), errormsg: [pmemkv_put] std::bad_alloc
./tests/engine_scenarios/concurrent/put_get_remove_single_op_params.cc:69 operator() - assertion failure: kv.put(uint64_to_strv(keys[thread_id]), uint64_to_strv(keys[thread_id])) (0x8) == status::OK (0x0), errormsg: [pmemkv_put] std::bad_alloc
./tests/engine_scenarios/concurrent/put_get_remove_single_op_params.cc:69 operator() - assertion failure: kv.put(uint64_to_strv(keys[thread_id]), uint64_to_strv(keys[thread_id])) (0x8) == status::OK (0x0), errormsg: [pmemkv_put] std::bad_alloc
./tests/engine_scenarios/concurrent/put_get_remove_single_op_params.cc:69 operator() - assertion failure: kv.put(uint64_to_strv(keys[thread_id]), uint64_to_strv(keys[thread_id])) (0x8) == status::OK (0x0), errormsg: [pmemkv_put] std::bad_alloc
./tests/engine_scenarios/concurrent/put_get_remove_single_op_params.cc:69 operator() - assertion failure: kv.put(uint64_to_strv(keys[thread_id]), uint64_to_strv(keys[thread_id])) (0x8) == status::OK (0x0), errormsg: [pmemkv_put] std::bad_alloc
./tests/engine_scenarios/concurrent/put_get_remove_single_op_params.cc:69 operator() - assertion failure: kv.put(uint64_to_strv(keys[thread_id]), uint64_to_strv(keys[thread_id])) (0x8) == status::OK (0x0), errormsg: [pmemkv_put] std::bad_alloc
./tests/engine_scenarios/concurrent/put_get_remove_single_op_params.cc:69 operator() - assertion failure: kv.put(uint64_to_strv(keys[thread_id]), uint64_to_strv(keys[thread_id])) (0x8) == status::OK (0x0), errormsg: [pmemkv_put] std::bad_alloc
CMake Error at /build/1st/pmemkv-1.5.0/tests/helpers.cmake:185 (message):
/build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/tests/concurrent_put_get_remove_single_op_params vcmap;{"path":"/build/1st/pmemkv-1.5.0/obj-x86_64-linux-gnu/test/vcmap__concurrent_put_get_remove_single_op_params__default_1000_0_none","size":125829120};1000 failed: 134
Call Stack (most recent call first):
/build/1st/pmemkv-1.5.0/tests/helpers.cmake:226 (execute_common)
/build/1st/pmemkv-1.5.0/tests/engines/memkind_based/default.cmake:9 (execute)
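To make the log easier to read: the stderr lines show many worker threads each asserting that `put` returned `status::OK` and instead getting `0x8` with `std::bad_alloc`, which is why the backtraces are interleaved. The sketch below is not the pmemkv test and does not use the pmemkv API. `bounded_store`, its byte budget, and `count_failing_workers` are hypothetical stand-ins, and the name `OUT_OF_MEMORY` for the value `0x8` is an assumption made for illustration. It only shows the general pattern of several threads hitting an allocation limit at about the same time.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical status codes: only the two values seen in the log are modelled.
enum class status : int { OK = 0, OUT_OF_MEMORY = 8 };

// Hypothetical stand-in for a volatile engine with a fixed memory budget.
class bounded_store {
public:
    explicit bounded_store(std::size_t budget_bytes) : budget_(budget_bytes) {}

    status put(const std::string &key, const std::string &value) {
        std::lock_guard<std::mutex> lock(mtx_);
        const std::size_t cost = key.size() + value.size();
        if (used_ + cost > budget_)
            return status::OUT_OF_MEMORY; // plays the role of std::bad_alloc
        // Only newly inserted keys are charged against the budget.
        if (data_.emplace(key, value).second)
            used_ += cost;
        return status::OK;
    }

private:
    std::mutex mtx_;
    std::unordered_map<std::string, std::string> data_;
    std::size_t budget_;
    std::size_t used_ = 0;
};

// Runs `threads` workers, each inserting `per_thread` unique keys, and returns
// how many workers saw a failed put. The real test aborts on the first failure;
// here each worker just records it and stops.
std::size_t count_failing_workers(bounded_store &kv, std::size_t threads,
                                  std::size_t per_thread) {
    assert(threads > 0);
    std::vector<int> failed(threads, 0); // one slot per worker, never shared
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < threads; ++t) {
        workers.emplace_back([&kv, &failed, t, per_thread] {
            for (std::size_t i = 0; i < per_thread; ++i) {
                const std::string key = std::to_string(t * per_thread + i);
                if (kv.put(key, key) != status::OK) {
                    failed[t] = 1;
                    return;
                }
            }
        });
    }
    for (auto &w : workers)
        w.join();
    std::size_t n = 0;
    for (int f : failed)
        n += static_cast<std::size_t>(f);
    return n;
}
```

With a generous budget no worker fails. With a budget that is too small, several workers can fail in the same run, which matches the shape of the log above, though it says nothing about why the real allocator ran out of memory.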
@kilobyte, how many cores (as reported by `nproc`) and how much free memory are there on this machine?
Damned if I know... I can print that data, what commands would be useful in your opinion?
I believe `nproc` and `free -g` should be enough.
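If it is easier to print this from inside a test binary than to add shell commands to the CI job, the same kind of data can be collected programmatically. The following is a minimal, Linux-only sketch. `machine_info`, `collect_machine_info`, and `print_machine_info` are hypothetical names, not part of pmemkv. It also reports the space available on a filesystem path, on the assumption that this matters because the test is given a `path` and a `size`. Note that `sysconf(_SC_NPROCESSORS_ONLN)` counts online CPUs, while `nproc` also honours CPU affinity, so the two numbers can differ.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <sys/statvfs.h>
#include <sys/sysinfo.h>
#include <unistd.h>

// Hypothetical container for the values asked about in this thread.
struct machine_info {
    long cores = 0;              // online CPUs (may differ from nproc)
    std::uint64_t free_ram = 0;  // bytes, comparable to the "free" column of free
    std::uint64_t free_swap = 0; // bytes
    std::uint64_t fs_avail = 0;  // bytes available to unprivileged users on `path`
};

// Fills `out` for the machine and for the filesystem holding `path`.
// Returns false if any underlying call fails.
bool collect_machine_info(const char *path, machine_info &out) {
    assert(path != nullptr);
    struct sysinfo si;
    struct statvfs fs;
    if (sysinfo(&si) != 0 || statvfs(path, &fs) != 0)
        return false;
    out.cores = sysconf(_SC_NPROCESSORS_ONLN);
    out.free_ram = static_cast<std::uint64_t>(si.freeram) * si.mem_unit;
    out.free_swap = static_cast<std::uint64_t>(si.freeswap) * si.mem_unit;
    out.fs_avail = static_cast<std::uint64_t>(fs.f_bavail) * fs.f_frsize;
    return out.cores > 0;
}

// Prints the collected values in GiB.
void print_machine_info(const machine_info &mi) {
    const double gib = 1024.0 * 1024.0 * 1024.0;
    std::printf("cores:     %ld\n", mi.cores);
    std::printf("free RAM:  %.1f GiB\n", static_cast<double>(mi.free_ram) / gib);
    std::printf("free swap: %.1f GiB\n", static_cast<double>(mi.free_swap) / gib);
    std::printf("fs avail:  %.1f GiB\n", static_cast<double>(mi.fs_avail) / gib);
}
```

A caller would pass the directory that holds the test file, for example `machine_info mi; if (collect_machine_info(".", mi)) print_machine_info(mi);`.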
@kilobyte bump - can you pls provide the data I asked for - that'd be appreciated 😃
Can you also please update memkind to a more recent version (ideally 1.14.0)?
nproc is 15 in build1, 16 in build2; memory for both: 48G RAM + 200G swap; filesystem: tmpfs 24G
The results are random; currently both runs have succeeded.
I even see a comment on the RB page:
Comments: rpath issue fixed by -DCMAKE_BUILD_RPATH_USE_ORIGIN=ON, but unable to verify effect on test suite due to nondeterministic failures.
Thus others see random failures too.
... so I was about to report that the failure is gone, but it turns out all my recent test runs pass `-E 'vcmap__concurrent_put_get_remove_.*'`, which excludes these tests. Thus RB no longer failing is not as good as it seems.