[BUG] TSAN issues -- unsynchronized reads
BenFrantzDale opened this issue · comments
Describe the bug
g++ -fsanitize=thread -Iinclude ./tests/BS_thread_pool_test.cpp -std=c++17 -O3 -Wall -Wextra -Wconversion -Wsign-conversion -Wpedantic -Weffc++ -Wshadow -pthread -o BS_thread_pool_test && TSAN_OPTIONS="halt_on_error=1" ./BS_thread_pool_test
Output:
...
WARNING: ThreadSanitizer: data race (pid=34544)
Read of size 8 at 0x7ffe64bfb3c8 by thread T73:
#0 BS::thread_pool::worker(unsigned int, std::function<void ()> const&) <null> (BS_thread_pool_test+0x4ae81) (BuildId: e41652891611ed6c54197009524a2c7d2dd95798)
#1 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (BS::thread_pool::*)(unsigned int, std::function<void ()> const&), BS::thread_pool*, unsigned int, std::function<void ()> > > >::_M_run() <null> (BS_thread_pool_test+0x44397) (BuildId: e41652891611ed6c54197009524a2c7d2dd95798)
#2 <null> <null> (libstdc++.so.6+0xecdb3) (BuildId: ca77dae775ec87540acd7218fa990c40d1c94ab1)
Previous write of size 8 at 0x7ffe64bfb3c8 by thread T68 (mutexes: write M0):
#0 BS::thread_pool::worker(unsigned int, std::function<void ()> const&) <null> (BS_thread_pool_test+0x4a3d6) (BuildId: e41652891611ed6c54197009524a2c7d2dd95798)
#1 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (BS::thread_pool::*)(unsigned int, std::function<void ()> const&), BS::thread_pool*, unsigned int, std::function<void ()> > > >::_M_run() <null> (BS_thread_pool_test+0x44397) (BuildId: e41652891611ed6c54197009524a2c7d2dd95798)
#2 <null> <null> (libstdc++.so.6+0xecdb3) (BuildId: ca77dae775ec87540acd7218fa990c40d1c94ab1)
Location is stack of main thread.
Location is global '<null>' at 0x000000000000 ([stack]+0x1e3c8)
Mutex M0 (0x7ffe64bfb3d0) created at:
#0 pthread_mutex_lock ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1341 (libtsan.so.2+0x59a13) (BuildId: 38097064631f7912bd33117a9c83d08b42e15571)
#1 BS::thread_pool::thread_pool(unsigned int, std::function<void ()> const&) <null> (BS_thread_pool_test+0x49650) (B
...
In void worker(const concurrency_t idx, const std::function<void()>& init_task), there's a data race:
    std::unique_lock tasks_lock(tasks_mutex);
    while (true)
    {
        --tasks_running;
        tasks_lock.unlock();
        // Right here, we read waiting, tasks_running, and worst of all,
        // BS_THREAD_POOL_PAUSED_OR_EMPTY expands to code that includes tasks.empty().
        if (waiting && (tasks_running == 0) && BS_THREAD_POOL_PAUSED_OR_EMPTY)
            tasks_done_cv.notify_all();
        tasks_lock.lock();
        task_available_cv.wait(tasks_lock,
            [this]
            {
                return !BS_THREAD_POOL_PAUSED_OR_EMPTY || !workers_running;
            });
        if (!workers_running)
            break;
        {
    #ifdef BS_THREAD_POOL_ENABLE_PRIORITY
            const std::function<void()> task = std::move(std::remove_const_t<pr_task&>(tasks.top()).task);
            tasks.pop();
    #else
            const std::function<void()> task = std::move(tasks.front());
            tasks.pop();
    #endif
            ++tasks_running;
            tasks_lock.unlock();
            task();
        }
        tasks_lock.lock();
    }
Minimal working example
Build with ThreadSanitizer (-fsanitize=thread, as in the command above), and it trips.
Behavior
ThreadSanitizer should be happy.
System information
- x86
- Linux
Additional information
In C++, reads of non-atomic data that another thread may be writing must be synchronized, for example by holding the mutex. There's a chance that the resulting undefined behavior will "work", but that's still dicey. Worse is tasks.empty(), which potentially reads two pointers and compares them. Even if each of those reads were atomic, the pair would not be: another thread could modify the container between the two reads.
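To illustrate the point about consistent reads, here is a minimal sketch in which every read of the shared state goes through the same mutex as the writes. The class pool_state, its members, and the helper drain_and_check are hypothetical names made up for this example; they are not part of BS::thread_pool.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for the pool's shared state; not the library's actual class.
class pool_state
{
public:
    void push(const int task)
    {
        const std::scoped_lock lock(state_mutex);
        tasks.push(task);
    }

    // Take one task and mark it as running; returns false if the queue is empty.
    bool start_one()
    {
        const std::scoped_lock lock(state_mutex);
        if (tasks.empty())
            return false;
        tasks.pop();
        ++tasks_running;
        return true;
    }

    void finish_one()
    {
        const std::scoped_lock lock(state_mutex);
        assert(tasks_running > 0);
        --tasks_running;
    }

    // Both reads happen under the same lock as the writes, so the counter
    // and the queue are observed together as one consistent snapshot.
    bool idle() const
    {
        const std::scoped_lock lock(state_mutex);
        return tasks_running == 0 && tasks.empty();
    }

private:
    mutable std::mutex state_mutex;
    std::queue<int> tasks;
    std::size_t tasks_running = 0;
};

// Drain `count` tasks using `num_threads` threads, then report whether the state is idle.
bool drain_and_check(const std::size_t num_threads, const int count)
{
    pool_state state;
    for (int i = 0; i < count; ++i)
        state.push(i);
    std::vector<std::thread> threads;
    for (std::size_t t = 0; t < num_threads; ++t)
        threads.emplace_back([&state] { while (state.start_one()) state.finish_one(); });
    for (std::thread& th : threads)
        th.join();
    return state.idle();
}
```

Because idle() never touches tasks or tasks_running without the lock, a build with -fsanitize=thread should have nothing to report for this pattern.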
Solution: Just don't unlock the mutex:
    std::unique_lock tasks_lock(tasks_mutex);
    while (true)
    {
        --tasks_running;
        if (waiting && (tasks_running == 0) && BS_THREAD_POOL_PAUSED_OR_EMPTY)
        {
            tasks_done_cv.notify_all();
        }
That will wake up the waiting threads just to have them block on the mutex, but the original code most likely does the same, since it notifies and then immediately re-locks.
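For reference, the proposed approach can be sketched as a complete, much-reduced pool. This is only an illustration under stated assumptions: mini_pool and run_counting_tasks are hypothetical names, there is no pausing or priority support, wait() is assumed to be called from a single thread, and this is not the v5.0.0 implementation, which has not been published at this point in the thread.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical miniature pool; member names mirror the issue, but this is not BS::thread_pool.
class mini_pool
{
public:
    explicit mini_pool(const std::size_t num_threads)
    {
        for (std::size_t i = 0; i < num_threads; ++i)
            threads.emplace_back([this] { worker(); });
    }

    ~mini_pool()
    {
        {
            const std::scoped_lock lock(tasks_mutex);
            workers_running = false;
        }
        task_available_cv.notify_all();
        for (std::thread& t : threads)
            t.join();
    }

    void submit(std::function<void()> task)
    {
        {
            const std::scoped_lock lock(tasks_mutex);
            tasks.push(std::move(task));
        }
        task_available_cv.notify_one();
    }

    // Block until the queue is empty and no task is executing (single caller assumed).
    void wait()
    {
        std::unique_lock lock(tasks_mutex);
        waiting = true;
        tasks_done_cv.wait(lock, [this] { return tasks_running == 0 && tasks.empty(); });
        waiting = false;
    }

private:
    void worker()
    {
        std::unique_lock lock(tasks_mutex);
        while (true)
        {
            // The mutex is held here, so all three reads see one consistent state.
            assert(lock.owns_lock());
            if (waiting && tasks_running == 0 && tasks.empty())
                tasks_done_cv.notify_all();
            task_available_cv.wait(lock, [this] { return !tasks.empty() || !workers_running; });
            if (!workers_running)
                break;
            std::function<void()> task = std::move(tasks.front());
            tasks.pop();
            ++tasks_running;
            lock.unlock();
            task(); // Run the task without holding the mutex.
            lock.lock();
            --tasks_running;
        }
    }

    std::mutex tasks_mutex;
    std::condition_variable task_available_cv;
    std::condition_variable tasks_done_cv;
    std::queue<std::function<void()>> tasks;
    std::size_t tasks_running = 0;
    bool waiting = false;
    bool workers_running = true;
    std::vector<std::thread> threads;
};

// Usage: submit `num_tasks` increments, wait for them, and return the final count.
std::size_t run_counting_tasks(const std::size_t num_threads, const std::size_t num_tasks)
{
    std::atomic<std::size_t> counter{0};
    mini_pool pool(num_threads);
    for (std::size_t i = 0; i < num_tasks; ++i)
        pool.submit([&counter] { ++counter; });
    pool.wait();
    return counter.load();
}
```

The only place the mutex is released is around task() itself; the completion check and the notification both happen while it is held, which is the change suggested above.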
Hi @BenFrantzDale and thanks for opening this issue! This issue is in fact already fixed in v5.0.0, which will be released in the next few days. In this version I made many changes, added new features, and fixed some bugs, including this one. The code is already finished, I'm just updating the documentation, and will release the new version as soon as I'm done.
@bshoshany that's great. Is that branch pushed? The issue seems to be in the master branch right now...
I usually develop locally and only upload the finished product to GitHub. It will be up in a few days :)