versatica / mediasoup

Cutting Edge WebRTC Video Conferencing

Home Page: https://mediasoup.org

Mediasoup worker died, exiting in 2 seconds...

miroslavpejic85 opened this issue

Bug Report

System Information and Environment:

  • Operating System: Ubuntu 22.04.4 LTS (GNU/Linux 5.15.0-105-generic x86_64)
  • Mediasoup Version: 3.14.5
  • Mediasoup Client Version: 3.7.7
  • Compiler: gcc/g++/c++ 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
  • Node.js Version: v18.17.1
  • Npm Version: 10.7.0
  • Python Version: 3.10.12
  • Docker: 24.0.5, build 24.0.5-0ubuntu1~22.04.1
  • Docker-compose: v2.27.0

For reference: mediasoup.discourse.group

Issue Description:

Since upgrading to Mediasoup version 3.14.5, our system has seen Mediasoup workers terminate unexpectedly and frequently. Each crash is reported with the message "Mediasoup worker died, exiting in 2 seconds...".

Troubleshooting Steps:

Following the documentation, we obtained a core dump of the worker, which will be provided as an attachment for further analysis.

(gdb) bt
#0  0x000055604c64eb58 in RTC::TransportTuple::GetProtocol (this=0x556518627283) at ../../../include/RTC/TransportTuple.hpp:92
#1  0x000055604c766313 in RTC::WebRtcTransport::OnIceServerTupleRemoved (this=0x55604e65a2b0, tuple=0x556518627283)
    at ../../../src/RTC/WebRtcTransport.cpp:1183
#2  0x000055604c64ef97 in RTC::IceServer::OnTimer (this=0x55604e615820, timer=0x55604e67fa90) at ../../../src/RTC/IceServer.cpp:935
#3  0x000055604c5ff52e in TimerHandle::OnUvTimer (this=0x55604e67fa90) at ../../../src/handles/TimerHandle.cpp:162
#4  0x000055604c5fe87b in onTimer (handle=0x55604e692230) at ../../../src/handles/TimerHandle.cpp:13
#5  0x000055604cb49005 in uv__run_timers (loop=0x55604e53bff0) at ../../../subprojects/libuv-v1.48.0/src/timer.c:193
#6  0x000055604cb4ec72 in uv_run (loop=0x55604e53bff0, mode=UV_RUN_DEFAULT) at ../../../subprojects/libuv-v1.48.0/src/unix/core.c:466
#7  0x000055604c5cda83 in DepLibUV::RunLoop () at ../../../src/DepLibUV.cpp:98
#8  0x000055604c5e06ed in Worker::Worker (this=0x7fff8c7dd100, channel=0x55604e53c5d0) at ../../../src/Worker.cpp:56
#9  0x000055604c5c3414 in mediasoup_worker_run (argc=16, argv=0x7fff8c7dd348, version=0x7fff8c7dd200 "3.14.5", consumerChannelFd=3, producerChannelFd=4,
    channelReadFn=0x0, channelReadCtx=0x0, channelWriteFn=0x0, channelWriteCtx=0x0) at ../../../src/lib.cpp:142
#10 0x000055604c80ed5f in main (argc=16, argv=0x7fff8c7dd348) at ../../../src/main.cpp:25
(gdb) bt full
#0  0x000055604c64eb58 in RTC::TransportTuple::GetProtocol (this=0x556518627283) at ../../../include/RTC/TransportTuple.hpp:92
No locals.
#1  0x000055604c766313 in RTC::WebRtcTransport::OnIceServerTupleRemoved (this=0x55604e65a2b0, tuple=0x556518627283)
    at ../../../src/RTC/WebRtcTransport.cpp:1183
No locals.
#2  0x000055604c64ef97 in RTC::IceServer::OnTimer (this=0x55604e615820, timer=0x55604e67fa90) at ../../../src/RTC/IceServer.cpp:935
        storedTuple = 0x556518627283
        it = <error reading variable: Cannot access memory at address 0x556518627283>
        __for_range = std::__cxx11::list = {[0] = {hash = 15945316816845144064, udpSocket = 0x55604e6c2030, udpRemoteAddr = 0x55604e6d24a0,
            tcpConnection = 0x0, localAnnouncedAddress = "", udpRemoteAddrStorage = {ss_family = 2,
              __ss_padding = "\335I%)\273\036", '\000' <repeats 111 times>, __ss_align = 0}, protocol = RTC::TransportTuple::Protocol::UDP}}
        __for_begin = <error reading variable: Cannot access memory at address 0x556518627273>
        __for_end = {hash = 1, udpSocket = 0x55604e6d2460, udpRemoteAddr = 0x55604e67fa90, tcpConnection = 0x0, localAnnouncedAddress = <error: Cannot access memory at address 0xe0>, udpRemoteAddrStorage = {ss_family = 1, __ss_padding = "\000\000\000\000\000\000xt\000\000\000\000\000\000\000\004\000\000\000\000\000\000\340YaN`U\000\000\003\000\000\000\000\000\000\000y\000\000\000\000\000\000\000", '\377' <repeats 16 times>, "\003\000\000\000\002\000\000\000\001", '\000' <repeats 31 times>, "a\000\000\000\000\000\000\000\260\326^N`U\000", __ss_align = 0}, protocol = (unknown: 0x80)}
        __FUNCTION__ = "OnTimer"
#3  0x000055604c5ff52e in TimerHandle::OnUvTimer (this=0x55604e67fa90) at ../../../src/handles/TimerHandle.cpp:162
No locals.
#4  0x000055604c5fe87b in onTimer (handle=0x55604e692230) at ../../../src/handles/TimerHandle.cpp:13
No locals.
#5  0x000055604cb49005 in uv__run_timers (loop=0x55604e53bff0) at ../../../subprojects/libuv-v1.48.0/src/timer.c:193
        heap_node = 0x55604e5fec88
        handle = 0x55604e692230
        queue_node = 0x55604e692298
        ready_queue = {next = 0x55604e629318, prev = 0x55604e721bd8}
#6  0x000055604cb4ec72 in uv_run (loop=0x55604e53bff0, mode=UV_RUN_DEFAULT) at ../../../subprojects/libuv-v1.48.0/src/unix/core.c:466
        timeout = 1
        r = 0
        can_sleep = 1
#7  0x000055604c5cda83 in DepLibUV::RunLoop () at ../../../src/DepLibUV.cpp:98
        __FUNCTION__ = "RunLoop"
        ret = 21856
#8  0x000055604c5e06ed in Worker::Worker (this=0x7fff8c7dd100, channel=0x55604e53c5d0) at ../../../src/Worker.cpp:56
No locals.
#9  0x000055604c5c3414 in mediasoup_worker_run (argc=16, argv=0x7fff8c7dd348, version=0x7fff8c7dd200 "3.14.5", consumerChannelFd=3, producerChannelFd=4, channelReadFn=0x0, channelReadCtx=0x0, channelWriteFn=0x0, channelWriteCtx=0x0) at ../../../src/lib.cpp:142
        worker = {<Channel::ChannelSocket::Listener> = {<Channel::ChannelSocket::RequestHandler> = {_vptr.RequestHandler = 0x55604cf6bb10 <vtable for Worker+16>}, <Channel::ChannelSocket::NotificationHandler> = {
              _vptr.NotificationHandler = 0x55604cf6bb58 <vtable for Worker+88>}, <No data fields>}, <SignalHandle::Listener> = {_vptr.Listener = 0x55604cf6bb80 <vtable for Worker+128>}, <RTC::Router::Listener> = {
            _vptr.Listener = 0x55604cf6bba8 <vtable for Worker+168>}, channel = 0x55604e53c5d0, signalHandle = 0x55604e5b3920, shared = 0x55604e5b2be0,
          mapWebRtcServers = {<absl::lts_20230802::container_internal::raw_hash_map<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, RTC::WebRtcServer*>, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::WebRtcServer*> > >> = {<absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, RTC::WebRtcServer*>, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::WebRtcServer*> > >> = {
                settings_ = {<absl::lts_20230802::container_internal::internal_compressed_tuple::CompressedTupleImpl<absl::lts_20230802::container_internal::CompressedTuple<absl::lts_20230802::container_internal::CommonFields, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::WebRtcServer*> > >, absl::lts_20230802::integer_sequence<unsigned long, 0, 1, 2, 3>, true>> = {<absl::lts_20230802::container_internal::internal_compressed_tuple::uses_inheritance> = {<No data fields>}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<absl::lts_20230802::container_internal::CommonFields, 0, false>> = {
                      value = {<absl::lts_20230802::container_internal::CommonFieldsGenerationInfoDisabled> = {<No data fields>}, control_ = 0x55604cd31d40 <absl::lts_20230802::container_internal::kEmptyGroup+16>, slots_ = 0x0, capacity_ = 0,
                        compressed_tuple_ = {<absl::lts_20230802::container_internal::internal_compressed_tuple::CompressedTupleImpl<absl::lts_20230802::container_internal::CompressedTuple<unsigned long, absl::lts_20230802::container_internal::HashtablezInfoHandle>, absl::lts_20230802::integer_sequence<unsigned long, 0, 1>, true>> = {<absl::lts_20230802::container_internal::internal_compressed_tuple::uses_inheritance> = {<No data fields>}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<unsigned long, 0, false>> = {
                              value = 0}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<absl::lts_20230802::container_internal::HashtablezInfoHandle, 1, true>> = {<absl::lts_20230802::container_internal::HashtablezInfoHandle> = {<No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}}}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<absl::lts_20230802::container_internal::StringHash, 1, true>> = {<absl::lts_20230802::container_internal::StringHash> = {<No data fields>}, <No data fields>}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<absl::lts_20230802::container_internal::StringEq, 2, true>> = {<absl::lts_20230802::container_internal::StringEq> = {<No data fields>}, <No data fields>}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::WebRtcServer*> >, 3, true>> = {<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::WebRtcServer*> >> = {<__gnu_cxx::new_allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::WebRtcServer*> >> = {<No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}}, <No data fields>}, <No data fields>},
          mapRouters = {<absl::lts_20230802::container_internal::raw_hash_map<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, RTC::Router*>, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::Router*> > >> = {<absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, RTC::Router*>, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::Router*> > >> = {
                settings_ = {<absl::lts_20230802::container_internal::internal_compressed_tuple::CompressedTupleImpl<absl::lts_20230802::container_internal::CompressedTuple<absl::lts_20230802::container_internal::CommonFields, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::Router*> > >, absl::lts_20230802::integer_sequence<unsigned long, 0, 1, 2, 3>, true>> = {<absl::lts_20230802::container_internal::internal_compressed_tuple::uses_inheritance> = {<No data fields>}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<absl::lts_20230802::container_internal::CommonFields, 0, false>> = {
                      value = {<absl::lts_20230802::container_internal::CommonFieldsGenerationInfoDisabled> = {<No data fields>}, control_ = 0x55604e6bef58, slots_ = 0x55604e6bef70, capacity_ = 3,
                        compressed_tuple_ = {<absl::lts_20230802::container_internal::internal_compressed_tuple::CompressedTupleImpl<absl::lts_20230802::container_internal::CompressedTuple<unsigned long, absl::lts_20230802::container_internal::HashtablezInfoHandle>, absl::lts_20230802::integer_sequence<unsigned long, 0, 1>, true>> = {<absl::lts_20230802::container_internal::internal_compressed_tuple::uses_inheritance> = {<No data fields>}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<unsigned long, 0, false>> = {
                              value = 2}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<absl::lts_20230802::container_internal::HashtablezInfoHandle, 1, true>> = {<absl::lts_20230802::container_internal::HashtablezInfoHandle> = {<No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}}}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<absl::lts_20230802::container_internal::StringHash, 1, true>> = {<absl::lts_20230802::container_internal::StringHash> = {<No data fields>}, <No data fields>}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<absl::lts_20230802::container_internal::StringEq, 2, true>> = {<absl::lts_20230802::container_internal::StringEq> = {<No data fields>}, <No data fields>}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::Router*> >, 3, true>> = {<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::Router*> >> = {<__gnu_cxx::new_allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::Router*> >> = {<No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}}, <No data fields>}, <No data fields>}, closed = false}
        channel = std::unique_ptr<Channel::ChannelSocket> = {get() = 0x55604e53c5d0}
        __FUNCTION__ = "mediasoup_worker_run"
#10 0x000055604c80ed5f in main (argc=16, argv=0x7fff8c7dd348) at ../../../src/main.cpp:25
        __FUNCTION__ = "main"
        version = "3.14.5"
        statusCode = 0

As @snnz said in the forum, this looks like the culprit:

https://mediasoup.discourse.group/t/mediasoup-worker-died-exiting-in-2-seconds/6035/7

It looks like, after this commit, IceServer::OnTimer may end up calling IceServer::RemoveTuple, in the same way IceServer::~IceServer does.
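
In the backtrace above, frame #2 shows a range-based for loop in IceServer::OnTimer (__for_range, __for_begin, __for_end) whose iterator can no longer be read. That would be consistent with a tuple being removed from the list while the loop is still walking it. Below is a minimal sketch of that general hazard and one common way to avoid it, assuming a std::list of tuples. The Tuple struct and the RemoveExpired function are hypothetical names for illustration only; they are not taken from mediasoup's source, and the actual fix in the PR may look different.

```cpp
#include <cstddef>
#include <list>
#include <string>

// Hypothetical stand-in for a transport tuple (NOT RTC::TransportTuple).
struct Tuple
{
	std::string remote;
	bool expired;
};

// Unsafe pattern (shown only as a comment): erasing the current element inside
// a range-based for loop destroys the node the hidden iterator points to, so
// the next increment reads freed memory.
//
//   for (auto& tuple : tuples)
//       if (tuple.expired)
//           tuples.remove_if([&](const Tuple& t) { return &t == &tuple; });
//
// Safer pattern: drive the loop with an explicit iterator and continue from
// the iterator returned by std::list::erase(), which refers to the element
// after the erased one. Returns the number of removed tuples.
static std::size_t RemoveExpired(std::list<Tuple>& tuples)
{
	std::size_t removed = 0;

	for (auto it = tuples.begin(); it != tuples.end();)
	{
		if (it->expired)
		{
			it = tuples.erase(it); // 'it' now points past the erased node.
			++removed;
		}
		else
		{
			++it;
		}
	}

	return removed;
}
```

If the removal happens inside a callback invoked from the loop body, as the OnTimer to RemoveTuple path suggests, the same idea applies: the loop must not keep using an iterator or pointer to an element that the callback may have destroyed.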

I am on it.

@miroslavpejic85, PR here: #1393

However, I may need your help if possible. Please follow up in the PR: #1393 (comment)