arvidn / libtorrent

an efficient feature complete C++ bittorrent implementation

Home Page: http://libtorrent.org

eventfd_select_interrupter: No file descriptors available [system:24]

vktr opened this issue

Please provide the following information

libtorrent version (or branch): 2.0.10 (via vcpkg)

platform/architecture: alpine musl x86_64

compiler and compiler version:

COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-alpine-linux-musl/13.2.1/lto-wrapper
Target: x86_64-alpine-linux-musl
Configured with: /home/buildozer/aports/main/gcc/src/gcc-13-20231014/configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --build=x86_64-alpine-linux-musl --host=x8'
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.2.1 20231014 (Alpine 13.2.1_git20231014)

please describe what symptom you see, what you would expect to see instead and
how to reproduce it.

In Porla we support multi-session mode, meaning we can configure multiple libtorrent sessions in a single process, each with its own settings, torrents, etc. We also use Alpine and musl to build and link a static single-binary release for multiple platforms.

The issue is that when creating many sessions (around 48-49 in my testing), the process crashes with the error eventfd_select_interrupter: No file descriptors available [system:24]. It might or might not be an issue in libtorrent, but I'll start here and maybe I can get pointed in the right direction, or we can solve it together 😃

The full backtrace is,

#0  __restore_sigs (set=set@entry=0x7fffffffd9a0)
    at ./arch/x86_64/syscall_arch.h:40
#1  0x00007ffff7fa9702 in raise (sig=sig@entry=6) at src/signal/raise.c:11
#2  0x00007ffff7f78be8 in abort () at src/exit/abort.c:11
#3  0x00007ffff7cb1a1e in ?? () from /usr/lib/libstdc++.so.6
#4  0x00007ffff7cc3ea8 in __cxxabiv1::__terminate(void (*)()) ()
   from /usr/lib/libstdc++.so.6
#5  0x00007ffff7cc3f15 in std::terminate() () from /usr/lib/libstdc++.so.6
#6  0x00007ffff7cc4168 in __cxa_throw () from /usr/lib/libstdc++.so.6
#7  0x000055555560b9ef in boost::throw_exception<boost::system::system_error> (
    e=...)
    at /home/viktor/code/lt-multi-session/build/vcpkg_installed/x64-linux/include/boost/throw_exception.hpp:157
#8  0x0000555555606da9 in boost::asio::detail::do_throw_error (err=...,
    location=0x55555613fc21 "eventfd_select_interrupter", loc=...)
    at /home/viktor/code/lt-multi-session/build/vcpkg_installed/x64-linux/include/boost/asio/detail/impl/throw_error.ipp:42
#9  0x00005555556069c6 in boost::asio::detail::throw_error (err=...,
    location=0x55555613fc21 "eventfd_select_interrupter", loc=...)
    at /home/viktor/code/lt-multi-session/build/vcpkg_installed/x64-linux/include/boost/asio/detail/throw_error.hpp:51
#10 0x000055555560842d in boost::asio::detail::eventfd_select_interrupter::open_descriptors (this=0x7ffff4480800)
#11 0x000055555560825e in boost::asio::detail::eventfd_select_interrupter::eventfd_select_interrupter (this=0x7ffff4480800)
    at /home/viktor/code/lt-multi-session/build/vcpkg_installed/x64-linux/include/boost/asio/detail/impl/eventfd_select_interrupter.ipp:44
#12 0x00005555556087d4 in boost::asio::detail::epoll_reactor::epoll_reactor (
    this=0x7ffff4480790, ctx=...)
    at /home/viktor/code/lt-multi-session/build/vcpkg_installed/x64-linux/include/boost/asio/detail/impl/epoll_reactor.ipp:44
#13 0x000055555560e835 in boost::asio::detail::service_registry::create<boost::asio::detail::epoll_reactor, boost::asio::execution_context> (
    owner=0x7ffff447bc60)
    at /home/viktor/code/lt-multi-session/build/vcpkg_installed/x64-linux/include/boost/asio/detail/impl/service_registry.hpp:86
#14 0x00005555556072fb in boost::asio::detail::service_registry::do_use_service
    (this=0x7ffff44ae460, key=...,
    factory=0x55555560e802 <boost::asio::detail::service_registry::create<boost::asio::detail::epoll_reactor, boost::asio::execution_context>(void*)>,
    owner=0x7ffff447bc60)
    at /home/viktor/code/lt-multi-session/build/vcpkg_installed/x64-linux/include/boost/asio/detail/impl/service_registry.ipp:132
#15 0x000055555560d9a4 in boost::asio::detail::service_registry::use_service<boost::asio::detail::epoll_reactor> (this=0x7ffff44ae460)
    at /home/viktor/code/lt-multi-session/build/vcpkg_installed/x64-linux/include/boost/asio/detail/impl/service_registry.hpp:30
#16 0x000055555560c720 in boost::asio::use_service<boost::asio::detail::epoll_reactor> (e=...)
    at /home/viktor/code/lt-multi-session/build/vcpkg_installed/x64-linux/include/boost/asio/impl/execution_context.hpp:35
#17 0x00005555556c9ee2 in boost::asio::detail::deadline_timer_service<boost::asio::detail::chrono_time_traits<std::chrono::_V2::system_clock, boost::asio::wait_traits<std::chrono::_V2::(
    this=0x7ffff44b0f00, context=...)
    at /home/viktor/code/lt-multi-session/build/vcpkg_installed/x64-linux/include/boost/asio/detail/deadline_timer_service.hpp:72
#18 0x00005555556bf069 in boost::asio::detail::service_registry::create<boost::asio::detail::deadline_timer_service<boost::asio::detail::chrono_time_traits<std::chrono::_V2::system_cloc)
    at /home/viktor/code/lt-multi-session/build/vcpkg_installed/x64-linux/include/boost/asio/detail/impl/service_registry.hpp:86
#19 0x00005555556072fb in boost::asio::detail::service_registry::do_use_service
    (this=0x7ffff44ae460, key=...,
    factory=0x5555556bf036 <boost::asio::detail::service_registry::create<boost::asio::detail::deadline_timer_service<boost::asio::detail::chrono_time_traits<std::chrono::_V2::system_clock, boost::asio::wait_traits<std::chrono::_V2::system_clock> > >, boost::asio::io_context>(void*)>, owner=0x7ffff447bc60)
    at /home/viktor/code/lt-multi-session/build/vcpkg_installed/x64-linux/include/boost/asio/detail/impl/service_registry.ipp:132
#20 0x00005555556b35ec in boost::asio::detail::service_registry::use_service<boost::asio::detail::deadline_timer_service<boost::asio::detail::chrono_time_traits<std::chrono::_V2::system)
    at /home/viktor/code/lt-multi-session/build/vcpkg_installed/x64-linux/include/boost/asio/detail/impl/service_registry.hpp:39
#21 0x00005555556a0c0b in boost::asio::use_service<boost::asio::detail::deadline_timer_service<boost::asio::detail::chrono_time_traits<std::chrono::_V2::system_clock, boost::asio::wait_)
    at /home/viktor/code/lt-multi-session/build/vcpkg_installed/x64-linux/include/boost/asio/impl/io_context.hpp:41
#22 0x000055555568612a in boost::asio::detail::io_object_impl<boost::asio::detail::deadline_timer_service<boost::asio::detail::chrono_time_traits<std::chrono::_V2::system_clock, boost::(
    this=0x7ffff4557088, context=...)
    at /home/viktor/code/lt-multi-session/build/vcpkg_installed/x64-linux/include/boost/asio/detail/io_object_impl.hpp:58
#23 0x000055555566c570 in boost::asio::basic_waitable_timer<std::chrono::_V2::system_clock, boost::asio::wait_traits<std::chrono::_V2::system_clock>, boost::asio::any_io_executor>::basic_waitable_timer<boost::asio::io_context> (
    this=0x7ffff4557088, context=...)
    at /home/viktor/code/lt-multi-session/build/vcpkg_installed/x64-linux/include/boost/asio/basic_waitable_timer.hpp:198
#24 0x0000555555cc7fd0 in libtorrent::aux::disk_io_thread_pool::disk_io_thread_pool (this=0x7ffff4557028, thread_iface=..., ios=...)
    at /home/viktor/code/lt-multi-session/vendor/vcpkg/buildtrees/libtorrent/src/v2.0.10-8a071cab05.clean/src/disk_io_thread_pool.cpp:56
#25 0x0000555555b2402a in libtorrent::mmap_disk_io::mmap_disk_io (
    this=0x7ffff4556cd0, ios=..., sett=..., cnt=...)
    at /home/viktor/code/lt-multi-session/vendor/vcpkg/buildtrees/libtorrent/src/v2.0.10-8a071cab05.clean/src/mmap_disk_io.cpp:396
#26 0x0000555555b2e5de in std::make_unique<libtorrent::mmap_disk_io, boost::asio::io_context&, libtorrent::settings_interface const&, libtorrent::counters&> ()
    at /usr/include/c++/13.2.1/bits/unique_ptr.h:1070
#27 0x0000555555b23e26 in libtorrent::mmap_disk_io_constructor (ios=...,
    sett=..., cnt=...)
    at /home/viktor/code/lt-multi-session/vendor/vcpkg/buildtrees/libtorrent/src/v2.0.10-8a071cab05.clean/src/mmap_disk_io.cpp:384
#28 0x0000555555604cda in libtorrent::default_disk_io_constructor (ios=...,
    sett=..., cnt=...)
    at /home/viktor/code/lt-multi-session/vendor/vcpkg/buildtrees/libtorrent/src/v2.0.10-8a071cab05.clean/src/session.cpp:543
#29 0x00005555556b2557 in std::__invoke_impl<std::unique_ptr<libtorrent::disk_interface, std::default_delete<libtorrent::disk_interface> >, std::unique_ptr<libtorrent::disk_interface, s(
    __f=@0x7fffffffe370: 0x555555604c9b <libtorrent::default_disk_io_constructor(boost::asio::io_context&, libtorrent::settings_interface const&, libtorrent::counters&)>) at /usr/includ1
#30 0x000055555569f7cb in std::__invoke_r<std::unique_ptr<libtorrent::disk_interface, std::default_delete<libtorrent::disk_interface> >, std::unique_ptr<libtorrent::disk_interface, std:(
    __fn=@0x7fffffffe370: 0x555555604c9b <libtorrent::default_disk_io_constructor(boost::asio::io_context&, libtorrent::settings_interface const&, libtorrent::counters&)>) at /usr/inclu6
#31 0x00005555556843ce in std::_Function_handler<std::unique_ptr<libtorrent::disk_interface, std::default_delete<libtorrent::disk_interface> > (boost::asio::io_context&, libtorrent::settings_interface const&, libt-
ace> > (*)(boost::asio::io_context&, libtorrent::settings_interface const&, libtorrent::counters&)>::_M_invoke(std::_Any_data const&, boost::asio::io_context&, libtorrent::settings_interface const&, libtorrent::co
    __args#0=..., __args#1=..., __args#2=...)
    at /usr/include/c++/13.2.1/bits/std_function.h:291
#32 0x000055555566abd1 in std::function<std::unique_ptr<libtorrent::disk_interface, std::default_delete<libtorrent::disk_interface> > (boost::asio::io_context&, libtorrent::settings_interface const&, libtorrent::c)
    at /usr/include/c++/13.2.1/bits/std_function.h:591
#33 0x0000555555614954 in libtorrent::aux::session_impl::session_impl(boost::asio::io_context&, libtorrent::settings_pack const&, std::function<std::unique_ptr<libtorrent::disk_interface, std::default_delete<libto
    disk_io_constructor=..., flags=...)
    at /home/viktor/code/lt-multi-session/vendor/vcpkg/buildtrees/libtorrent/src/v2.0.10-8a071cab05.clean/src/session_impl.cpp:526
#34 0x0000555555610e34 in std::_Construct<libtorrent::aux::session_impl, std::reference_wrapper<boost::asio::io_context>, libtorrent::settings_pack, std::function<std::unique_ptr<libtorrent::disk_interface, std::d-
isk_interface> > (boost::asio::io_context&, libtorrent::settings_interface const&, libtorrent::counters&)>, libtorrent::flags::bitfield_flag<unsigned char, libtorrent::session_flags_tag, void> const&>(libtorrent::)
    at /usr/include/c++/13.2.1/bits/stl_construct.h:119
#35 0x000055555560fdf4 in std::allocator_traits<std::allocator<void> >::construct<libtorrent::aux::session_impl, std::reference_wrapper<boost::asio::io_context>, libtorrent::settings_pack, std::function<std::uniqu)
    at /usr/include/c++/13.2.1/bits/alloc_traits.h:660
#36 std::_Sp_counted_ptr_inplace<libtorrent::aux::session_impl, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<std::reference_wrapper<boost::asio::io_context>, libtorrent::settings_pack, std::function<std::unique_ptr<libtorrent::disk_interface, std::default_delete<)
    at /usr/include/c++/13.2.1/bits/shared_ptr_base.h:604
#37 0x000055555560f3aa in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<libtorrent::aux::session_impl, std::allocator<void>, std::reference_wrapper<boost::asio::io_context>, libtorrent::settings_-
rrent::flags::bitfield_flag<unsigned char, libtorrent::session_flags_tag, void> const&) (this=0x7fffffffe788, __p=@0x7fffffffe780: 0x0, __a=...)
    at /usr/include/c++/13.2.1/bits/shared_ptr_base.h:971
#38 0x000055555560eba8 in std::__shared_ptr<libtorrent::aux::session_impl, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<void>, std::reference_wrapper<boost::asio::io_context>, libtorrent::settings_pack
    __tag=...) at /usr/include/c++/13.2.1/bits/shared_ptr_base.h:1712
#39 0x000055555560e12f in std::shared_ptr<libtorrent::aux::session_impl>::shared_ptr<std::allocator<void>, std::reference_wrapper<boost::asio::io_context>, libtorrent::settings_pack, std::function<std::unique_ptr<-
:io_context>&&, libtorrent::settings_pack&&, std::function<std::unique_ptr<libtorrent::disk_interface, std::default_delete<libtorrent::disk_interface> > (boost::asio::io_context&, libtorrent::settings_interface co)
    at /usr/include/c++/13.2.1/bits/shared_ptr.h:464
#40 0x000055555560d06b in std::make_shared<libtorrent::aux::session_impl, std::reference_wrapper<boost::asio::io_context>, libtorrent::settings_pack, std::function<std::unique_ptr<libtorrent::disk_interface, std::0
#41 0x000055555560376a in libtorrent::session::start (this=0x7ffff44a8e20,
    flags=..., params=..., ios=0x7ffff447bc60)
    at /home/viktor/code/lt-multi-session/vendor/vcpkg/buildtrees/libtorrent/src/v2.0.10-8a071cab05.clean/src/session.cpp:309
#42 0x00005555556041b2 in libtorrent::session::session (this=0x7ffff44a8e20)
    at /home/viktor/code/lt-multi-session/vendor/vcpkg/buildtrees/libtorrent/src/v2.0.10-8a071cab05.clean/src/session.cpp:391
#43 0x0000555555602e58 in void std::_Construct<libtorrent::session>(libtorrent::session*) ()
#44 0x00005555556029a9 in std::_Sp_counted_ptr_inplace<libtorrent::session, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<>(std::allocator<void>) ()
#45 0x00005555556026b7 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<libtorrent::session, std::allocator<void>>(libtorrent::session*&, std::_Sp_alloc_shared_tag<std::allocator<void> >) ()
#46 0x0000555555602302 in std::__shared_ptr<libtorrent::session, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<void>>(std::_Sp_alloc_shared_tag<std::allocator<void> >) ()
#47 0x0000555555601f0b in std::shared_ptr<libtorrent::session>::shared_ptr<std::allocator<void>>(std::_Sp_alloc_shared_tag<std::allocator<void> >) ()
#48 0x0000555555601a81 in std::shared_ptr<std::enable_if<!std::is_array<libtorrent::session>::value, libtorrent::session>::type> std::make_shared<libtorrent::session>() ()
#49 0x00005555556001ac in main ()

To reproduce this, I build and run the following (on Alpine 3.19),

#include <libtorrent/session.hpp>

#include <cstdio>
#include <memory>
#include <vector>

int main(int argc, char* argv[])
{
	std::vector<std::shared_ptr<libtorrent::session>> sessions;

	for (int i = 0; i < 2000; i++)
	{
		printf("Session %d\n", i + 1);
		sessions.emplace_back(std::make_shared<libtorrent::session>());
	}

	return 0;
}

Increasing the open file limit with ulimit -n 102400 makes the program pass, though.

Happy to provide more information if needed!

When compiled statically against musl, the crash is reproducible even on a glibc host. See the attached Valgrind outputs below. Hence, setting ulimit looks like a temporary fix. In practice, on a glibc host the snippet crashes far sooner: around 5-6 iterations on my machine. If there's only a single libtorrent session, the statically compiled musl build of Porla works fine (downloading, seeding, DHT, etc.).

However when compiled on a glibc host there are no issues even with 2000 iterations and with open files limited to 1024.

musl-alpine-guest-valgrind.txt
glibc-arch-host-valgrind.txt

However when compiled on a glibc host there are no issues even with 2000 iterations and with open files limited to 1024.

are you sure it was limited? i tested this quickly on arch and it crashes the same way, yielding roughly 50 sessions per 1k fds allowed before crashing, almost the same as on a musl host, which is expected (it makes a new epoll instance per session, etc). there's no actual bug i can see here except legitimately running out of fds (which the application has to report to the user saying 'please let me use more fds' or whatnot, catching exception, ..). this doesn't seem like any platform difference..

i tested this quickly

to be exact- the small repro above, which would translate to any 'open a bunch of libtorrent sessions' afaik. maybe the application itself does something very different in the end or has other stuff going on :)

ah and someone forwarded this to me saying it was musl related so i misread the initial context; the other part here is whether libtorrent itself should do anything about it, that i'm not sure. since you get a meager system_error exception that might not be the easiest to handle (unless you get EMFILE in errno too passed through to make it obvious? a quick test shows that you do have it set, so you can use try/catch and check errno in catch), but as long as you don't leak memory from the session init aborting and can continue i think that should still be ok.

not sure about what is intended from libtorrent api though, arvidn would be able to say for sure.

by "crashing" it seems you mean "throws an exception". you can handle this error by catching it.

As far as I can tell, you're running out of file descriptors. Each session will require a minimum number of them. If you really need many sessions without granting the process a higher ulimit, you can lower the size of the file pool and the max number of peer connections in the sessions.
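For illustration, a minimal sketch of what lowering those two settings could look like (file_pool_size and connections_limit are the relevant settings_pack fields; the specific values are only examples, not recommendations):

#include <libtorrent/session.hpp>
#include <libtorrent/session_params.hpp>
#include <libtorrent/settings_pack.hpp>

#include <utility>

int main()
{
	libtorrent::settings_pack pack;
	// keep fewer files open in the disk I/O file pool
	pack.set_int(libtorrent::settings_pack::file_pool_size, 40);
	// cap peer connections; every connection is a socket, i.e. one more FD
	pack.set_int(libtorrent::settings_pack::connections_limit, 50);

	libtorrent::session ses{libtorrent::session_params(std::move(pack))};
	// ...
	return 0;
}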

In your test, though, it seems like you're running out of FDs purely on the mandatory boost.asio sockets/pipes.

I don't think raising ulimit is a temporary fix. It's the actual fix. ulimit is meant to cause this kind of failure when a process is attempting to open too many file descriptors. In your case, you want more, so you need to raise the limit.

You can do that programmatically to an extent using setrlimit.
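For example, a minimal sketch that bumps the soft RLIMIT_NOFILE limit up to the hard limit at process startup (plain POSIX getrlimit/setrlimit; going beyond the hard limit still requires configuration outside the process):

#include <sys/resource.h>
#include <cstdio>

// raise the soft open-file limit as far as the hard limit allows
static void raise_nofile_limit()
{
	struct rlimit lim;
	if (getrlimit(RLIMIT_NOFILE, &lim) != 0) { perror("getrlimit"); return; }
	lim.rlim_cur = lim.rlim_max;
	if (setrlimit(RLIMIT_NOFILE, &lim) != 0) perror("setrlimit");
}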

Thanks for the insightful responses ☺️

When we first hit this issue about 6 months back, I think it crashed at around 5-10 sessions, which is much closer to the number of sessions I expect users to have. If we can get 40+ sessions consistently, I don't think it's an issue in practice.

Is it possible that alpine has raised some limit here?

If raising the ulimit is the actual fix, that's fine. We just need to document it.

Thanks again for taking the time all of you!

Is it possible that alpine has raised some limit here?

nope, the defaults are the standard 1024 soft / 4096 hard, like most distros. on distros like debian it's usually something like 1024 soft and infinite hard (the soft limit is what's currently in effect for your application; ulimit -n will update it, or you can use setrlimit). anything past that requires raising the hard limit first - which has to be user configuration (nothing you can do in code yourself automatically)

in general, most torrent clients/daemons have operated on this principle for a very long time. every open file counts toward the limit even if you're not spamming sessions, epolls, asio, whatever. so torrent daemon operators already know "if i seed 500 torrents i have to raise this" (even if the daemon runs fine and cleans up handles to stay within the limit, they'd want them to stay open for performance reasons and raise it); you only need to include a note and provide support to people that might be using your client as their first one :)

for the catch part, i'm not the best programmer, but an example:

	for (int i = 0; i < 2000; i++)
	{
		printf("Session %d\n", i + 1);
		try {
			sessions.emplace_back(std::make_shared<libtorrent::session>());
		} catch (...) {
			if (errno == EMFILE) {
				/* out of files from making session */
				perror("Failed to create new session: ");
				/* completely out of files so even exit() crashes in cleanup here, cleanup the last created session */
				sessions.pop_back();
				/* whatever you want */
			} else { /* other errnos, or do nothing, idk */ }
		}
	}

actually, disregard the pop_back - that deletes the previous session, not the failed one (emplace_back never added anything, since the constructor threw). you just can't call exit() inside the catch block, and i got confused :)
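As a variation on the sketch above (an assumption about error handling, not something the libtorrent API promises): since the backtrace shows boost::throw_exception&lt;boost::system::system_error&gt;, one could catch that type and check its error code instead of relying on errno:

	// additionally needed: <boost/system/system_error.hpp> and <boost/system/errc.hpp>
	for (int i = 0; i < 2000; i++)
	{
		printf("Session %d\n", i + 1);
		try {
			sessions.emplace_back(std::make_shared<libtorrent::session>());
		} catch (boost::system::system_error const& e) {
			// EMFILE maps to the generic "too many open files" condition
			if (e.code() == boost::system::errc::too_many_files_open) {
				fprintf(stderr, "out of file descriptors: %s\n", e.what());
				break; // stop creating sessions; the ones already created stay intact
			}
			throw; // some other error: rethrow
		}
	}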

Apologies for the confusion, communication is hard. There are two separate things here:

  • dynamic binary compiled with glibc running out of FDs (everything is normal in this case; ulimit is the fix)
  • static binary compiled with musl crashing for some other reason (the real issue; ulimit won't fix this)

There's slight miscommunication here, even between Viktor and me who are behind this report.

To cut to the chase: the first case is solved and explained, thank you all. However, the second case is the actual problem here, which I've been trying to work out with Viktor. In hindsight, I should've submitted the report myself.

are you sure it was limited? i tested this quickly on arch and it crashes the same way

(Regarding the binary compiled with glibc)

Yes, I checked with ulimit -a before I posted, and double-checked just now. However, it was run through CLion, and in CLion it works fine up to 2k sessions (same as with Valgrind; I think both call prlimit, i.e. the setrlimit Arvid mentioned). When running the same binary in a terminal, though, it crashes between 38 and 46 sessions at the 1024 open-files limit. If I set the iteration count to e.g. 5000 and raise the open-files ulimit to a million, it doesn't crash. I agree that in this case everything is behaving correctly.

However, with the musl static build (the 2nd case), regardless of ulimit, the repro crashes within the first 5-6 sessions (usually around 2-3), except when run through Valgrind, in which case I can get upwards of 1k sessions with the usual 1024 limit.

almost the same as on a musl host, which is expected (it makes a new epoll instance per session, etc). there's no actual bug i can see here except legitimately running out of fds (which the application has to report to the user saying 'please let me use more fds' or whatnot, catching exception, ..). this doesn't seem like any platform difference..

I've done a fresh Alpine VM install and compiled a fresh binary after your post; the musl static build crashes on both the musl and glibc hosts after the mentioned 5-6 sessions, and not because of EMFILE. Unless run in Valgrind, where it throws EMFILE after 38-46 sessions, same as with glibc.

The static musl build crashes consistently for me within the first 5-6 sessions:

  • on my host Arch Linux (kernel 6.7.4)
  • on my guest Alpine VM (kernel 6.6.16)
  • on my laptop Void Linux (kernel 6.3)

musl static: strace on glibc host https://gist.github.com/stacksmash76/ffa8ea5e3f71d733ecf8d2acfc87b22f
musl static: strace on musl host https://gist.github.com/stacksmash76/65bd80848111eab20be821d8aad3ba8d
musl static: trace on glibc host https://gist.github.com/stacksmash76/1e7b458cf43b996f7d423974321b6928

(Ignore porla wherever you see it, in all cases this is the minimal repro code from the issue)

Viktor was also kind enough to make a repository for the repro, so please try to reproduce this instead of relying on my info: https://github.com/vktr/lt-multisession-issue

EDIT: I've seen 4 different stacktraces in total for the crashes: https://gist.github.com/stacksmash76/fcfd6f27f437a7f9141dad36b0154ea8

And Valgrind gave 2 warnings: https://gist.github.com/stacksmash76/e923650538ed5b6a6b1a1cf7f917afc6

Chiming in here 👋 I'm building the repro code on this system,

Linux alpine-dev 6.6.17-0-lts #1-Alpine SMP PREEMPT_DYNAMIC Wed, 21 Feb 2024 07:47:07 +0000 x86_64 Linux

Running the resulting binary (which I attached in the repro repo under binaries/) on the same system it was built on, it runs for ~49 iterations before crashing with the "No file descriptors available" error, and ulimit -n is the fix.

Running the resulting binary on Arch or Ubuntu will crash after 2-5 iterations with a segfault.

EDIT: I've seen 4 different stacktraces in total for the crashes: gist.github.com/stacksmash76/fcfd6f27f437a7f9141dad36b0154ea8

ah, ok. all of these are something invalid passed to free() somewhere (though it's weird this only sometimes happens in the static-built version)

out of curiosity, can you try passing -Wl,-z,stack-size=0x200000 when linking the static binary?

I did my best, but it had no effect on either system as far as I can tell. That said, I might have passed it wrong - I just copied it verbatim into the CMake linker flags I was already passing.
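For reference, one way it could be passed through CMake at configure time (a sketch; the binary name lt-multisession-test is taken from the repro project, and the readelf check only shows the requested size if the linker honored the flag):

cmake -B build <existing configure options> -DCMAKE_EXE_LINKER_FLAGS="-Wl,-z,stack-size=0x200000"
cmake --build build
# the GNU_STACK segment's MemSiz should now show 0x200000
readelf -l build/lt-multisession-test | grep -A1 GNU_STACK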

I think it would make sense to abandon this ticket and start a new one for the new issue. I also think it would be really helpful to describe the issue without using the word "crash"; it's just too vague. The gdb stack traces you posted are truncated and don't include the actual problem. Why did the process stop at that point?

the stack traces suggest that, perhaps, you're experiencing heap corruption. Have you tried running with address-sanitizer enabled? And maybe undefined-sanitizer too, while you're at it.
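For example, a dynamically linked sanitizer build of the repro could look roughly like this (a sketch; -fsanitize=address,undefined are standard GCC/Clang flags, the directory name is arbitrary):

cmake -B build-asan <existing configure options> -DCMAKE_BUILD_TYPE=Debug \
      -DCMAKE_CXX_FLAGS="-fsanitize=address,undefined -fno-omit-frame-pointer" \
      -DCMAKE_EXE_LINKER_FLAGS="-fsanitize=address,undefined"
cmake --build build-asan
./build-asan/lt-multisession-test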

Sure, as I mentioned it might not even be an issue in libtorrent, but I don't know enough C++ to help myself 🥲 I'll gather my thoughts and see if we can open a better-worded issue with more concrete details.

the stack traces suggest that, perhaps, you're experiencing heap corruption. Have you tried running with address-sanitizer enabled? And maybe undefined-sanitizer too, while you're at it.

sadly this only seems to happen when statically linked (though only with the vcpkg repro.. if i just normally build libtorrent-rasterbar myself and link it, this does not crash even when static), and you can't link sanitizers into static exes. would've been convenient though, as this is the most likely reason :)

good luck for the new issue ^^

it's probably an issue of ABI incompatibility then. make sure to build the client with the same libtorrent configuration as the vcpkg binary you link against.

i doubt it since all of the dependencies come from vcpkg (except for the compiler and random tools)

but i guess that shows that you should find a repro that doesn't involve vcpkg. if i use distro-provided boost, openssl, ... and build libtorrent and link it, this doesn't happen. only using vcpkg makes it break (with the same versions of everything). and because vcpkg is building a bunch of stuff here (and i don't know anything about what it's really doing), it's hard to speculate why the output it gives you (the libtorrent-rasterbar.a and libssl/libcrypto.a) is then broken

something i tried while typing this is a quick hack:

/lt-multisession-issue # fd libssl.a
build/vcpkg_installed/x64-linux-musl-release/lib/libssl.a
/lt-multisession-issue # cp /usr/lib/libssl.
libssl.a     libssl.so    libssl.so.3
/lt-multisession-issue # cp /usr/lib/libssl.a build/vcpkg_installed/x64-linux-musl-release/lib/libssl.a
/lt-multisession-issue # cp /usr/lib/libcrypto.a build/vcpkg_installed/x64-linux-musl-release/lib/libcrypto.a 
/lt-multisession-issue # ninja -C build
ninja: entering directory 'build'
[1/1] Linking CXX executable lt-multisession-test
/lt-multisession-issue # ./build/lt-multisession-test 
Session 1
Session 2
Session 3
Session 4
Session 5
Session 6
Session 7
Session 8
Session 9
Session 10
Session 11
Session 12

and it doesn't crash anymore. and this is actually a huge hack because the /usr openssl is 3.1 but the one vcpkg built is 3.2. so this isn't even guaranteed to work at all, and yet it fixes the crash. separately repeating these steps not using vcpkg but using my own openssl 3.2 on the host also doesn't crash. so it seems like the vcpkg openssl build is broken somehow, and that's where your issue is. my recommendation is to not use vcpkg i guess...