SUB sockets receiving frequent multipart messages can crash upon disconnection
sissow2 opened this issue
Issue description
This is in relation to zeromq/libzmq#2163, where libzmq will not support disconnecting a socket while receiving a multipart message. This goes pretty deep into the library, and actually extends to any operation that reads a socket's internal queues. This includes zmq_getsockopt(..., ZMQ_EVENTS, ...).
To reproduce this:
1. A subscriber is receiving a lot of multipart messages.
2. azmq::detail::socket_ops::reactor_handler::schedule executes (in my case, as a result of an async_receive) and, due to the message volume, takes the branch that posts the handler to the event queue.
3. Before the event queue has a chance to run the handler posted in step 2, the user disconnects the socket. As per the libzmq issue linked above, this puts the socket into an inconsistent state and it soon crashes.
The actual errors are either assertion failures (!more in fq.cpp) or the next async operation simply blocking.
I was able to prevent this from happening in my (single-threaded) application with the following patch:
diff --git a/azmq/detail/socket_service.hpp b/azmq/detail/socket_service.hpp
index e4557c8..c874bde 100644
--- a/azmq/detail/socket_service.hpp
+++ b/azmq/detail/socket_service.hpp
@@ -617,7 +617,7 @@ namespace detail {
             auto evs = socket_ops::get_events(impl->socket_, ec) & impl->events_mask();
             if (evs || ec) {
-                impl->sd_->get_io_service().post([handler, ec] { handler(ec, 0); });
+                impl->sd_->get_io_service().dispatch([handler, ec] { handler(ec, 0); });
             } else {
                 impl->sd_->async_read_some(boost::asio::null_buffers(),
                                            std::move(handler));
This needed to be done because get_events calls zmq_getsockopt with ZMQ_EVENTS, which can potentially read the first part of a multipart message, which means the rest of it needs to be handled ASAP. This is not a good solution because it can still break in multi-threaded applications: dispatch only runs the handler immediately when it is called from a thread that is currently running the io_service, and otherwise it posts the handler just as before.
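To illustrate the ordering difference, here is a minimal sketch using only the standard library. ToyLoop, Mode and simulate are hypothetical names made up for this example; they are a toy stand-in for a single-threaded io_service and are not azmq or Boost.Asio code. The log strings only label the steps described above and no real socket is involved.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model of a single-threaded event loop (not Boost.Asio).
class ToyLoop {
public:
    // post: always queue the handler; it runs later, from poll().
    void post(std::function<void()> h) { queue_.push_back(std::move(h)); }

    // dispatch: run the handler right away if we are already inside poll(),
    // otherwise fall back to queueing it.
    void dispatch(std::function<void()> h) {
        if (running_) {
            h();
        } else {
            post(std::move(h));
        }
    }

    // Run queued handlers until the queue is empty.
    void poll() {
        running_ = true;
        while (!queue_.empty()) {
            auto h = std::move(queue_.front());
            queue_.pop_front();
            h();
        }
        running_ = false;
    }

private:
    std::deque<std::function<void()>> queue_;
    bool running_ = false;
};

enum class Mode { Post, Dispatch };

// Replays the sequence from the steps above and returns the order of events.
std::vector<std::string> simulate(Mode mode) {
    ToyLoop loop;
    std::vector<std::string> log;
    loop.post([&] {
        auto receive_rest = [&log] { log.push_back("receive remaining parts"); };
        // Stand-in for schedule(): the events check touches the first part...
        log.push_back("get_events reads first part");
        // ...and the receive handler is then handed to the loop.
        if (mode == Mode::Post) {
            loop.post(receive_rest);
        } else {
            loop.dispatch(receive_rest);
        }
        // User code running in the same handler disconnects the socket.
        log.push_back("disconnect");
    });
    loop.poll();
    return log;
}
```

In this model, simulate(Mode::Post) records the disconnect before the remaining parts are received, which is the ordering that triggers the crash, while simulate(Mode::Dispatch) receives the remaining parts first. Calling dispatch from outside poll() still queues the handler, which corresponds to the multi-threaded caveat.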
Environment
- libzmq version: 4.3.1
- AZMQ @ a8f54cc
- OS: Ubuntu 16.04
dispatch() is probably the right thing to do here instead of post() anyway. I'm putting together an update to azmq and will include this change.