Mbox with contention?

Question

ilpropheta opened this issue 4 years ago · comments

Hi,
I hope it's the right place for asking questions.
Please, forgive me if this question is silly, I am learning the basics of SObjectizer.

I understand that an Mbox can dispatch messages to several consumers. As explained here, every consumer receives a copy of each message.

With N consumers, I am wondering whether it's possible instead to have every message received and processed by exactly one consumer, the "first that is free" (clearly, this is a simplification). This way we have contention on the Mbox. It's the classic "N jobs, M workers" pattern.

How can I properly model this pattern with SObjectizer?

Many thanks,

Marco

eao197 · Answer 1 · Fri Jan 22 2021 00:46:33 GMT+0800 (China Standard Time)

Hi, Marco!

Yes, it's the right place for asking questions like that.

SObjectizer's mboxes allow you to implement very different message delivery schemes. You can find several implementations of "custom" mboxes in the so5extra companion project: https://github.com/Stiffstream/so5extra/wiki/so5extra-1.4-docs
Maybe the round-robin mbox is a good approximation for your task.

In the "first that is free" scheme, the main question is how to detect which of the subscribers is free.

In some cases "N jobs, M workers" can be solved in SObjectizer by using mchains instead of mboxes.

eao197 · Answer 2 · Fri Jan 22 2021 13:20:33 GMT+0800 (China Standard Time)

Hi, @ilpropheta !

Was my answer useful enough, or should I go deeper into some details about the relationship between mboxes, event queues, and agents?

Marco Arena · Answer 3 · Fri Jan 22 2021 15:57:10 GMT+0800 (China Standard Time)

Hi @eao197 ,
thanks for your reply! I'll have a look at the links above and do some experiments today.

Just to give you a bit of context, I need to port to Linux a Windows application that makes extensive use of PPL Async Message Blocks, which are, basically, async queues. Agents (or just threads) can dequeue items from such queues autonomously.

I guess Mbox implements a "push" schema since the messages are dispatched to agents according to some policy, whereas the PPL async message blocks model a "pull" schema.

If I get it right, an mchain implements a pull schema, which could be useful for me. So I have two additional questions:

I see that mchains are recommended when agents need to interact with entities outside the SObjectizer environment. What if they are used intra-agents, instead?
what's the effect of calling receive on the same mchain from multiple threads/agents? Is every message propagated to every consumer or to just one?

Also, I'll give the round-robin mbox a try and check out other policies!

Many thanks!

eao197 · Answer 4 · Fri Jan 22 2021 16:29:36 GMT+0800 (China Standard Time)

I guess Mbox implements a "push" schema since the messages are dispatched to agents according to some policy, whereas the PPL async message blocks model a "pull" schema.

Yes. An Mbox usually doesn't store messages. An ordinary Mbox just holds a reference to the receiver's event queue and pushes a new message to that queue when the message is passed to the Mbox via the send function (or by the timer). So the Mbox doesn't know when the message will be dequeued and handled, and that makes detecting a free agent difficult.
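
To make the push model concrete, here is a minimal standard-library sketch. It is not SObjectizer code: event_queue and push_mbox are hypothetical names, and thread safety is left out on purpose. It only shows a mailbox that stores nothing itself and copies each message into every subscriber's queue.

```cpp
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for an agent's event queue.
struct event_queue {
    std::deque<std::string> items;
};

// Hypothetical push-style mailbox: it keeps no messages of its own,
// only references to the queues of its subscribers.
class push_mbox {
    std::vector<event_queue*> subscribers_;

public:
    void subscribe(event_queue& q) { subscribers_.push_back(&q); }

    // Every subscriber gets its own copy. The mailbox never learns
    // when (or whether) a subscriber actually handles the message.
    void send(const std::string& msg) {
        for (auto* q : subscribers_) q->items.push_back(msg);
    }
};
```

Because send returns as soon as the copies are queued, the mailbox has no information about which subscriber is idle, which is exactly why "first that is free" needs extra machinery.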

Usually, that task is solved by using a collector agent and several performer agents. Some explanation of that scheme can be found in the old Wiki on SourceForge. That explanation is related to SO-5-5, but the same principle will work for SO-5-7 too.

I see that mchains are recommended when agents need to interact with entities outside the SObjectizer environment. What if they are used intra-agents, instead?

What do you mean by the term "intra-agents"?

what's the effect of calling receive on the same mchain from multiple threads/agents? Is every message propagated to every consumer or to just one?

To just one.
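
The "to just one" behaviour is the usual competing-consumers pattern. A rough standard-library model of it follows (C++17). This is an illustration only: shared_chain and run_workers are hypothetical names and none of this is the mchain API.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical pull-style channel: many readers, each item goes to one of them.
class shared_chain {
    std::mutex lock_;
    std::condition_variable not_empty_;
    std::queue<int> items_;
    bool closed_{false};

public:
    void send(int v) {
        { std::lock_guard<std::mutex> g{lock_}; items_.push(v); }
        not_empty_.notify_one();
    }
    void close() {
        { std::lock_guard<std::mutex> g{lock_}; closed_ = true; }
        not_empty_.notify_all();
    }
    // Blocks until an item arrives; returns nullopt once closed and drained.
    std::optional<int> receive() {
        std::unique_lock<std::mutex> g{lock_};
        not_empty_.wait(g, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) return std::nullopt;
        int v = items_.front();
        items_.pop();
        return v;
    }
};

// Starts `workers` threads that compete for `jobs` items and returns the
// total number of items handled across all workers.
std::size_t run_workers(std::size_t workers, int jobs) {
    shared_chain chain;
    std::vector<std::size_t> handled(workers, 0);
    std::vector<std::thread> pool;
    for (std::size_t i = 0; i < workers; ++i)
        pool.emplace_back([&chain, &handled, i] {
            while (chain.receive()) ++handled[i];
        });
    for (int j = 0; j < jobs; ++j) chain.send(j);
    chain.close();
    for (auto& t : pool) t.join();
    std::size_t total = 0;
    for (auto n : handled) total += n;
    return total;
}
```

If every item reached every reader, the total would be workers times jobs; with one reader per item it equals jobs, whichever worker happened to be free.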

Marco Arena · Answer 5 · Fri Jan 22 2021 18:23:22 GMT+0800 (China Standard Time)

Many thanks @eao197! I have something to read and try, then :)

I was unclear on "intra-agents", my apologies. I just mean that instead of sharing the mchain with the external environment, I use it to communicate among my agents (as if it were an Mbox). As you pointed out, this could be a way to emulate the "N problems, M workers" pattern. However, do you think it's a good way, or could I run into any issues (e.g. performance)?

Many thanks!

eao197 · Answer 6 · Fri Jan 22 2021 19:31:01 GMT+0800 (China Standard Time)

You can easily use mchains for communication between your agents. But you have to take the following points into account:

A send to an mchain can block the sender for some time. Sometimes that is useful (for example, it's a way of protecting the receiver from overload), but sometimes it is inappropriate. A send to an mbox is usually non-blocking (but if you implement your own mbox you can do almost anything you want, including blocking the sender).

If an agent reads messages from an mchain, the agent has to choose the right moments for calling receive (or select). When an agent subscribes to a message, its event handler is called automatically at the appropriate time. But an agent that reads an mchain has to check for the presence of messages itself. For example, an agent can call receive with the no_wait_on_empty() modifier, check the result, and, if the mchain was empty, send a delayed message to itself to repeat the attempt later. Another approach is to set a not_empty_notificator for the mchain and send a notification to the agent when a new message arrives in an empty mchain.

Unfortunately, SObjectizer at the current point of its evolution doesn't have an easy way of integrating agents and mchains (I mean an agent cannot receive messages from an mchain just like it receives messages from an mbox). I think such integration would be useful sometimes, but we haven't invented it yet :(
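
The two reading strategies above (a non-blocking read that is retried later, and a notification on the empty to non-empty transition) can be modelled with the standard library alone. A sketch, assuming C++17; polled_chain and its members are hypothetical names and not SObjectizer's API:

```cpp
#include <functional>
#include <mutex>
#include <optional>
#include <queue>
#include <utility>

// Hypothetical chain with a non-blocking read and a "became non-empty" hook.
class polled_chain {
    std::mutex lock_;
    std::queue<int> items_;
    std::function<void()> on_not_empty_;

public:
    explicit polled_chain(std::function<void()> on_not_empty = {})
        : on_not_empty_{std::move(on_not_empty)} {}

    void send(int v) {
        bool was_empty;
        {
            std::lock_guard<std::mutex> g{lock_};
            was_empty = items_.empty();
            items_.push(v);
        }
        // Notify outside the lock, and only on the empty -> non-empty edge.
        if (was_empty && on_not_empty_) on_not_empty_();
    }

    // Never waits: an empty chain yields nullopt and the caller retries later.
    std::optional<int> try_receive() {
        std::lock_guard<std::mutex> g{lock_};
        if (items_.empty()) return std::nullopt;
        int v = items_.front();
        items_.pop();
        return v;
    }
};
```

In an agent, the hook would typically just send a signal to the agent's own mbox, so the actual reading still happens inside an ordinary event handler.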

I think you can implement a simple custom mbox for solving the "N tasks, M workers" problem. I'll make a sketch of such an mbox in another reply.

eao197 · Answer 7 · Fri Jan 22 2021 20:14:24 GMT+0800 (China Standard Time)

This is a five-minute sketch of a custom mbox that solves the "N tasks, M workers" problem. You have to pass a reference to an instance of this class to all your workers, and each worker should call the ready method to inform the mbox that there is one more free worker.

class workload_distributing_mbox_t : public so_5::abstract_message_box_t {
   // Every mbox should have a reference to SOEnv and own id.
   so_5::environment_t & m_env;
   so_5::mbox_id_t m_id;

   // We have to use mutex to protect the content.
   std::mutex m_lock;

   // We need a container for pending messages.
   struct pending_msg_info_t {
      std::type_index m_type;
      so_5::message_ref_t m_message;
   };
   std::queue<pending_msg_info_t> m_pending_messages;

   // We need a container for direct mboxes of free workers.
   std::queue<so_5::mbox_t> m_free_workers;

public:
   workload_distributing_mbox_t(
      so_5::environment_t & env,
      so_5::mbox_id_t id)
      :  m_env{env}, m_id{id}
   {}

   so_5::mbox_id_t
   id() const override { return m_id; }

   // Do not support subscription. Just throw an exception.
   void
   subscribe_event_handler(
      const std::type_index & /*type_index*/,
      const so_5::message_limit::control_block_t * /*limit*/,
      so_5::agent_t & /*subscriber*/ ) override
   {
      throw std::runtime_error("subscription is not supported");
   }

   // Do not support subscription. Just throw an exception.
   void
   unsubscribe_event_handlers(
      const std::type_index & /*type_index*/,
      so_5::agent_t & /*subscriber*/ ) override
   {
      throw std::runtime_error("unsubscription is not supported");
   }

   std::string
   query_name() const override
   {
      return "workload_distribution_mbox:" + std::to_string(m_id);
   }

   // It's multi-producer but single-consumer mbox.
   so_5::mbox_type_t
   type() const override
   {
      return so_5::mbox_type_t::multi_producer_single_consumer;
   }

   // The implementation below.
   void
   do_deliver_message(
      const std::type_index & msg_type,
      const so_5::message_ref_t & message,
      unsigned int overlimit_reaction_deep ) override;

   // The implementation below.
   void
   ready(
      so_5::agent_t & free_worker );

   // Delivery filters are not supported. So just throw an exception.
   void
   set_delivery_filter(
      const std::type_index & /*msg_type*/,
      const so_5::delivery_filter_t & /*filter*/,
      so_5::agent_t & /*subscriber*/ ) override
   {
      throw std::runtime_error("delivery-filters are not supported");
   }

   void
   drop_delivery_filter(
      const std::type_index & /*msg_type*/,
      so_5::agent_t & /*subscriber*/ ) noexcept override
   {
      // Just do nothing. This method won't be called.
   }

   so_5::environment_t &
   environment() const noexcept override
   {
      return m_env;
   }
};

void
workload_distributing_mbox_t::do_deliver_message(
   const std::type_index & msg_type,
   const so_5::message_ref_t & message,
   unsigned int overlimit_reaction_deep )
{
   std::lock_guard<std::mutex> lock{ m_lock };

   // If there is a free worker then deliver the message to it
   // via its direct mbox.
   if( !m_free_workers.empty() )
   {
      so_5::mbox_t dest_mbox{ std::move(m_free_workers.front()) };
      m_free_workers.pop();

      dest_mbox->do_deliver_message( msg_type, message, overlimit_reaction_deep );
   }
   else
   {
      // Otherwise just store message to be sent later.
      m_pending_messages.push( pending_msg_info_t{ msg_type, message } );
   }
}

void
workload_distributing_mbox_t::ready(
   so_5::agent_t & free_worker )
{
   so_5::mbox_t worker_mbox = free_worker.so_direct_mbox();

   std::lock_guard<std::mutex> lock{ m_lock };

   // If there are some pending messages then send the first one.
   if( !m_pending_messages.empty() )
   {
      pending_msg_info_t msg_info{ std::move(m_pending_messages.front()) };
      m_pending_messages.pop();

      worker_mbox->do_deliver_message(
            msg_info.m_type,
            msg_info.m_message,
            0u /* no overlimit handling here*/ );
   }
   else
   {
      // Otherwise we have to store this worker.
      m_free_workers.emplace( std::move(worker_mbox) );
   }
}

I would probably also mark ready as noexcept, because if this method throws there is no easy way to repair the situation.

This is just a sketch, but I think it should work.
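
For experimenting with the distribution logic in isolation, the same ready/deliver bookkeeping can be modelled without SObjectizer. In this stand-alone sketch a worker is just a callback and a message is just a string; distributor and its members are hypothetical names.

```cpp
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Hypothetical model of the distribution logic only.
class distributor {
public:
    using worker = std::function<void(const std::string&)>;

    // A message goes to a free worker if there is one, otherwise it waits.
    void deliver(std::string msg) {
        std::lock_guard<std::mutex> g{lock_};
        if (!free_workers_.empty()) {
            worker w = std::move(free_workers_.front());
            free_workers_.pop();
            w(msg);
        } else {
            pending_.push(std::move(msg));
        }
    }

    // A worker announcing readiness gets the oldest pending message,
    // or is parked until the next deliver().
    void ready(worker w) {
        std::lock_guard<std::mutex> g{lock_};
        if (!pending_.empty()) {
            std::string msg = std::move(pending_.front());
            pending_.pop();
            w(msg);
        } else {
            free_workers_.push(std::move(w));
        }
    }

private:
    // The callbacks run under this lock, so they must not call back
    // into the distributor (the mutex is not recursive).
    std::mutex lock_;
    std::queue<std::string> pending_;
    std::queue<worker> free_workers_;
};
```

As in the mbox sketch, a worker receives at most one message per ready call, so it has to announce itself again after finishing each job.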

Marco Arena · Answer 8 · Fri Jan 22 2021 22:09:29 GMT+0800 (China Standard Time)

Thanks a lot for this and for all the information! I'll check it out.

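Before reading both your messages, I came up with the following usage of mchains for my scenario:

#include <so_5/all.hpp>

#include <chrono>
#include <iostream>
#include <string>
#include <thread>

// Needed for the 0ms, 50ms, 150ms and "first"s literals below.
using namespace std::literals;

struct acquired_value {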
    std::chrono::steady_clock::time_point acquired_at_;
    int value_;
};

class producer final : public so_5::agent_t {
    so_5::mchain_t chain_;
    so_5::timer_id_t timer_;
    int counter_{};

    struct acquisition_time final : public so_5::signal_t {};

    void on_timer(mhood_t<acquisition_time>) 
    {
         so_5::send<acquired_value>(chain_, std::chrono::steady_clock::now(), ++counter_);
    }

public:
    producer(context_t ctx, so_5::mchain_t board)
        : so_5::agent_t{ std::move(ctx) }
        , chain_{ std::move(board) }
    {}

    void so_define_agent() override {
        so_subscribe_self().event(&producer::on_timer);
    }

    void so_evt_start() override {
        timer_ = so_5::send_periodic<acquisition_time>(*this, 0ms, 50ms);
    }
};

class consumer final : public so_5::agent_t {
    const so_5::mchain_t board_;
    const std::string name_;

    void run(bool)
    {
        so_5::receive(from(board_).handle_all(), [this](const acquired_value& r) {
            if (name_ == "first")
            {
                // simulate some delay on first worker
                std::this_thread::sleep_for(150ms);
            }
            std::cout << std::this_thread::get_id() << " " << name_ << ": " << r.value_ << std::endl;
        });
    }

public:
    consumer(context_t ctx, so_5::mchain_t board, std::string name)
        : so_5::agent_t{ std::move(ctx) }
        , board_{ std::move(board) }
        , name_{ std::move(name) }
    {}

    void so_define_agent() override 
    {
        so_subscribe_self().event(&consumer::run);
    }

    void so_evt_start() override
    {
        so_5::send<bool>(*this, true);
    }
};

int main() {
    so_5::launch([](so_5::environment_t& env) {
        so_5::mchain_params_t params{ so_5::mchain_props::capacity_t{} };
        so_5::mchain_t board = env.create_mchain(params);
        env.introduce_coop(so_5::disp::active_obj::make_dispatcher(env).binder(), [board](so_5::coop_t& coop) {
            coop.make_agent<producer>(board);
            coop.make_agent<consumer>(board, "first"s);
            coop.make_agent<consumer>(board, "second"s);
        });

        std::this_thread::sleep_for(std::chrono::seconds(4));
        env.stop();
        so_5::close_retain_content(board);
    });

    return 0;
}

Sending a fake boolean is needed just to avoid blocking in the initialization of each consumer agent (I don't know whether some sort of "run" function already exists). Once the boolean is handled, the agent blocks while draining the chain. As-is, I think this approach cannot work with only one thread, since receive blocks.

However, I cannot rely on a blocking send for my use case, so good to know about that. I will check your implementation in detail.

Many thanks for your help!

eao197 · Answer 9 · Fri Jan 22 2021 22:50:18 GMT+0800 (China Standard Time)

Sending a fake boolean is needed just to avoid blocking in the initialization of each consumer agent

Yes, it's the right approach.

As-is, I think this approach cannot work with only one thread, since receive blocks.

Yes, you have to bind your consumers to different contexts (for example, to different instances of the one_thread dispatcher, or to the active_obj dispatcher).

However, I cannot rely on a blocking send for my use case, so good to know about that. I will check your implementation in detail.

The relevant information is here: https://github.com/Stiffstream/sobjectizer/wiki/SO-5.7-InDepth-Message-Chains#types-of-mchains
A send to a full size-limited mchain will block. Size-unlimited mchains do not block senders.
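
The difference between the two kinds of chain can be illustrated with a small standard-library model (C++17). It is only a sketch: bounded_chain is a hypothetical name, and treating capacity 0 as "size-unlimited" is a convention of this example, not something taken from SObjectizer.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <optional>
#include <queue>

// Hypothetical queue; capacity 0 means "size-unlimited" in this sketch.
class bounded_chain {
    std::mutex lock_;
    std::condition_variable has_space_;
    std::queue<int> items_;
    std::size_t capacity_;

public:
    explicit bounded_chain(std::size_t capacity) : capacity_{capacity} {}

    // Waits up to `timeout` for free space; returns false if the
    // queue stayed full for the whole time.
    bool send(int v, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> g{lock_};
        auto has_room = [this] {
            return capacity_ == 0 || items_.size() < capacity_;
        };
        if (!has_space_.wait_for(g, timeout, has_room)) return false;
        items_.push(v);
        return true;
    }

    // Non-blocking read; wakes one sender that may be waiting for space.
    std::optional<int> try_receive() {
        std::lock_guard<std::mutex> g{lock_};
        if (items_.empty()) return std::nullopt;
        int v = items_.front();
        items_.pop();
        has_space_.notify_one();
        return v;
    }
};
```

With a capacity of 1, a second send has to wait for a reader and gives up after the timeout; with capacity 0 the predicate is always true, so send never waits.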

Marco Arena · Answer 10 · Fri Jan 22 2021 23:02:38 GMT+0800 (China Standard Time)

Great to know then! Size-unlimited mchains fit my scenario well (they are conceptually similar to PPL's unbounded_buffer that my project makes use of).

Many thanks @eao197 for your in-depth replies! I am really grateful for your help.

eao197 · Answer 11 · Fri Jan 22 2021 23:23:38 GMT+0800 (China Standard Time)

I hope you will find SObjectizer to be a useful tool for your task. Feel free to ask more if you encounter some problems or dark corners.

May I ask you to share a reference to the SObjectizer project somewhere like Twitter/LinkedIn/Facebook/etc.? SObjectizer is a mature project with a long history, but it isn't widely known outside the Russian segment of the Internet. References from those who take a look at it can change this situation.

Marco Arena · Answer 12 · Fri Jan 22 2021 23:44:12 GMT+0800 (China Standard Time)

I think this project is awesome, thanks a lot for making it real! I will be happy to spread something on my channels.

Let me try something more: would you (or someone from the SObjectizer community) be interested in participating in a meetup of my community? We are open to proposals and I think having a talk about SObjectizer would be amazing! We can discuss this via DM if you are interested!

eao197 · Answer 13 · Sat Jan 23 2021 14:31:12 GMT+0800 (China Standard Time)

I thought about it for some time. If you need a simple agent that starts, then only calls receive(from(board_).handle_all()...) and occupies the worker thread it's bound to, it can be done without an additional message and a separate run method:

class consumer final : public so_5::agent_t {
    const so_5::mchain_t board_;
    const std::string name_;

public:
    consumer(context_t ctx, so_5::mchain_t board, std::string name)
        : so_5::agent_t{ std::move(ctx) }
        , board_{ std::move(board) }
        , name_{ std::move(name) }
    {}

    void so_evt_start() override
    {
        so_5::receive(from(board_).handle_all(), [this](const acquired_value& r) {
            if (name_ == "first")
            {
                // simulate some delay on first worker
                std::this_thread::sleep_for(150ms);
            }
            std::cout << std::this_thread::get_id() << " " << name_ << ": " << r.value_ << std::endl;
        });
    }
};

so_evt_start is the first method of an agent called on its working context after the agent has been registered successfully. If there are no other agents on that context, you can do everything right in so_evt_start.

We are open to proposals and I think having a talk about SObjectizer would be amazing!

Thanks for that opportunity. We'll discuss it inside our team and file a proposal next week.

Marco Arena · Answer 14 · Sun Jan 24 2021 19:54:45 GMT+0800 (China Standard Time)

Many thanks @eao197 , I'll give it a try!

Thanks for that opportunity. We'll discuss it inside our team and file a proposal next week.

Wow! That's awesome. Looking forward to it!

Marco Arena · Answer 15 · Tue Feb 09 2021 16:02:06 GMT+0800 (China Standard Time)

Hi again! I have another question, but I'll open a new issue and close this one.
Many thanks for your help here, everything works smoothly!