ostinelli / syn

A scalable global Process Registry and Process Group manager for Erlang and Elixir.

nodedown should trigger callbacks

roylez opened this issue

I find that when a node goes down, neither on_process_exit nor on_group_process_exit is triggered. Is this behaviour by design, or am I doing something wrong?

Hi, as per the README, those methods are called only on the node where the process was running. Processes are only monitored on the node they're running on by design.

What is your use case?

I'm facing the same problem. Just think of an IM application. You have chat between users (which is possible using syn) and you have presence (letting a user know which other users are online or offline). Presence is not possible because you are not aware when a pid joins/leaves a group.

In my case I forked the repository and added callbacks in add_to_local_table and remove_from_local_table. This works pretty nicely.

To be more explicit:

In ejabberd, each user can connect from multiple instances. Using syn, I create a group with the username as the group name, and the instances are the pids inside the group. When user "a" wants to send a message to user "b", the message is simply sent to all the pids (devices) inside the group "b". But I also want to know when a user connects or disconnects, so that I can show it properly when users go offline/online.

I have a server monitoring client nodes elsewhere. The clients are all in one group, and ideally I could trigger some callbacks on the server when they join/leave the group. I have read in previous issues that join callbacks are not implemented by design, because at joining time we have a lot of flexibility to do whatever we want, but lacking an easy way to detect client disconnection on server side is not convenient. In my use case, clients are not expected to disconnect gracefully.

I have a server monitoring client nodes elsewhere. The clients are all in one group

@roylez what do you mean that you are monitoring "client nodes" elsewhere, that are all in one group? I'm not following, since syn does not monitor nodes.

but lacking an easy way to detect client disconnection on server side is not convenient

again, not sure what you mean with "client". Can you clarify?

again, not sure what you mean with "client". Can you clarify?

Sorry I did not make myself clear. Some nodes are made to be the "servers", and they manage other nodes in the "client" group. Clients could connect and disconnect, sometimes without notice, and servers are expected to act accordingly on server side when clients join or leave.

Some nodes are made to be the "servers", and they manage other nodes in the "client" group. Clients could connect and disconnect, sometimes without notice, and servers are expected to act accordingly on server side when clients join or leave.

@roylez, syn registers PIDs, not nodes. Syn cares about processes being alive, not "clients being connected". So unless I understand what you are referring to there isn't much I can do.

Presence is not possible because you are not aware when a pid joins/leaves a group.

@silviucpp you are aware when a process joins because your code is responsible for making a process join a group. You are aware when a PID leaves a group because the appropriate callback is called.

Now, you've raised the question about netsplits in another thread that you then closed to go on with your own version. I've taken a note and will solve this as well, which BTW is a rather rare condition so it shouldn't stop you from doing whatever you are doing.

@ostinelli I think the discussion is too complex. There are a lot of corner cases that need to be covered if the callbacks are missing from syn.

For example:

  1. "you are aware when a process joins because your code " - yes, you are right. But without a callback in syn I also need to implement the logic that makes all nodes in the cluster aware about the join. That is 2x traffic for reinventing the wheel, as the entire mechanism already exists in syn.

  2. Netsplit is not that rare when your nodes are geographically distributed.

Anyway I resolved my problem as I told you by adding callbacks inside add_to_local_table and remove_from_local_table and this is also the reason for closing the previous ticket.

"2x traffic"? I love how you open a ticket, close it, create your own solution without considering to participate in a community solution, hi-jack another thread, then assert that with the existing syn implementation your logic would need to "make all nodes in the cluster aware about the join" (why?) and hence double the traffic ¯\_(ツ)_/¯

Let's leave it like this, anyway thank you for your input as I will include some options for improved callbacks.

@roylez, syn registers PIDs, not nodes. Syn cares about processes being alive, not "clients being connected". So unless I understand what you are referring to there isn't much I can do.

Yes, indeed syn only cares about pids. But when nodes disconnect, their pids are removed from local tables as well. In this case, callbacks are not fired, because their nodes are gone. But this is a valid use case, as people may want to do something when this happens.
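For reference, plain OTP can report node disconnections independently of syn. The sketch below is a hypothetical helper (the module name `node_watcher` and the shape of the callback are assumptions, not part of syn): it subscribes to node status messages with `net_kernel:monitor_nodes/1` and forwards each `nodeup`/`nodedown` event to a fun supplied by the caller.

```erlang
-module(node_watcher).
-export([start/1]).

%% Hypothetical helper, not part of syn.
%% Spawns one long-lived process that subscribes to node status changes
%% and calls Callback(Event, Node) for every nodeup / nodedown message.
start(Callback) when is_function(Callback, 2) ->
    spawn(fun() ->
        %% After this call, the calling process receives
        %% {nodeup, Node} and {nodedown, Node} messages.
        ok = net_kernel:monitor_nodes(true),
        loop(Callback)
    end).

loop(Callback) ->
    receive
        {nodeup, Node} ->
            Callback(nodeup, Node),
            loop(Callback);
        {nodedown, Node} ->
            Callback(nodedown, Node),
            loop(Callback)
    end.
```

This only says that a node was lost, not which group members lived on it, so the application would still have to map the node back to the affected pids itself.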

@ostinelli I don't think this is the place to be rude to each other. I opened a ticket, provided my use case, and closed it because I didn't get any feedback on it for 15 days. I thought you were not interested in adding more callbacks or covering my use case, so what was the point of keeping the ticket open? It's your project (good job btw) and you are the one deciding which ideas from the community you want to incorporate. I don't consider that I hi-jacked another thread, as the subject is more or less the same. on_group_process_exit is for sure a bit pointless in many situations, as it is triggered only on the local node where the process was created.

Regarding "2x traffic", I was referring strictly to the fact that when a process joins/leaves a group, syn already sends this information to all nodes in the cluster. Because my app has no visibility into this through callbacks from syn, I would have to implement the same logic in my app.

Maybe I was not clear enough. Your argument was "you are aware when a process joins because your code is responsible for making a process join a group." But if we follow this argument, my app can very simply make an API call to monitor the process (my app creates the process), and for this reason I consider on_group_process_exit pointless. I think one of the hardest parts that syn already covers is keeping the sessions in sync between nodes. It would be nice if you could provide callbacks so we know when something changes in the sessions.
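To make the "monitor the process" alternative concrete, here is a minimal sketch that uses only `erlang:monitor/2` from OTP. The module name `member_watcher` and the callback signature are hypothetical, chosen for illustration. The watcher can run on any node: if the monitored pid lives on a remote node that disconnects, the `'DOWN'` message arrives with reason `noconnection`.

```erlang
-module(member_watcher).
-export([start/2]).

%% Hypothetical helper, not part of syn.
%% Watches Pid from the calling node and invokes Callback(Pid, Reason)
%% once, when the process exits or its node becomes unreachable.
start(Pid, Callback) when is_pid(Pid), is_function(Callback, 2) ->
    spawn(fun() ->
        Ref = erlang:monitor(process, Pid),
        receive
            {'DOWN', Ref, process, Pid, Reason} ->
                %% Reason is noconnection if the remote node was lost,
                %% noproc if the process was already dead when monitored,
                %% otherwise the exit reason of the process.
                Callback(Pid, Reason)
        end
    end).
```

Note that this covers only exits. Learning about joins on other nodes still requires either application-level messaging or callbacks from the registry, which is the point being argued above.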

Another suggestion I have, after testing with 10M sessions on 10 nodes, is to avoid doing a spawn for each callback. This creates a lot of overhead, with many short-lived processes being created and destroyed when there are many callbacks. Maybe it would be better to use a pool of gen_event managers?
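As a rough illustration of that idea, the sketch below routes callback events through one long-lived `gen_event` manager instead of spawning a process per callback. It is not how syn dispatches callbacks; the module name `callback_events` and its small API are assumptions made for the example.

```erlang
-module(callback_events).
-behaviour(gen_event).

-export([start_link/1, notify/2]).
-export([init/1, handle_event/2, handle_call/2]).

%% Hypothetical dispatcher, not part of syn.
%% Starts one event manager and installs this module as its handler.
%% Fun is called with every event passed to notify/2.
start_link(Fun) when is_function(Fun, 1) ->
    {ok, Manager} = gen_event:start_link(),
    ok = gen_event:add_handler(Manager, ?MODULE, Fun),
    {ok, Manager}.

%% Asynchronous: returns ok immediately, the handler runs in the manager.
notify(Manager, Event) ->
    gen_event:notify(Manager, Event).

%% gen_event callbacks. The handler state is just the user-supplied fun.
init(Fun) ->
    {ok, Fun}.

handle_event(Event, Fun) ->
    Fun(Event),
    {ok, Fun}.

handle_call(_Request, Fun) ->
    {ok, ok, Fun}.
```

The trade-off is that a single manager runs callbacks one at a time, so a slow callback delays the rest. A pool would start several such managers and pick one per event, for example by hashing the group name.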

@roylez due to your request and other considerations I've been working on syn v3, which defines a new callback mechanism that should address your original ask. These callbacks will be called on every node, not only during standard operations but also during netsplits.

I'll need a little time to complete, but it will be in this new version. Thank you for your suggestion.