ostinelli / syn

A scalable global Process Registry and Process Group manager for Erlang and Elixir.

Event handler not called

lud opened this issue

Hi, I was trying to see if I could properly shut down a process on name conflict (I need it to finish its work before being killed).

But I cannot get my event handler to be called. (By the way, the docs say that syn:set_event_handler/1 returns ok, but that is not the case.)

Here is my test code. I run it in two shells, as iex --sname alice tt.exs and the equivalent "bob" version, and then call H.connect/0.

Mix.install([{:syn, "~> 3.0"}])

defmodule X do
  use GenServer

  def init(_), do: {:ok, []}

  def handle_info(info, state) do
    info |> IO.inspect(label: ~S[info])
    {:noreply, state}
  end

  def terminate(reason, state) do
    reason |> IO.inspect(label: ~S[reason])
    {:noreply, state}
  end
end

defmodule Ev do
  @behaviour :syn_event_handler
  def resolve_registry_conflict(name, pid_a, pid_b) do
    binding() |> IO.inspect(label: ~S[resolve_registry_conflict])
    pid_a |> IO.inspect(label: ~S[keep])
    raise "not called"
  end

  def on_process_joined(scope, group_name, pid, meta, reason) do
    binding() |> IO.inspect(label: ~S[on_process_joined])
  end
end

defmodule H do
  def connect do
    case node() do
      :"alice@lud-elitebook" -> :"bob@lud-elitebook"
      :"bob@lud-elitebook" -> :"alice@lud-elitebook"
    end
    |> Node.connect()
  end
end

[_ | _] = :syn.set_event_handler(Ev)
:syn.add_node_to_scopes([:spool_scope])
GenServer.start_link(X, nil, name: {:via, :syn, {:spool_scope, :myspool}})

Can you see if I am doing something wrong?

Thank you.

Hello, I don’t see how you are creating a conflict. You need a highly concurrent situation to trigger one, and that can very rarely be produced by hand (I, for one, have never been able to). You can see an example that is generally able to create such a race in this test.

BTW on_process_joined/5 is for PG, not registry.

Thank you for your report on the return value, you are indeed correct.

To generate the conflict I just run iex --sname alice tt.exs and iex --sname bob tt.exs in two shells.

This is the output I get in the alice shell:

20:31:30.985 [info]  SYN[alice@lud-elitebook] Adding node to scope <spool_scope>
20:31:30.997 [info]  SYN[alice@lud-elitebook] Creating tables for scope <spool_scope>
20:31:31.005 [info]  SYN[alice@lud-elitebook|registry<spool_scope>] Discovering the cluster
20:31:31.008 [info]  SYN[alice@lud-elitebook|pg<spool_scope>] Discovering the cluster

Then I manually connect the two nodes:

iex(alice@lud-elitebook)1>H.connect()
20:31:47.321 [info]  SYN[alice@lud-elitebook|pg<spool_scope>] Node bob@lud-elitebook has joined the cluster, sending discover message
20:31:47.321 [info]  SYN[alice@lud-elitebook|registry<spool_scope>] Node bob@lud-elitebook has joined the cluster, sending discover message
20:31:47.321 [info]  SYN[alice@lud-elitebook|pg<spool_scope>] Received DISCOVER request from node bob@lud-elitebook
20:31:47.321 [info]  SYN[alice@lud-elitebook|registry<spool_scope>] Received DISCOVER request from node bob@lud-elitebook
20:31:47.322 [info]  SYN[alice@lud-elitebook|registry<spool_scope>] Received ACK SYNC (1 entries) from node bob@lud-elitebook
true
iex(alice@lud-elitebook)2> 
20:31:47.322 [info]  SYN[alice@lud-elitebook|pg<spool_scope>] Received ACK SYNC (0 entries) from node bob@lud-elitebook
20:31:47.330 [info]  SYN[alice@lud-elitebook|registry<spool_scope>] Registry CONFLICT for name :myspool: {#PID<10953.198.0>, :undefined} vs {#PID<0.198.0>, :undefined} -> keeping remote: #PID<10953.198.0>

And as you can see the conflict is resolved but my handler is not called.

Got it, thank you for the clarification. The issue is in the callback specification docs, which are wrong. The callback is resolve_registry_conflict/4 (arity 4, not 3).

I've just pushed v3.0.1 to fix both the ok return value and the wrong specs. See the new docs here:
https://hexdocs.pm/syn/syn_event_handler.html#c:resolve_registry_conflict/4

Let me know if this solves your issue.

Thank you very much! I think now I may have found a bug. If you use my updated code below, start the two shells, call X.start_link on both shells, and then call H.connect, it looks like the name resolution and the event handler run properly but none of the processes are killed.

Mix.install([{:syn, "~> 3.0.1"}])

defmodule X do
  use GenServer

  def init(_) do
    :timer.send_interval(1000, :print_node)
    {:ok, []}
  end

  def handle_info(:print_node, state) do
    node() |> IO.inspect(label: ~S[node()])
    {:noreply, state}
  end

  def handle_info(info, state) do
    info |> IO.inspect(label: ~S[info])
    {:noreply, state}
  end

  def terminate(reason, state) do
    reason |> IO.inspect(label: ~S[reason])
    {:noreply, state}
  end

  def handle_call(request, _from, state) do
    {:reply, {:echo, request}, state}
  end

  # def name, do: {:global, :myspool}
  def name, do: {:via, :syn, {:spool_scope, :myspool}}
  def whereis, do: GenServer.whereis(name())
  def start_link, do: GenServer.start_link(__MODULE__, nil, name: name())
end

defmodule Ev do
  @behaviour :syn_event_handler
  def resolve_registry_conflict(scope, name, pid_a, pid_b) do
    binding() |> IO.inspect(label: ~S[resolve_registry_conflict])
    pid_a |> elem(0) |> IO.inspect(label: ~S[keep])
  end
end

defmodule H do
  def connect do
    case node() do
      :"alice@lud-elitebook" -> :"bob@lud-elitebook"
      :"bob@lud-elitebook" -> :"alice@lud-elitebook"
    end
    |> Node.connect()
  end
end

:ok = :syn.set_event_handler(Ev)
:syn.add_node_to_scopes([:spool_scope])

This is intended behavior.

If implemented, this method MUST return the pid() of the process that you wish to keep. The other process will not be killed, so you will have to decide what to do with it. If the custom conflict resolution method does not return one of the two Pids, or if the method crashes, none of the Pids will be killed and the conflicting name will be freed.

https://hexdocs.pm/syn/syn_event_handler.html#content

Isn't this what you originally desired, to have a graceful shutdown of the discarded process? :)

Thank you for your report!

Oh indeed, that will let me do what I want, so everything is fine. Thank you.

Isn't this what you originally desired, to have a graceful shutdown of the discarded process? :)

Absolutely ;)

Hi again. It seems to work but then GenServer.whereis(X.name()) returns nil on one of the two nodes (generally the one where the process was stopped, but not always) although the nodes are connected. X.name is {:via, :syn, {:spool_scope, :myspool}}.

It’s kind of hard to follow these manual experiments, since I don’t know the sequence and timing of your operations. There is a comprehensive test suite that covers registry resolution. If you think you have found a bug, please write a test that reproduces it and open a new issue so that I can address it. The existing tests provide various helpers for distributed testing. Thank you.

Sure, I understand, I'll try to make you a test.

Otherwise, a small easy example will do as well :)

Please note though that if you are using the code above it's normal that you are experiencing inconsistencies:

Important Note: the conflict resolution method will be called on the two nodes where the conflicting processes are running on. Therefore, this method MUST be defined in the same way across all nodes of the cluster and have the same effect regardless of the node it is run on, or you will experience unexpected results.

https://hexdocs.pm/syn/syn_event_handler.html

In the code above, you're always returning pid_a, which is not the same on every node, so your method returns different results depending on which node it runs on, and this breaks the consistency contract.
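For illustration, here is a minimal sketch of a resolver that picks the same pid regardless of argument order or of the node it runs on. It assumes each conflicting entry arrives as a {pid, meta, time} tuple, as the v3.0.1 docs linked above describe. The module name and the {:registry_conflict_lost, name} message are hypothetical, and the behaviour declaration is left as a comment so that the snippet compiles without syn loaded.

```elixir
defmodule DeterministicResolver do
  # In a real project this module would also declare
  # `@behaviour :syn_event_handler` and be passed to `:syn.set_event_handler/1`.

  # Assumption: each conflicting entry arrives as a {pid, meta, time} tuple.
  def resolve_registry_conflict(_scope, name, {pid_a, _meta_a, time_a}, {pid_b, _meta_b, time_b}) do
    # Sort on a key that is identical no matter which node evaluates it:
    # registration time first, owning node name as a tie-breaker.
    [{_key, keep}, {_other_key, discard}] =
      Enum.sort([
        {{time_a, node(pid_a)}, pid_a},
        {{time_b, node(pid_b)}, pid_b}
      ])

    # The callback runs on both nodes, so only the node hosting the
    # discarded process notifies it. The message shape is hypothetical.
    if node(discard) == node() do
      send(discard, {:registry_conflict_lost, name})
    end

    # Return the pid to keep; the other process is left alive on purpose.
    keep
  end
end
```

With something like this in place, the X GenServer could add a handle_info clause for the hypothetical {:registry_conflict_lost, name} message, finish its pending work there, and then return {:stop, :normal, state}.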

As an additional note: if you need to do some cleanup that doesn't have to be in the process itself, you may consider using on_process_unregistered/5 instead, so that you don't have to mess with registry resolution yourself.
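A rough sketch of that alternative follows. It assumes on_process_unregistered/5 takes the same kind of arguments as the on_process_joined/5 callback shown earlier in this thread, with the registered name in place of the group name. The module name and the cleanup body are placeholders, and the behaviour declaration is again left as a comment so that the snippet compiles without syn loaded.

```elixir
defmodule CleanupHandler do
  # In a real project this module would also declare
  # `@behaviour :syn_event_handler`.

  # Assumption: the arguments are (scope, name, pid, meta, reason).
  def on_process_unregistered(scope, name, pid, _meta, reason) do
    # Placeholder cleanup: a real handler might release external resources here.
    IO.puts(
      "cleanup for #{inspect(name)} in #{inspect(scope)}: " <>
        "#{inspect(pid)} unregistered (#{inspect(reason)})"
    )

    :ok
  end
end
```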