Unable to checkout db connection for Broadway

Question

michaelst opened this issue 3 years ago

As far as I can tell, I have followed the docs correctly to check out the DB connection and allow Broadway to use it. However, I keep getting a DBConnection.OwnershipError:

13:39:30.453 [basic_console_logger] [error] ** (DBConnection.OwnershipError) cannot find ownership process for #PID<0.606.0>.

Here is some of the code for that test.

{:ok, pid} = PubSubBackup.Consumer.start_link(name: __MODULE__)
Sandbox.allow(Repo, self(), pid)

ref = Broadway.test_batch(pid, [message1, message2, message3])
assert_receive {:ack, ^ref, [_, _, _] = data, []}, 2000

When I look up that process in the process list, it shows this:

[
     registered_name: PubSubBackup.ConsumerTest.Broadway.BatchProcessor_default_0,
     current_function: {:gen_server, :loop, 7},
     initial_call: {:proc_lib, :init_p, 5},
     status: :waiting,
     message_queue_len: 0,
     links: [#PID<0.605.0>],
     dictionary: [
       "$ancestors": [PubSubBackup.ConsumerTest.Broadway.BatchProcessorSupervisor_default,
        PubSubBackup.ConsumerTest.Broadway.BatcherSupervisor_default,
        PubSubBackup.ConsumerTest.Broadway.BatchersSupervisor,
        PubSubBackup.ConsumerTest.Broadway.Supervisor,
        PubSubBackup.ConsumerTest, #PID<0.572.0>],
       "$initial_call": {GenStage, :init, 1},
       rand_seed: {%{
          bits: 58,
          jump: #Function<3.47293030/1 in :rand."-fun.exsplus_jump/1-">,
          next: #Function<0.47293030/1 in :rand."-fun.exsss_next/1-">,
          type: :exsss,
          uniform: #Function<1.47293030/1 in :rand."-fun.exsss_uniform/1-">,
          uniform_n: #Function<2.47293030/2 in :rand."-fun.exsss_uniform/2-">
        }, [235498787232554556 | 20651838642852265]}
     ],
     trap_exit: true,
     error_handler: :error_handler,
     priority: :normal,
     group_leader: #PID<0.64.0>,
     total_heap_size: 986,
     heap_size: 376,
     stack_size: 12,
     reductions: 299,
     garbage_collection: [
       max_heap_size: %{error_logger: true, kill: true, size: 0},
       min_bin_vheap_size: 46422,
       min_heap_size: 233,
       fullsweep_after: 65535,
       minor_gcs: 1
     ],
     suspending: []
   ]

José Valim · Answer 1 · Sat Feb 27 2021 05:03:12 GMT+0800 (China Standard Time)

That won’t work because Broadway returns the root of its supervision tree, not the processes doing the actual work. You will have to mark your Broadway tests as sync for now.
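A minimal sketch of the synchronous setup, assuming the repo module is `PubSubBackup.Repo` (the original post only shows `Repo`). In shared mode every process, including Broadway's processors, uses the connection checked out by the test, which is why the test case cannot be async.

```elixir
defmodule PubSubBackup.ConsumerTest do
  use ExUnit.Case, async: false

  alias Ecto.Adapters.SQL.Sandbox
  # Assumed repo module name; adjust to the application's repo.
  alias PubSubBackup.Repo

  setup do
    # The test process checks out the connection...
    :ok = Sandbox.checkout(Repo)
    # ...and shares it with every other process for the duration of the test.
    Sandbox.mode(Repo, {:shared, self()})
    :ok
  end
end
```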

Stefan Chrobot · Answer 2 · Tue Mar 23 2021 19:37:46 GMT+0800 (China Standard Time)

What would be required to support async Broadway tests that make use of SQL Sandbox? Maybe one way to do it would be to publicly expose Topology.process_name(s) so that the tests could allow them all.

José Valim · Answer 3 · Tue Mar 23 2021 20:20:09 GMT+0800 (China Standard Time)

@stefanchrobot I would investigate supporting the $callers API.

Michael St Clair · Answer 4 · Tue Mar 23 2021 21:40:06 GMT+0800 (China Standard Time)

What I ended up implementing was passing a function into the context of start_link so we can call allow in the pipeline.

    setup context do
      self = self()

      allow = fn pid ->
        Sandbox.allow(Genesis.Repo, self, pid)
      end

      {:ok, _pid} = Consumer.start_link(name: context.test, context: %{allow: allow})

      :ok
    end
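On the pipeline side, the callbacks can then invoke that function from whichever process is handling the message. A minimal sketch, assuming the context map from the setup above; `SandboxAllowance` and `maybe_allow/1` are hypothetical names, and the commented-out callbacks only illustrate where a Broadway module might call it.

```elixir
defmodule SandboxAllowance do
  # Test context: call the injected function with the current process, which
  # is the processor or batch processor that will run the queries.
  def maybe_allow(%{allow: allow}) when is_function(allow, 1) do
    allow.(self())
    :ok
  end

  # Any other context (for example in production, with no :allow key):
  # nothing to do.
  def maybe_allow(_context), do: :ok
end

# Possible use inside the Broadway module (callback bodies are illustrative):
#
#   def handle_message(_processor, message, context) do
#     SandboxAllowance.maybe_allow(context)
#     message
#   end
#
#   def handle_batch(_batcher, messages, _batch_info, context) do
#     SandboxAllowance.maybe_allow(context)
#     messages
#   end
```

As far as I know, calling `Sandbox.allow/3` again for a process that is already allowed returns `{:already, :allowed}` rather than raising, so invoking it for every message should be harmless.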

Stefan Chrobot · Answer 5 · Sat Mar 27 2021 02:46:07 GMT+0800 (China Standard Time)

@josevalim I'd like to take a stab at this. Is the $callers API something that has some sort of spec, or should I just look at Task as a reference implementation? Should the change happen somewhere around Topology, or should I have a look at applying this to GenStage?

José Valim · Answer 6 · Sat Mar 27 2021 03:15:40 GMT+0800 (China Standard Time)

It is definitely a Broadway thing. We should pass the caller as part of the message metadata. Then inside handle_message and handle_batch we look at this metadata and set the caller in the process dictionary accordingly and then revert it.

Then in all test messages we include the relevant caller metadata.
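A minimal sketch of the set-and-revert step described above, in plain Elixir with no Broadway dependency. `$callers` is the process-dictionary key that `Task` populates and that the ownership lookup behind the Ecto SQL sandbox consults; the module and function names (`CallerScope`, `with_caller/2`) are hypothetical and not part of Broadway.

```elixir
defmodule CallerScope do
  # Runs `fun` with `caller` prepended to this process's `$callers` list,
  # then restores the previous value, even if `fun` raises.
  def with_caller(caller, fun) when is_pid(caller) and is_function(fun, 0) do
    previous = Process.get(:"$callers")
    Process.put(:"$callers", [caller | List.wrap(previous)])

    try do
      fun.()
    after
      case previous do
        # The key was not set before, so remove it again.
        nil -> Process.delete(:"$callers")
        _ -> Process.put(:"$callers", previous)
      end
    end
  end
end
```

Inside `handle_message` or `handle_batch`, the caller would come from the message metadata instead of being passed in directly.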

Stefan Chrobot · Answer 7 · Thu Apr 08 2021 06:33:16 GMT+0800 (China Standard Time)

@josevalim

"We should pass the caller as part of the message metadata."

Should the caller metadata be included in all messages or only those pushed via test_message and test_batch?

José Valim · Answer 8 · Thu Apr 08 2021 14:36:37 GMT+0800 (China Standard Time)

@stefanchrobot only on test_message/test_batch IMO.

José Valim · Answer 9 · Tue Aug 24 2021 21:33:31 GMT+0800 (China Standard Time)

I have struggled to support this in Broadway out of the box but I believe I have found a reasonable way to enable this with the tools available today. When you send a test message, you can include additional metadata:

Broadway.test_message(MyPipeline, message, metadata: %{caller: self()})

Now you can use the telemetry events, which run in each processor and batch processor process, to customize the ownership:

# In your test/test_helper.exs
defmodule BroadwayEctoSandbox do
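  # Attach once per repo. The argument must be a variable (`repo`), not the
  # alias `Repo`, otherwise `repo` below is undefined.
  def attach(repo) do
    events = [
      [:broadway, :processor, :start],
      [:broadway, :batch_processor, :start]
    ]

    # A remote capture avoids telemetry's warning about local function handlers.
    :telemetry.attach_many({__MODULE__, repo}, events, &__MODULE__.handle_event/4, %{repo: repo})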
  end

  def handle_event(_event_name, _event_measurement, %{messages: messages}, %{repo: repo}) do
    with [%Broadway.Message{metadata: %{caller: caller}} | _] <- messages do
      Ecto.Adapters.SQL.Sandbox.allow(repo, caller, self())
    end

    :ok
  end
end

BroadwayEctoSandbox.attach(MyRepo)
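
With the handler attached, an async test might look like the following. This is a sketch, assuming `MyPipeline` and `MyRepo` from the snippets above and a pipeline that is already started; the payload `"some data"` is a placeholder.

```elixir
defmodule MyPipelineTest do
  use ExUnit.Case, async: true

  setup do
    # The test process owns the connection; the telemetry handler allows
    # each Broadway process that receives one of this test's messages.
    :ok = Ecto.Adapters.SQL.Sandbox.checkout(MyRepo)
  end

  test "processes a message using the test's sandbox connection" do
    # Assumes MyPipeline is already running, e.g. under the application supervisor.
    ref = Broadway.test_message(MyPipeline, "some data", metadata: %{caller: self()})

    # Broadway's test acknowledger reports successful and failed messages.
    assert_receive {:ack, ^ref, [%Broadway.Message{}], []}, 2000
  end
end
```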