dashbitco / flow

Computational parallel flows on top of GenStage

Home Page: https://hexdocs.pm/flow

Unhelpful Error Message

TylerPachal opened this issue

Hello, and sorry for the vague issue title.

I have the following flow, which results in the error below.

Flow:

window = Flow.Window.global() |> Flow.Window.trigger_every(5)

0..99
|> Flow.from_enumerable(window: window)
|> Flow.map(fn number -> Process.sleep(100); IO.inspect(number, label: :number) end)
|> Flow.group_by(fn i -> rem(i, 2) == 0 end)
|> Flow.map(fn batch -> Process.sleep(100); IO.inspect(batch, label: :batch) end)
|> Flow.run()

Output:

number: 0
number: 1
number: 2
number: 3
number: 4
batch: {false, [3, 1]}
batch: {true, [4, 2, 0]}
number: 5

10:22:25.684 [error] GenServer #PID<0.193.0> terminating
** (BadMapError) expected a map, got: [false: [3, 1], true: [4, 2, 0]]
    (elixir 1.12.3) lib/map.ex:623: Map.update([false: [3, 1], true: [4, 2, 0]], false, [5], #Function<36.94148943/1 in Flow.group_by/3>)
    (flow 1.1.0) lib/flow/materialize.ex:643: Flow.Materialize."-build_reducer/2-lists^foldl/2-0-"/3
    (flow 1.1.0) lib/flow/materialize.ex:643: anonymous fn/5 in Flow.Materialize.build_reducer/2
    (flow 1.1.0) lib/flow/materialize.ex:553: Flow.Materialize.maybe_punctuate/10
    (flow 1.1.0) lib/flow/map_reducer.ex:59: Flow.MapReducer.handle_events/3
    (gen_stage 1.1.2) lib/gen_stage.ex:2471: GenStage.consumer_dispatch/6
    (gen_stage 1.1.2) lib/gen_stage.ex:2660: GenStage.take_pc_events/3
    (stdlib 3.13.2) gen_server.erl:680: :gen_server.try_dispatch/4
Last message: {:"$gen_consumer", {#PID<0.192.0>, #Reference<0.1202714464.28835843.142742>}, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, ...]}
State: {%{#Reference<0.1202714464.28835843.142742> => nil}, %{done?: false, producers: %{#Reference<0.1202714464.28835843.142742> => #PID<0.192.0>}, trigger: #Function<2.2490666/3 in Flow.Window.Global.materialize/5>}, {0, 12}, {5, %{}}, #Function<1.2490666/4 in Flow.Window.Global.materialize/5>}

If I change the last line of my flow from Flow.run() to Enum.to_list() then everything works fine, but this is not an option for me because I am dealing with an unbounded stream in my real code. I took a quick look at lib/flow/materialize.ex:643, but it wasn't clear to me why a Map was needed, or how changing Flow.run() to Enum.to_list() would prevent this from happening.

Version of flow is 1.1.0 and gen_stage is 1.1.2.

Hi @TylerPachal! Flow.group_by uses a map internally, and I assume that's why it is failing. I am not sure you can have a Flow.map after Flow.group_by, and that may be why it fails. I will investigate!
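The suspected mechanism can be sketched with Elixir's standard library alone. This is only an illustration, under the assumption that a map step applied to the group_by state behaves like Enum.map over a map; it is not Flow's actual internals. The variable names are hypothetical.

```elixir
# group_by keeps its state as a map of key => list of events.
state = %{false => [3, 1], true => [4, 2, 0]}

# Mapping over a map yields a list of {key, value} tuples,
# which Elixir prints as a keyword list.
mapped = Enum.map(state, fn batch -> batch end)

# The next event (5) is then folded into the "state" with Map.update/4,
# which raises because the state is no longer a map.
result =
  try do
    Map.update(mapped, false, [5], fn list -> [5 | list] end)
  rescue
    error in BadMapError -> error
  end

IO.inspect(mapped, label: :mapped)
IO.inspect(result, label: :result)
```

The shape of `mapped` matches the term reported in the BadMapError above.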

Yes, my assessment above is correct. Calling Flow.map/2 after a reduce changes the reduce state, which in turn changes the group_by state. We used to have Flow.map_state/2 for this, but I believe it had the same bug. I have deprecated map after reduce for now. You can use Flow.on_trigger instead.
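A minimal sketch of the suggested replacement, assuming :flow is available as a dependency and that each trigger should start a fresh batch. Flow.on_trigger/2 takes a callback that receives the accumulated state and returns a tuple with the list of events to emit and the new accumulator. Resetting the accumulator to an empty map is a choice made for this example, not something the thread prescribes; returning `groups` instead would keep accumulating across triggers.

```elixir
# Assumes the :flow package is already a dependency of the project.
window = Flow.Window.global() |> Flow.Window.trigger_every(5)

result =
  0..99
  |> Flow.from_enumerable(window: window)
  |> Flow.map(fn number -> IO.inspect(number, label: :number) end)
  |> Flow.group_by(fn i -> rem(i, 2) == 0 end)
  |> Flow.on_trigger(fn groups ->
    # Side effect per batch, in place of the Flow.map after group_by.
    Enum.each(groups, fn batch -> IO.inspect(batch, label: :batch) end)

    # Emit the batches downstream and hand back a map as the new state,
    # so the group_by reducer keeps working with the type it expects.
    {Map.to_list(groups), %{}}
  end)
  |> Flow.run()
```

Because the callback always returns a map as the second element, the reducer never sees a keyword list, which is what triggered the BadMapError in the original flow.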

Thanks!