sabiwara / enumancer

Elixir macros to effortlessly define highly optimized Enum pipelines

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Enumancer

Hex Version docs CI

Elixir macros to effortlessly define highly optimized Enum pipelines.

Warning: Enumancer is an early proof-of-concept with limited functionality, but you can check Iter which offers a more practical and extensive API.

Overview

Enumancer provides a defenum/2 macro, which will convert a pipeline of Enum function calls to an highly optimized tail-recursive function.

defmodule BlazingFast do
  import Enumancer

  defenum sum_squares(numbers) do
    numbers
    |> map(& &1 * &1)
    |> sum()
  end
end

1..10_000_000 |> BlazingFast.sum_squares()  # very fast
1..10_000_000 |> Enum.map(& &1 * &1) |> Enum.sum()  # super slow
1..10_000_000 |> Stream.map(& &1 * &1) |> Enum.sum()  # super slow

There is no need to add Enum., map/2 will be interpreted as Enum.map/2 within defenum/2.

In order to see the actual functions that are being generated, you can just replace defenum/2 by defenum_explain/2 and the code will be printed in the console.

The defenum_explain/2 approach can be useful if you don't want to take the risk of using Enumancer and macros in your production code, but it can inspire the implementation of your optimized recursive functions.

Motivation

Premature optimization is the root of all evil. For most typical use cases, your Enum code performance is going to be good enough, with the bottleneck being I/O anyway (typically your database).

For cases where you actually need the performance, however, Enumancer aims to offer an appealing option which will be

  • faster and using less memory than Enum pipelines, which are building wasteful intermediate lists
  • faster than Stream pipelines which come with a runtime overhead (Enumancer is compile time)
  • more flexible than for comprehensions (and also, faster)
  • easier to write than handcrafted recursive functions, since it looks like an idiomatic Elixir pipeline

See the Case study section for more detailed explanation.

Performance

Enumancer's performance basically relies on four pillars:

  1. memory usage and algorithm: no intermediate lists, just the needed accumulator
  2. specific tail-recursive functions: these are very fast!
  3. compile-time: no runtime overhead needed
  4. inlined anonymous functions: anonymous function application have a noticeable overhead. When directly passing an anonymous function definition (e.g. fn x -> x + 1 end or &foo(&1, :bar)) inside a defenum pipeline, without assigning it to a variable, they will be inlined by the compiler in the generated function definition, removing this overhead

Some benchmarks are available inside the bench folder.

Installation

Enumancer can be installed by adding enumancer to your list of dependencies in mix.exs:

def deps do
  [
    {:enumancer, "~> 0.0.3"}
  ]
end

The documentation can be found at https://hexdocs.pm/enumancer.

Case study

Let's assume we want to sum the square of all odd numbers within a list. We could typically write:

def sum_odd_squares_1(list) do
  list
  |> Enum.filter(&rem(&1, 2) == 1)
  |> Enum.map(& &1 * &1)
  |> Enum.sum()
end

For typical use cases that are not performance sensistive, this will work just fine. But if this is performance critical or needs to work with big lists, this will be highly wasteful: Enum.filter/2 and Enum.map/2 will both generate intermediate lists while we only need to keep an integer as accumulator.

A possibile alternative could be to rewrite this using streams to avoid the intermediate structures:

def sum_odd_squares_2(list) do
  list
  |> Stream.filter(&rem(&1, 2) == 1)
  |> Stream.map(& &1 * &1)
  |> Enum.sum()
end

However, streams come with their own overhead and this might not be that fast in practice: don't be surprised if your code suddenly got 3 times slower!

The better alternative in this case would probably be to use a comprehension:

def sum_odd_squares_3(list) do
  for x <- list, rem(x, 2) == 1, reduce: 0 do
    acc -> acc + x * x
  end
end

But comprehensions can be harder to compose and offer less possibilities than the Enum module. What if you wanted to use Enum.join/2 instead of Enum.sum/1?

Comprehensions with the :reduce option can be also less declarative than the Enum versions: instead of the term sum, you have to explicitly manage an accumulator initiallized to 0.

Finally, the fastest option would be to write a dedicated recursive function optimized for this use case:

def sum_odd_squares_4(list) do
  do_sum_odd_squares_list(list, 0)
end

defp do_sum_odd_squares_list([], acc), do: acc
defp do_sum_odd_squares_list([head | tail], acc) do
  acc =
    if rem(head, 2) == 1 do
      acc = acc + head * head
    else
      acc
    end

  do_sum_odd_squares_list(tail, acc)
end

While this is the best option performance-wise, you would need to sacrifice readability and maintainability, making the tradeoff less attractive.

With the defenum/2 macro, you would just write

defenum sum_odd_squares_5(list) do
  list
  |> filter(&rem(&1, 2) == 1)
  |> map(& &1 * &1)
  |> sum()
end

and this would basically transpile to the previous recursive version.

You get to keep both the declarative and powerful syntax of the 1st version using Enum, and the performance of the most efficient implementation (4th version) using recursion.

Copyright and License

Enumancer is licensed under the MIT License.

About

Elixir macros to effortlessly define highly optimized Enum pipelines

License:MIT License


Languages

Language:Elixir 100.0%