Discuss how to approach macro-heavy libraries
wende opened this issue · comments
In Elixir a lot of libraries rely on macros. Macros used inside functions can easily be mimicked and wrapped, but those used as top-scope definitions (right inside modules, for instance) quickly become unwieldy.
One of the simplest examples is how to write a test. Let's say we've got a standard ExUnit.Case macro.
defmodule MyTest do
  use ExUnit.Case
  test "I want to test something" do
    assert 1 == 1
  end
end
Right now there is no way to express that in Elchemy.
And while it is possible to do some tricks like
module MyTest do
{- ex
use ExUnit.Case
test "my_test" do
assert execute_test()
end
-}
executeTest : Bool
executeTest =
1 == 1
end
While this is correct code that would work, I'd say it's quite far from what we would call an elegant solution.
This issue's purpose is to discuss all the possibilities and come up with something that solves most problems without causing too much over-complication.
I'm currently working on a proof of concept implementation of a Plugin approach to this.
Entire idea is that instead of adding macros to modify Elchemy AST (which would just end up being a type-safety hazardous mess) you get a toolset to create new modules manipulating Elixir AST.
So that from user-point of view calling
module SomeTestSuite exposing (suite)

import Elchemy.ExUnitPlugin

suite =
    -- This will create a module using ExUnit.Case
    ExUnitPlugin.suite {someField = "mycontext"} -- <- this is a starting context passed further
        -- This will tell it to define a setup inside
        |> ExUnitPlugin.setup (\ {someField} -> {ctx = someField} )
        -- This will tell it to define a test inside
        |> ExUnitPlugin.test "1 + 1 equals 2" (\context -> 1 + 1 == 2 )
Which could then be used normally wherever desired
defmodule MyProjectTest do
  use ExUnit.Case
  SomeTestSuite.suite()
end
Which is already proven to work
Under the hood, however, it needs to create a module and INJECT an anonymous function into its body.
So our example would actually resolve to (AST manipulation apart):
defmodule SomeTestSuite do
  # It's actually using `Module.create` but using defmodule for better readability here
  def suite() do
    defmodule ExUnitPlugin.H(ContentHash) do
      use ExUnit.Case
      test "1 + 1 equals 2", context do
        assert unquote(f).()
      end
    end
  end
end
The entire problem however, is how to smuggle the anonymous function inside the created module.
We can't unquote it because anonymous functions can't be unquoted. I obviously want to avoid polluting the global state by setting up some Processes/Agents/whatever at all costs.
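To make the limitation concrete, here is a minimal sketch (not Elchemy code) of what happens when a value is turned into AST with Macro.escape/1. The name check is hypothetical and only mirrors the test body from the example above. Plain data escapes fine, while an anonymous function raises; the sketch rescues any exception rather than assuming a specific error type.

```elixir
# Hypothetical check, standing in for the lambda passed to ExUnitPlugin.test.
check = fn _context -> 1 + 1 == 2 end

# Plain data (atoms, numbers, tuples, lists, maps) can be turned into AST.
escaped_data = Macro.escape({:ok, [1, 2, 3]})

# An anonymous function is not plain data, so escaping it raises.
escaped_fun =
  try do
    Macro.escape(check)
  rescue
    error -> {:cannot_escape, error.__struct__}
  end

IO.inspect(escaped_data)
IO.inspect(escaped_fun)
```

This is why the function value itself cannot simply be pasted into the quoted module body.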
Reading on Elixir forum I found this (which I wasn't aware of before)
@OvermindDL1
An anonymous function is actually a public function in the module that the anonymous function is defined in, and a call to it 'essentially' just {ModuleName, #, [args]} (a bit different in reality, but this gets the point across).
But I wasn't able to find a way to call such a function.
@OvermindDL1 Can you elaborate on how that works and if it's possible to call an anonymous function just by having its value passed (somehow) to the nested module?
Ok. Found :erlang.fun_info, which looks like exactly what I need
:erlang.fun_info(Test.test)
[
pid: #PID<0.98.0>,
module: Test,
new_index: 0,
new_uniq: <<64, 160, 30, 242, 89, 33, 1, 230, 243, 33, 207, 54, 132, 229, 36,
142>>,
index: 0,
uniq: 33882359,
name: :"-test/0-fun-0-",
arity: 1,
env: [],
type: :local
]
Hmm. It might be no use after all. The functions are obviously local so I don't think it's possible to pass them further
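For reference, a small sketch of what :erlang.fun_info/2 reports for different kinds of functions. The module FunInfoDemo and its function names are made up for illustration. It shows the :local vs :external distinction mentioned above, and that a closure carries its captured values in :env.

```elixir
defmodule FunInfoDemo do
  # Anonymous function with nothing captured.
  def local_fun, do: fn x -> x + 1 end

  # Anonymous function that closes over n.
  def closure(n), do: fn x -> x + n end

  # Capture of a named, exported function.
  def external_fun, do: &String.upcase/1
end

local_type = :erlang.fun_info(FunInfoDemo.local_fun(), :type)
local_module = :erlang.fun_info(FunInfoDemo.local_fun(), :module)
local_env = :erlang.fun_info(FunInfoDemo.local_fun(), :env)
closure_env = :erlang.fun_info(FunInfoDemo.closure(10), :env)
external_type = :erlang.fun_info(FunInfoDemo.external_fun(), :type)

IO.inspect({local_type, local_module, local_env, closure_env, external_type})
```

Only the external capture refers to something that can be named from another module; the local ones are tied to the module that defined them.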
Heh, you can pass them around just fine. An anonymous function is just, essentially internally, a tuple of the module name, the function index in the module function array, and a set of bindings to pass in to it as its environment (which become part of its called arguments).
To 'pass it in', if you don't need an environment, is easy, you can just remotely call it, otherwise maybe unwrap the environment in to it.
However, I'm not quite understanding what you are trying to do, let me read in more detail...
Are you talking about the |> ExUnitPlugin.test "1 + 1 equals 2" (\context -> 1 + 1 == 2 ) bit needing to generate assert unquote(f).()? If so then you could do it if the f is the AST of the function wrapped in an anonymous function?
And don't forget, even macro-heavy libraries are still just only function calls, everything is function calls. :-)
If you approach it more like Lisp instead of metaprogramming like in other languages, then that is much closer to how Elixir really is. :-)
@OvermindDL1 Obviously. But I want to give a [relatively] simple tool for people from outside to be able to write wrappers for common libraries like they would normally do it.
With macros:
Test name body ->
Application (atom "test")
[ Value (String name)
, Variable "context"
, Do
-- A Quotation here is a place we will paste our function in
[ dotApplication Quotation [ Variable "context" ]
]
]
|> unquote body
Instead of having to understand all of the internals of how ExUnit works and just do all of the macro work manually.
What I would want to do is to be able to build AST "around" anonymous functions.
So for instance with a simpler example than ExUnit.
Let's say we want to define a module implementing a behaviour.
So that would resolve to something like
defmodule MyCallingModule do
  def behaviour_implementation(f_to_call) do
    defmodule Whatever do
      @behaviour MyBehaviour
      def some_callback() do
        # Here somehow we need to call any function inside our parent, but since in Elm a global function and a local function are virtually the same I need to be able to pass an anonymous function here
        f_to_call.()
      end
    end
  end
end
@OvermindDL1 What do you mean remotely call it?
Can you give an example of how to serialize (into AST) an anonymous function and to be able to execute it after rebuilding from serialized information about it?
Can you give an example of how to serialize (into AST) an anonymous function and to be able to execute it after rebuilding from serialized information about it?
What format is the anonymous function in, is it the AST of it, like fn x -> x end or so?
If you have no environment that needs to be included then just pack it as, say, {modulename, functionname, [args...]} and then apply that where needed?
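A minimal sketch of the {module, function, args} suggestion, assuming the function is a named, public one (the module Checks and its function are hypothetical). Because the tuple is plain data, it can be escaped into AST and applied later, which an anonymous function value cannot.

```elixir
defmodule Checks do
  # A named, public function: callable from anywhere by module and name.
  def one_plus_one_is_two(_context), do: 1 + 1 == 2
end

# The call packed as plain data.
mfa = {Checks, :one_plus_one_is_two, [%{}]}

# Applying it directly.
{mod, fun, args} = mfa
direct = apply(mod, fun, args)

# Embedding the same tuple into quoted code and evaluating that code.
quoted =
  quote do
    {m, f, a} = unquote(Macro.escape(mfa))
    apply(m, f, a)
  end

{via_ast, _binding} = Code.eval_quoted(quoted)

IO.inspect({direct, via_ast})
```

This only covers functions without a captured environment, which is the case described above.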
@OvermindDL1
The function is being passed from Elchemy so we can't be sure it is specified as a function body when called and not a variable. Although it can be
ExUnitPlugin.test (\_ -> 10 + 1 == 11)
it might as well be
let
f = (\_ -> 10 + 1 == 11)
in
ExUnitPlugin.test f
So quoting it on call inside Elchemy isn't really an option, especially since it wouldn't behave consistently with other functions due to the "laziness" of macros.
The problem with using apply here is that even if there is an anonymous function, let's say
defmodule Test do
  def myfunction(), do: fn x -> x + 1 end
end
Even though I can get the name of the anonymous function by doing
Test.myfunction() |> :erlang.fun_info()
[
pid: #PID<0.98.0>,
module: Test,
new_index: 0,
new_uniq: <<64, 160, 30, 242, 89, 33, 1, 230, 243, 33, 207, 54, 132, 229, 36,
142>>,
index: 0,
uniq: 33882359,
name: :"-test/0-fun-0-",
arity: 1,
env: [],
type: :local
]
That would still make this function local so I couldn't just call it with
apply(Test, :"-test/0-fun-0-", [1])
Actually if it might be an anon function or it might be a binding to an anon function, there is an easy way to handle that. :-)
# All of these do the same thing:
apply(fn -> 42 end, []) # returns 42
f = fn -> 42 end
apply(f, []) # returns 42
defmodule SomewhereElse do
def bloop(), do: 42
end
apply(&SomewhereElse.bloop/0, []) # returns 42
Apply is an 'indirect' call, not super fast (but doesn't matter most of time).
A remote call is something like Blah.bloop(), and a local call is something like bloop(), and for note a binding call like f.() is an indirect call too (though with sometimes slightly more optimized VM assembly compared to apply, though not always).
And yes, the apply/3 is 'slightly' faster than apply/2 since there is no environment for it to worry about, but you can't always use that one unlike apply/2, which you can always use.
@OvermindDL1 I don't think we're on the same page :-)
What I need is to be able to call an anonymous function inside a dynamically declared module, where the function is passed in from outside that module, not as an argument but at compile time (which is fortunately also run time, because the module is declared dynamically).
Example
defmodule OuterScope do
  def declare_module(my_fun) do
    defmodule InnerScope do
      def execute(), do: my_fun.()
    end
  end
end
But the module inside is a quoted expression. And you can't pass anonymous functions to quoted expressions
Hmm, that code I don't think will do what you expect, you can't really declare a module inside a function, and doing so will work when the compiler exists but will fail in releases.
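For comparison, a sketch of the variant that does work while the compiler is available: passing a module name and function name (plain atoms) instead of an anonymous function, and building the inner module with Module.create/3. The names OuterScope.check and InnerScope are illustrative. The concern about releases raised above still applies, since the module is created at run time.

```elixir
defmodule OuterScope do
  # A named function that the generated module will call.
  def check, do: 1 + 1 == 2

  # Builds a module whose execute/0 calls mod.fun().
  # Atoms can be unquoted into the body; an anonymous function could not.
  def declare_module(name, mod, fun) do
    body =
      quote do
        def execute, do: apply(unquote(mod), unquote(fun), [])
      end

    Module.create(name, body, Macro.Env.location(__ENV__))
  end
end

{:module, created, _bytecode, _result} =
  OuterScope.declare_module(InnerScope, OuterScope, :check)

result = InnerScope.execute()
IO.inspect({created, result})
```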
@OvermindDL1
Oh. I haven't thought about that. Releases definitely would be a problem here...
I guess that kills my proof of concept here.
Have you got any other idea on how to use module level macros without polluting the scope of the modules?
For instance how one might go about writing an Elchemy wrapper for this:
defmodule Hello.User do
  use Ecto.Schema
  schema "users" do
    field :bio, :string
    field :email, :string
    field :name, :string
    field :number_of_pets, :integer
    timestamps()
  end
end
I thought that defining dynamically a module implementing those might be the best idea, but now I see it's probably not.
I'm slowly starting to reconsider bringing back the meta function definition in Elchemy modules, to be able to add code to the module.
For instance how one might go about writing an Elchemy wrapper for this:
I'd probably just keep them as function calls.
I don't think that elm has top-level calling like better languages like OCaml and such, so let's give the 'top level' calling scope a special name, how about __top__, then maybe this?
__top__ =
let schema = remote("Ecto.Schema.schema", 2) in
let field = remote("Ecto.Schema.field", 2) in
schema "users" {do: [
field Bio String,
field Email String,
field Name String,
field Number_of_pets Integer,
remote("Ecto.Schema.timestamps", 0)
]}
Or whatever you want, the important bit is not necessarily treating it as calls directly, but recognizing that the AST is what is important to macros, and a do body is really a __block__ of a list of AST elements as one example. :-)
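The point about do bodies can be checked directly in Elixir. This is a sketch only: field and schema are left as unevaluated AST, so Ecto is not required, and the fields list is an assumed input format. It shows that a multi-expression block is a __block__ node holding a list, and that the same node can be built from data.

```elixir
# A do body with two expressions, as the compiler sees it.
block =
  quote do
    field(:bio, :string)
    field(:email, :string)
  end

# The same block built programmatically from a list of {name, type} pairs.
fields = [bio: :string, email: :string]

calls =
  for {name, type} <- fields do
    quote do: field(unquote(name), unquote(type))
  end

built = {:__block__, [], calls}

# Wrapping the generated block in a schema call, still only as AST.
schema_ast =
  quote do
    schema "users" do
      unquote(built)
    end
  end

IO.puts(Macro.to_string(schema_ast))
```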
@OvermindDL1
Yeah. So basically what the meta function used to do, but I've gotten many negative opinions about this approach from the elm-core team.
Thank you for suggestions. Will have to take a moment to think about it then
The elm-core team are also very haskell'y, they like their purity even if it means being less productive, look at Ports after all. ;-)
/me still thinks wende should instead compile OCaml down to Elixir instead... almost identical syntax, but has 'more' and far far better designed
@OvermindDL1 First of all - sunk cost fallacy
Second of all, OCaml has one problem. It's hard to read. People like Elm because it's stupid simple (until it isn't, but well)
Plus I despise lowercase types
OCaml is hard to read? I'd love for you to show examples in Elm that I could translate into OCaml that is hard to read. ;-)
Ah but lowercase types matches Elixir/Erlang too! ;-)
@OvermindDL1 Honestly it might be just me not being used to the syntax.
Nevertheless together with community we can fix Elm haha ;-)
Plus I don't want to get into competition with guys from Alpaca Lang. They're doing a great job
@OvermindDL1 Honestly it might be just me not being used to the syntax.
Remember, Elm's syntax was based on OCaml's, in fact it is so close to a direct translation that changing a couple ,'s into ; and changing the haskell'y ambiguous multi-let's into unambiguous single let's gets it to be valid OCaml. ^.^
Nevertheless together with community we can fix Elm haha ;-)
Heh, good luck there, I've seen how extremely hostile they are to suggestions (without unemotional reasoning as to their stances), it very much reminds me of the hostile and antagonistic Haskell community...
Plus I don't want to get into competition with guys from Alpaca Lang. They're doing a great job
A couple things though:
- I think they are going about it the wrong way by not black boxing messages. By typing processes they make it much harder to upgrade and reason about things.
- They are lacking a massive amount of capabilities that OCaml just has out of the box (first class modules, which the BEAM VM supports just fine, GADT's, Polymorphic Variants, extensible variants, etc... etc...).
- Lacking the macro capabilities that OCaml has (not 'quite' as easy to 'make' as Elixir macro's, but just as powerful and easy to use).
- And it lacks the immense tooling that the OCaml ecosystem has (seriously, it puts Elm to absolute shame).
As well as OCaml has the fastest optimizing compiler of any compiled language in existence. :-)
As well as OCaml is designed to be extended. It even has 2 backends to compile to javascript (one is more readable, one is lower level but more powerful). So you could use the same language for the BEAM, javascript, and native compilation. ;-)
I've seen how extremely hostile they are to suggestions
Oh yes. Trust me, I'm aware
And it lacks the immense tooling that the OCaml ecosystem has
My impression was always the exact opposite. Before Reason came out I felt incredibly intimidated by OCaml's errors and general tooling. Definitely not less than Haskells though, that's true.
It's been a long time since I last tried it. Might be I should give it another try sometime
My impression was always the exact opposite. Before Reason came out I felt incredibly intimidated by OCaml's errors and general tooling. Definitely not less than Haskells though, that's true.
Reason has to date not created anything 'new', but has packaged some well known things together, like BetterErrors (a common OCaml package to make errors more readable), merlin (OCaml's language server), etc... These are usually all added in most ocaml packages straight out anyway. ;-)
It doesn't include the indenter of ocp-indent because they just convert back and forth between reason and ocaml anyway (oh, and their reformatter is in the OCaml packaging system opam too). I'm not terribly a fan of full formatters, I tend to prefer indenters as then it doesn't lose context-sensitive formatting cues that formatters lose.
OCaml has a couple build systems, ocamlbuild is the usual 'default' old one, but most people use JBuilder as it's just better, and in fact so many people use JBuilder that the next OCaml version is having it be the default build system under a new name of Dune.
It's been a long time since I last tried it. Might be I should give it another try sometime
If you aren't on windows it's trivial to setup and install (I just sudo apt install opam then use opam, just like Rusts Cargo, to choose whatever versions and styles of the compiler I want, packages, etc...). On Windows you have to actually manually download opam and such but there are pre-setup full packages already that include cygwin as well for handling native stuff (though they are fixing up opam to run native on windows currently, might be in next version actually). :-)
And all of these tools it has have existed for years. ^.^
So after experimenting with it a little, here's a summary of the pitfalls of using a top/meta function:
Meta definition resolving to code injected into the top scope
Pitfalls:
- Problem: No such thing as do blocks
  - Solution: Use a Do (List x)
    - Problem: Lists in Elm can't contain two different types in them
      - Solution: Don't use blocks at all (wrap them anywhere outside)
      - Solution: Add a special opaque type Macro that can be put only into the meta tag.
- Problem: Can't use other functions in the module
  This is a big one. I'm afraid it's going to be too hard to explain why you can't just do:
meta =
something myfun -- error
myfun = 1
And you can't because
something(myfun()) # No such thing as myfun yet
defp myfun(), do: 1
And that feels really bad, unless explicitly explained but that's an additional layer of complexity I'd really prefer to avoid.
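The ordering problem can be reproduced in plain Elixir. The sketch below compiles a made-up module (MetaPitfall) from a string so that the failure can be caught instead of aborting the script. It rescues any exception rather than assuming the exact error type or message, which may differ between Elixir versions.

```elixir
source = """
defmodule MetaPitfall do
  # Runs while the module body is being compiled, before myfun/0 is callable.
  IO.inspect(myfun())

  defp myfun(), do: 1
end
"""

result =
  try do
    Code.compile_string(source)
    :compiled
  rescue
    error -> {:error, error.__struct__}
  end

IO.inspect(result)
```

Code in the module body runs during compilation, so functions defined in the same module are not available to it yet, regardless of where they appear in the file.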
A good example of how to present it would be a PoC showing a test suite using ExUnit, with tests (most elegantly generated from an existing Elm suite).
Which would mean a simple and clean non-compiler-solution to turning this
module UnitTests exposing (..)

import Test exposing (..)
import Expect

all : Test
all =
    describe "Test"
        [ test "Test truth" <|
            \() ->
                Expect.equal "a" "a"
        ]
and this
defmodule MyAppTest do
  use ExUnit.Case
  UnitTests.all()
end
Into something that behaves like this:
defmodule MyAppTest do
  use ExUnit.Case
  describe "Test" do
    test "Test truth" do
      assert "a" == "a"
    end
  end
end
Because the first solution that comes to my mind would be:
module UnitTests exposing (..)

import Test exposing (..)
import Expect

meta =
    all |> turnToMacro

all : Test
all =
    describe "Test"
        [ test "Test truth" <|
            \() ->
                Expect.equal "a" "a"
        ]
But this won't work because all is not known at the point when meta is invoked.
Much closer to the solution, however there is one caveat of readability:
When calling a macro in Elixir, you can't really know just by looking at the code if it's a macro or a function. Which has a problem of code as value vs value of the code.
Looking at this example:
b = a + 1
function(a + 1) == function(b)
We can be sure the output will be always true, however
b = a + 1
macro(a + 1) == macro(b)
Here we cannot, because a + 1 and b are not the same code-wise (they produce different AST), which causes a state of slight impurity.
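A small Elixir sketch of the difference, using made-up names (CodeVsValue.show/1 as the macro, CodeVsValue.value/1 as the function). The macro receives the code of its argument, so a + 1 and b look different to it; the function receives the value, so both look the same.

```elixir
defmodule CodeVsValue do
  # A macro receives the AST of its argument; here it returns the source text.
  defmacro show(expr), do: Macro.to_string(expr)

  # A function receives the evaluated value.
  def value(v), do: v
end

defmodule CodeVsValueDemo do
  require CodeVsValue

  def run do
    a = 1
    b = a + 1

    %{
      macro_sees: {CodeVsValue.show(a + 1), CodeVsValue.show(b)},
      function_sees: {CodeVsValue.value(a + 1), CodeVsValue.value(b)}
    }
  end
end

IO.inspect(CodeVsValueDemo.run())
```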
That leads to a problem that:
test "MyTest" \() -> assert True
would work, while
myTest () = assert True
test myTest
wouldn't.
The only solution I can think of is introducing a new keyword/type/function to Elchemy that would explicitly denote passing the code as an AST rather than its value (basically what quote does in Elixir).
It could be for example Do, so that when we see:
Do <| 10 + 20
It's not yet evaluated, but passed as an AST.
This way we could know straight away that
function_or_macro(a + 1) -- This is a function
function_or_macro(Do <| a + 1) -- this is a macro
The problem is it introduces a certain layer of complexity that I'd like to avoid.
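For reference, this is the Elixir behaviour that Do would mirror, shown with quote on the same expression. It is only the Elixir side of the comparison, not a proposal for how Elchemy would implement Do.

```elixir
# quote returns the expression as data instead of evaluating it.
quoted = quote do: 10 + 20

# The AST is an operator node with two literal arguments.
{operator, _meta, operands} = quoted

# Evaluation happens only when explicitly requested.
{value, _binding} = Code.eval_quoted(quoted)

IO.inspect({operator, operands, value})
```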
All suggestions welcome. Otherwise I'll release the Do solution onto the dev branch and see it work for a while.
Instead of a Do like thing, what about using a special prefix operator to 'call'/inject a macro result? :-)