ocaml / dune

A composable build system for OCaml.

Home Page: https://dune.build/

dune variants - how are they supposed to work and scale?

hannesm opened this issue

First of all, thanks for your work on dune! This is an amazing piece of work.

I recently wanted to remove a lot of functors from MirageOS, since they're not really necessary -- at runtime we only ever have one implementation around anyway (i.e. the network interface for xen will always be one thing, while the network interface when compiling for unix will be something else). This sounded to me like a perfect use case for your dune variant feature. On a small scale it seemed to work fine, but once I "variantified" the TCP/IP stack, the results were not very nice.

Expected Behavior

I'd expect that when the executable stanza in a dune file depends on specific variants, those variants are used throughout the compilation.

Actual Behavior

File "duniverse/mirage-net/unix/dune", line 4, characters 52-60:
4 |  (libraries common logs macaddr cstruct cstruct-lwt lwt.unix tuntap)
                                                        ^^^^^^^^
Error: Library "lwt.unix" in _build/solo5/duniverse/lwt/src/unix is hidden
(optional with unavailable dependencies).
-> required by library "mirage-net.unix" in
   _build/solo5/duniverse/mirage-net/unix
-> required by library "tcpip.icmpv4-direct" in
   _build/solo5/duniverse/mirage-tcpip/src/icmp/direct

So, instead of the variant specified in the executable statement, the default variant is used. :(

Reproduction

Unfortunately, this is rather big at the moment, but let me try:

With all of that, I try to compile the "device-usage/network" unikernel from https://github.com/hannesm/mirage-skeleton/tree/variants

So, the steps to reproduce are:

  • install a fresh switch (OCaml 4.14.2)
  • opam pin the mirage utility
  • clone the mirage-skeleton repository from my branch above
  • opam pin add -n for the above list
  • cd mirage-skeleton/device-usage/network ; mirage configure -t solo5 ; make lock pull build

Variant specifications

At the moment, I specify the default_implementation to be "mirage-net.unix", but in the generated dune file, there is: (libraries ... mirage-net.solo5 ....

I have no idea why, as seen above, dune thinks that tcpip.icmpv4-direct requires a mirage-net implementation at all, or how it decides to try the .unix one.

Debug avenues - remove default_implementation

I tried removing the default_implementation from the above-mentioned packages. This results in the following error:

dune build --profile release --root . ./dist
Error: No implementation found for virtual library "mirage-net" in
_build/solo5/duniverse/mirage-net/src.
-> required by library "ethernet" in _build/solo5/duniverse/ethernet/src
-> required by library "tcpip.ipv4" in
   _build/solo5/duniverse/mirage-tcpip/src/ipv4/interface
-> required by library "tcpip.icmpv4-direct" in
   _build/solo5/duniverse/mirage-tcpip/src/icmp/direct
-> required by executable main in dune.build:11
-> required by _build/solo5/main.exe
-> required by _build/solo5/network.hvt
-> required by _build/solo5/dist/network.hvt
-> required by alias dist/all (context solo5)
-> required by alias dist/default (context solo5)

Still, mirage-net.solo5 is a dependency in the dune file.

Move all default_implementations to those I want to use

Yes, this indeed works. But it is pretty unpleasant (it would need a shell script for postprocessing), since I had hoped the variant feature would provide me with exactly this:

  • I define virtual libraries and default implementation
  • My library code uses virtual libraries
  • The only thing left to do at the top level (link time, where I specify the executable) is to select the concrete implementation.
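
The workflow in the list above can be sketched with a handful of dune files. This is a minimal sketch, not taken from MirageOS: the library names `net`, `net.unix`, `net.solo5` and `stack`, the module name `netif`, and the directory layout are all hypothetical, and it only uses the `virtual_modules`, `default_implementation` and `implements` fields.

```dune
; net/src/dune -- the virtual library (hypothetical names throughout)
(library
 (name net)
 (public_name net)
 ; netif.mli lives here; netif.ml is supplied by each implementation
 (virtual_modules netif)
 ; the default implementation must be in the same package
 (default_implementation net.unix))

; net/unix/dune -- the default implementation
(library
 (name net_unix)
 (public_name net.unix)
 (implements net))

; net/solo5/dune -- an alternative implementation
(library
 (name net_solo5)
 (public_name net.solo5)
 (implements net))

; stack/dune -- library code depends only on the virtual library
(library
 (name stack)
 (libraries net))

; exe/dune -- the concrete implementation is named at link time
(executable
 (name main)
 (libraries stack net.solo5))
```

With this layout, the intent is that building the executable links `net.solo5`, and that `net.unix` is only chosen when the executable lists no implementation of `net` at all.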

Quo vadis?

So, any pointers on how I can debug this issue further? Or have I hit an intended limit of dune variants that I was not aware of?

Thanks a lot for your time reading this issue. 🐫 🐫

FWIW, the full dune file:

;; Generated by mirage.v4.5.0-19-gb4b343f

(copy_files# ./mirage/main.ml)

(copy_files ./mirage/manifest.json)

(copy_files# ./mirage/manifest.ml)

(executable
 (enabled_if (= %{context_name} "solo5"))
 (name main)
 (modes (native exe))
 (libraries arp.mirage duration ethernet lwt mirage-bootvar-solo5
   mirage-clock.solo5 mirage-crypto-rng-mirage mirage-logs
   mirage-net.solo5 mirage-runtime mirage-runtime.network mirage-solo5
   mirage-time.solo5 tcpip.icmpv4 tcpip.icmpv4-direct tcpip.ipv4
   tcpip.ipv4-direct tcpip.ipv4v6 tcpip.ipv4v6-direct tcpip.ipv6
   tcpip.ipv6-direct tcpip.stack tcpip.stack-direct tcpip.tcp
   tcpip.tcp-direct tcpip.udp tcpip.udp-direct)
 (link_flags :standard -w -70 -color always -cclib "-z solo5-abi=hvt")
 (modules (:standard \ config manifest))
 (foreign_stubs (language c) (names manifest))
)

(rule
 (targets manifest.c)
 (deps manifest.json)
 (action
  (run solo5-elftool gen-manifest manifest.json manifest.c)))

(rule
 (target network.hvt)
 (enabled_if (= %{context_name} "solo5"))
 (deps main.exe)
 (action
  (copy main.exe %{target})))

And the dune file of the tcpip.icmpv4-direct:

(library
 (name tcpip_icmpv4_direct)
 (public_name tcpip.icmpv4-direct)
 (implements tcpip.icmpv4)
 (instrumentation
  (backend bisect_ppx))
 (private_modules icmpv4_packet icmpv4_wire)
 (libraries logs ipaddr tcpip.checksum tcpip.ipv4))

(no occurrence of "mirage-net" there -- it is not clear to me why that causes the issue)

I can investigate the issue, but the example has to be simplified further. Try to shave off all the unnecessary dependencies in your repro.

Thanks Rudi. I've a much better intuition and will shortly follow up with a much smaller case.

So, a smaller case is at: https://github.com/hannesm/dune-variants-test

(1) We have two dune virtual libraries, a and b (in a.opam and b.opam).
(2) The package a has a default implementation a.a depending on a non-existing package magic. The other implementation, a.b, does not have this dependency.
(3) The package b has a default implementation b.a - which depends on a. The implementation b.b does not depend on a.
(4) The executable depends on a.b and b.

In (4), my expectation is that since I chose a.b, this will be used (and not the default implementation (a.a) while building b). Does my expectation make sense?
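
The four points above correspond roughly to the following layout. The file contents are an assumption reconstructed from the description, not copied from the repository: the directories `a/impl_a`, `b/impl_a` and `exe` follow the paths in the build error below, while the other directory names, the `name` fields and the virtual module names `x` and `y` are hypothetical.

```dune
; a/dune -- virtual library a; its default implementation is a.a
(library
 (name a)
 (public_name a)
 (virtual_modules x)
 (default_implementation a.a))

; a/impl_a/dune -- a.a depends on the non-existent library magic
(library
 (name a_a)
 (public_name a.a)
 (implements a)
 (libraries magic))

; a/impl_b/dune -- a.b has no such dependency
(library
 (name a_b)
 (public_name a.b)
 (implements a))

; b/dune -- virtual library b; its default implementation is b.a
(library
 (name b)
 (public_name b)
 (virtual_modules y)
 (default_implementation b.a))

; b/impl_a/dune -- b.a depends on the virtual library a
(library
 (name b_a)
 (public_name b.a)
 (implements b)
 (libraries a))

; b/impl_b/dune -- b.b does not depend on a
(library
 (name b_b)
 (public_name b.b)
 (implements b))

; exe/dune -- selects a.b explicitly and relies on the default for b
(executable
 (name main)
 (libraries a.b b))
```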

Here, if you dune build exe/main.exe, I see the following error:

File "a/impl_a/dune", line 4, characters 13-18:
4 |   (libraries magic)
                 ^^^^^
Error: Library "magic" not found.
-> required by library "a.a" in _build/default/a/impl_a
-> required by library "b.a" in _build/default/b/impl_a
-> required by executable main in exe/dune:2
-> required by _build/default/exe/main.exe

I have to admit that, at the end of the day, my intuition from what I observed is the following:

  • variants in mirage-net exist (and a default implementation)
  • I explicitly selected mirage-net.solo5 in the executable dune file
  • another package, also being a virtual library, depends on mirage-net
  • it made a difference whether the interface (the virtual library) had the (libraries mirage-net) (virtual_modules a) dependency, which is the case that works;
  • or the implementation being used had the (libraries mirage-net) (implements a) dependency. In this case, the default_implementation from mirage-net was used, and not the mirage-net.solo5 specified in the dune file of the executable.

Unfortunately today I was not able to reconstruct this test case (yet?).
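
As a rough illustration of the two shapes contrasted in the list above (an assumption about what is meant, not a confirmed reproduction), the difference is only in which stanza carries the `mirage-net` dependency. The virtual library is called `eth` here rather than `a`, to keep it distinct from the virtual module `a`; both `eth` and `eth_impl` are hypothetical names.

```dune
; Shape 1 (reported to work): the virtual library itself depends on mirage-net
(library
 (name eth)
 (virtual_modules a)
 (libraries mirage-net))

; Shape 2 (reported to fall back to the default implementation of mirage-net):
; the virtual library has no such dependency, only its implementation does
(library
 (name eth_impl)
 (implements eth)
 (libraries mirage-net))
```

The two stanzas are alternatives for where the dependency is declared, shown side by side for comparison.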

Managed to confirm your repro. The issue is that since default implementations must live in the same package as their virtual libraries, we need to resolve them more eagerly than normal libraries. The error about the missing magic library comes from that check, even though you don't need the default implementation because you're correctly overriding it with something else.

I suppose we could improve this check not to require the dependencies of the default implementation library to be resolved to check what package it belongs to. As you've alluded though, this is unlikely to be related to your original issue.

Dear Rudi, thanks for looking into that. If you could point me to a branch where the check no longer requires "the dependencies of the default implementation library to be resolved", I'd be happy to try it out, since it sounds to me like it may be the same issue...

You can try the following patch as a start:

diff --git a/src/dune_rules/lib.ml b/src/dune_rules/lib.ml
index 5d24a3194..1fb1a6beb 100644
--- a/src/dune_rules/lib.ml
+++ b/src/dune_rules/lib.ml
@@ -1813,18 +1813,7 @@ module Compile = struct
     }
 
   let for_lib ~allow_overlaps db (t : lib) =
-    let requires =
-      (* This makes sure that the default implementation belongs to the same
-         package before we build the virtual library *)
-      let* () =
-        match t.default_implementation with
-        | None -> Resolve.Memo.return ()
-        | Some i ->
-          let+ (_ : lib) = Memo.Lazy.force i in
-          ()
-      in
-      Memo.return t.requires
-    in
+    let requires = Memo.return t.requires in
     let requires_link =
       let db = Option.some_if (not allow_overlaps) db in
       Memo.lazy_ (fun () ->

This removes the check altogether.