snowleopard / hadrian

Hadrian: a new build system for the Glasgow Haskell Compiler. Now merged into the GHC tree!

Home Page: https://gitlab.haskell.org/ghc/ghc/tree/master/hadrian



Implement installation and binary/source distribution rules

snowleopard opened this issue · comments

There is no support for this in the build system yet. See install, sdist, binary-dist rules in makefiles.

Here is the todo list:

  • Installation
  • Source distribution (done, thanks @KaiHa!)
  • Binary distribution

@KaiHa Thank you for starting to work on this issue. Is it OK if I add you as a collaborator on the project? Otherwise I can't assign this issue to you.

commented

Is it OK if I add you as a collaborator on the project?

Yes, that's OK

@KaiHa Done! Please don't push directly to the repository, keep using pull requests, so that I can keep track of what's going on.

commented

Done! Please don't push directly to the repository

Thanks, hopefully this will prevent me from pushing into your repository by accident.

commented

@snowleopard if I need a function like wildCheckCase, can I add a build dependency to MissingH, or do you prefer a different approach?

@KaiHa MissingH is very much a library from a different era - most of the things it can do are done better by other things. As an example, Shake provides ?== which is used pervasively throughout the rest of Hadrian. If you need any help, give me the example of what you'd pass to wildCheckCase.

@KaiHa I agree with Neil, so far Shake's ?== was sufficient for most things. Where it wasn't I introduced simple functions like matchVersionedFilePath instead of adding new package dependencies.

commented

Thanks @snowleopard, @ndmitchell, ?== is exactly what I was looking for. @ndmitchell did you notice that the documentation of ?== is missing on Hackage, although it is in the source code?

commented

As @thomie suggested, some files need to be generated by alex and happy before they can be included in the source distribution. Where should I put these generated files?

If I put them into the same place as in the legacy tarball it would be easy to spot differences between tarballs generated by make and hadrian. But hadrian will need extra rules if building from a source distribution. Should we instead add alex and happy as build dependencies for building from a source distribution and not ship these generated files at all? @snowleopard what do you think?

@KaiHa - I noticed last night when I wrote a documentation scanner for all my Haskell projects, and fixed it. Interesting how things often seem to be discovered independently and simultaneously! Thanks for pointing it out.

@KaiHa Hadrian already generates these files and puts them into the _build directory. I guess if you need these files built for the sdist rule, you can simply need them.

Have a look at src/Oracles/ModuleFiles.hs and at the top post of #210 (updated definition).

Using contextFiles you can find all source files corresponding to modules of a given Context. These source files may include Alex/Happy sources, e.g. compiler/parser/Lexer.x.

Using haskellSources you can find all Haskell files corresponding to modules of a given Context, some of which are generated and therefore live in _build directory. It looks like this is what you are after in sdist build rule, so you could do something like:

...
srcs <- haskellSources context
putBuild "| Build generated source files..."
need srcs -- here Hadrian builds all generated Haskell files for you
putBuild "| Copy source files..."
for_ srcs $ \src -> do
    copyFile src ... -- copy to a source distribution
...

Does this help? Let me know if you have any other questions.
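To make the copy step a bit more concrete, here is a minimal standalone sketch in plain IO (no Shake). The names distPath and copySources, and the choice to mirror each source path under a distribution root, are assumptions made for illustration only; the snippet above deliberately leaves the destination open, and the real rule would run inside Shake's Action monad.

```haskell
import Control.Monad (forM_)
import System.Directory (copyFile, createDirectoryIfMissing)
import System.FilePath ((</>), takeDirectory)

-- | Hypothetical layout: mirror the relative source path under the
-- distribution root. This is an assumption, not Hadrian's actual layout.
distPath :: FilePath -> FilePath -> FilePath
distPath distRoot src = distRoot </> src

-- | Copy every source file into the distribution tree, creating the
-- parent directories as needed.
copySources :: FilePath -> [FilePath] -> IO ()
copySources distRoot srcs = forM_ srcs $ \src -> do
    let dst = distPath distRoot src
    createDirectoryIfMissing True (takeDirectory dst)
    copyFile src dst
```

For example, copySources "sdist" ["compiler/parser/Lexer.x"] would place the file at sdist/compiler/parser/Lexer.x.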

@KaiHa: the problem with adding alex and happy as build dependencies is that it breaks existing build scripts (Linux distributions etc). This might cause more repair work and annoyances than it saves on Hadrian implementation.

I've learned that even small changes such as placing the user's guide in share/doc/ghc-<version> instead of share/doc/ghc can cause busywork for others (fpco/ghc-rc-stackage@e96839c#commitcomment-16196551).

commented

@snowleopard can you have a look at 9495da2? It is a native implementation of sdist-ghc, but it has its flaws. I put some additional libraries into the tar-ball, because otherwise the build from the tar-ball fails (didn't investigate further).

Also I am not happy (no pun intended) with putting generated files into the tar-ball. The ghc ./aclocal.m4 script needs a patch if we leave the generated .hs files in the _build/ directory, otherwise ./configure will complain if alex or happy is missing. Furthermore I believe hadrian itself must learn how to deal with missing alex and happy.

@KaiHa I apologise for the long delay: I'm currently in the middle of business travel and can't find time for a proper review. In any case I suggest raising a PR and we'll go from there. I should be able to respond with more substance during the weekend. Thank you!

commented

Don't worry about the delay @snowleopard. What would be the best way to get the paths to the produced program binaries and libraries for usage in the install rule?

What would be the best way to get the paths to the produced program binaries and libraries for usage in the install rule?

@KaiHa For binaries, see programPath :: Context -> Maybe FilePath from src/GHC.hs. This function is a big ad-hoc mess and I'm not happy about it at all, but I think you should use it for now.

For libraries, see pkgLibraryFile :: Context -> Action FilePath from src/Settings/Paths.hs.

Thanks to @KaiHa we now have working source distribution rules! There are a few minor issues, but I think we can tick this box in the todo list (see the top of the issue).

@snowleopard I am starting to look into this issue as well (finally got some free time yay!).

Now I will focus on the install part. The first question is about the --prefix=PREFIX command line option. I guess I can define a prefix in Settings which finalizes its content like how Flavour does. Then I can access it in some new Rules.Install.

However, I don't know if I should pass it to ./configure, by modifying Rules/Configure.hs as well (and ... I am a bit puzzled that we still depend on autotools like ./configure in some way. Is replacing autotools scripts also part of Hadrian/Shake project's long term vision?)

Now I will focus on the install part. The first question is about the --prefix=PREFIX command line option. I guess I can define a prefix in Settings

@izgzhen Yes, I think we should add a few new configuration settings into hadrian/cfg/system.config.in, doing something similar to mk/install.mk.in:

prefix          = @prefix@
datarootdir     = @datarootdir@
exec_prefix     = @exec_prefix@
bindir          = @bindir@
datadir         = @datadir@
libdir          = @libdir@
includedir      = @includedir@
mandir          = @mandir@

Then all these settings can be made available in Oracles.Config.Setting.
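As a rough illustration of what reading such settings involves, here is a small standalone sketch that parses "key = value" lines (as they would look after configure has substituted the @...@ placeholders) into an association list. It is only a stand-in: the function names are made up for this example, and Hadrian reads its configuration through its own oracle rather than through code like this.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)
import Data.Maybe (mapMaybe)

-- | Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- | Parse one "key = value" line. Blank lines, lines without '=' and
-- lines whose key starts with '#' yield Nothing.
parseSetting :: String -> Maybe (String, String)
parseSetting line = case break (== '=') line of
    (key, '=' : value)
        | not (null k) && take 1 k /= "#" -> Just (k, trim value)
        where k = trim key
    _ -> Nothing

-- | Parse a whole settings file into key/value pairs, in file order.
parseSettings :: String -> [(String, String)]
parseSettings = mapMaybe parseSetting . lines

-- | Look up a single setting, e.g. "prefix" or "bindir".
setting :: String -> [(String, String)] -> Maybe String
setting = lookup
```

With this, setting "bindir" (parseSettings contents) gives Just the configured directory, or Nothing if the key is absent.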

However, I don't know if I should pass it to ./configure, by modifying Rules/Configure.hs as well

I think for now we can leave it to the user to provide additional arguments to Configure manually as described in https://github.com/snowleopard/hadrian/blob/master/doc/user-settings.md. However, at some point we really need to add a way to add extra arguments to any build tool from command line, e.g. --args-configure="prefix=foo". Let's keep these two issues separate to divide and conquer.

and ... I am a bit puzzled that we still depend on autotools like ./configure in some way. Is replacing autotools scripts also part of Hadrian/Shake project's long term vision?

I agree that this is not ideal, but that's a messy and difficult task. See this discussion: #227

I am starting to look into this issue as well (finally got some free time yay!).

@izgzhen By the way, are you aware of this: https://summer.haskell.org/ideas.html#hadrian-ghc? Looking at your website you are a student, so you could apply to work on Hadrian as part of Summer of Haskell 2017.

I agree that this is not ideal, but that's a messy and difficult task. See this discussion: #227

That is ... a very interesting thread. Thanks!

are you aware of this: https://summer.haskell.org/ideas.html#hadrian-ghc?

Yes, I am. Without this I would probably never have known that shaking-up-ghc became Hadrian and that it works! I am still gathering the information for the proposal, and hopefully I will make up my mind to submit it to the committee :)

Yes, I am. Without this I would probably never have known that shaking-up-ghc became Hadrian and that it works! I am still gathering the information for the proposal, and hopefully I will make up my mind to submit it to the committee :)

Great :-) Feel free to get in touch with me by email (first.last@ncl.ac.uk) if you'd like to discuss the project in general.

Regarding some variables used in ghc.mk installation part:

On my machine, I can get from the make-based GHC that INSTALL_DIR="/usr/local/bin/ginstall -c -m 755 -d". This is generated from AC_PROG_INSTALL macro in configure.ac I guess.

I don't know if there is a way to extract this info, or maybe there is some cleaner Shake-based solution, like an Install builder?

Have a look at this oracle:

https://github.com/snowleopard/hadrian/blob/master/src/Oracles/Path.hs#L87-L91

It is used to cache the invocation of cmd ["cygpath", "-m", path] and is used to compute absolute paths on Windows. I think you can use a similar approach to evaluate expressions like INSTALL_DIR.

Is this what you need?

Do you mean evaluate it against some configure script? What is the cygpath in our case?

I used make show! VALUE=INSTALL_DIR to get this information in a regular make-based system.

Do you mean evaluate it against some configure script?

No, I mean you can simply call cmd ["/usr/local/bin/ginstall -c -m 755 -d"] and make the result available in Hadrian via the oracle mechanism. For example, building on the code I linked above:

-- Use oracle in Action monad
...
installPath <- askOracle InstallPath
...

-- Register oracle
void $ addOracle $ \InstallPath -> do
    Stdout out <- quietly $ cmd ["/usr/local/bin/ginstall -c -m 755 -d"]
    let path = unifyPath out
    putLoud $ "Install path: " ++ path
    return path

What is the cygpath in our case?

It's a command available in MSYS to compute absolute paths on Windows.

Oh, I see. My main point is that /usr/local/bin/ginstall -c -m 755 -d seems to be too platform specific. The old system knows what INSTALL_DIR is through ./configure ....

The old system knows what INSTALL_DIR is through ./configure ....

I see. So, configure produces /usr/local/bin/ginstall -c -m 755 -d, which we need to be able to evaluate in Hadrian. Perhaps, we should use both configure to obtain the expression, and then have a generic expression evaluation oracle that can evaluate and cache any such expression.
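The "evaluate once and cache" part of that idea can be sketched without Shake. The code below is only an illustration under stated assumptions: Cache, newCache and cached are hypothetical names, the cache is a plain IORef holding an association list, and it is not thread-safe. In Hadrian itself the caching would come from Shake's oracle mechanism (addOracle / askOracle) as in the snippet above.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- | Results of already evaluated expressions, keyed by the expression.
type Cache = IORef [(String, String)]

newCache :: IO Cache
newCache = newIORef []

-- | Evaluate an expression with the supplied action at most once per key;
-- later requests for the same key are answered from the cache.
cached :: Cache -> (String -> IO String) -> String -> IO String
cached cache eval key = do
    known <- readIORef cache
    case lookup key known of
        Just value -> pure value
        Nothing -> do
            value <- eval key
            modifyIORef' cache ((key, value) :)
            pure value
```

Here eval stands for whatever actually computes the value, for instance running a command obtained from configure; calling cached twice with the same key runs eval only the first time.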

The remaining TODOs for install, now that a workable version has been delivered in #312:

  • Automatic dependency generation in installPackages (#342)
  • Test if it works on Windows platform (#345)
  • Write README for using it
  • Solve the build subpath HACK && Drop dependency on “ghc-cabal copy” (#318, #327, #18)
  • Consider whether we need to track dependencies of installed artifacts (#344)
  • Install docs (blocked on #324)

@izgzhen Thanks! I'll do the Windows testing.

By the way, I've just realised I could not assign this task to you since you were not listed as a contributor, so I sent you an invite to join the team. I think this will let you push to the repo directly, but please keep using PRs to avoid clashes.

I think this will let you push to the repo directly, but please keep using PRs to avoid clashes.

Sure, thanks.

What is the status of bindist support? This would be great to have as it would allow us to move the new CI infrastructure over to Hadrian.

@izgzhen Is bindist rule on the to-do list of your Summer of Haskell project?

@izgzhen Is bindist rule on the to-do list of your Summer of Haskell project?

It is not, but I plan to take it over after summer. But feel free to do it yourself or assign to others :)

@izgzhen Thanks! I've unassigned you for now, as the installation rule is almost complete (except for #345). If no one signs up for it I'll try to take care of bindist in August.

What is the status of bindist support?

@bgamari I didn't have time to look into this yet.

It would be more productive if someone familiar with what bindist should do gives this a try, so I've been waiting for a volunteer :) But if no one shows up I'll self-assign.

I am looking at this. As far as I can tell there are two pieces to this (in the unix case at least),

  1. Building the bindist tarball
  2. Implementing the installation logic

In principle (1) is straightforward, being very similar to the source distribution case.

In the case of (2), however, there are a few possible directions to take,

  1. Continue using the current build system for its make install rule
  2. Build a shell script or Makefile specifically for installation
  3. Build a Haskell installation executable

To me it seems like (3) is a bit more complexity than we need and will bloat binary distribution sizes unnecessarily. I suspect (2) is the right choice, but I'm curious to hear what your thoughts are, @snowleopard .

@bgamari I agree. We can rely on option (1) while Make is still in the tree, and (2) will be a good long-term solution. We don't want to build Hadrian or any other executable just to make a binary installation, a script should be sufficient.

Ideally we would be able to reuse Hadrian's install logic while implementing the bindist logic. It would be a shame to have to duplicate this logic. For instance, one might write down a small DSL for installation actions which could be interpreted by Hadrian directly to implement the current hadrian install rule, or interpreted to produce a bindist install script.
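To illustrate the DSL idea, here is a minimal sketch with one description of installation steps and two interpreters: one renders a POSIX shell script for a bindist, the other performs the steps directly. Everything in it (the InstallAction type, its constructors, the file modes, the use of the install command) is an assumption made up for this example, not existing Hadrian code.

```haskell
import System.Directory
    ( copyFile, createDirectoryIfMissing, getPermissions
    , setOwnerExecutable, setPermissions )

-- | A tiny, hypothetical vocabulary of installation steps.
data InstallAction
    = MakeDir FilePath                  -- ^ create a directory
    | InstallFile FilePath FilePath     -- ^ copy a data file: source, destination
    | InstallProgram FilePath FilePath  -- ^ copy an executable: source, destination
    deriving (Eq, Show)

-- | Quote a string for a POSIX shell: wrap it in single quotes and
-- escape any embedded single quote.
shellQuote :: String -> String
shellQuote s = "'" ++ concatMap escape s ++ "'"
  where
    escape '\'' = "'\\''"
    escape c    = [c]

-- | Interpreter 1: render one step as a line of an install script.
toShellLine :: InstallAction -> String
toShellLine (MakeDir dir) = "install -d -m 755 " ++ shellQuote dir
toShellLine (InstallFile src dst) =
    "install -m 644 " ++ unwords (map shellQuote [src, dst])
toShellLine (InstallProgram src dst) =
    "install -m 755 " ++ unwords (map shellQuote [src, dst])

-- | Render a complete script that stops at the first failing step.
toShellScript :: [InstallAction] -> String
toShellScript actions = unlines ("#!/bin/sh" : "set -e" : map toShellLine actions)

-- | Interpreter 2: perform one step directly on the local machine.
runAction :: InstallAction -> IO ()
runAction (MakeDir dir) = createDirectoryIfMissing True dir
runAction (InstallFile src dst) = copyFile src dst
runAction (InstallProgram src dst) = do
    copyFile src dst
    perms <- getPermissions dst
    setPermissions dst (setOwnerExecutable True perms)

-- | Perform all steps in order.
runInstall :: [InstallAction] -> IO ()
runInstall = mapM_ runAction
```

The point of the design is that the list of actions is written once: an install rule could call runInstall on it, while a bindist rule could write toShellScript of the same list into the tarball, so no Haskell executable has to be shipped.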

Hi @snowleopard @izgzhen.
I would like to help in completing this task. Can someone please give me a head start on where to begin? Thanks.

commented

Regarding bindists: I wrote a preliminary binary distribution rule in #445 that I will soon adapt and submit in a pull request, once #531 is merged.

@chitrak7 Thank you! It looks like there is already some progress with this. I suggest we come back to this issue when #531 and #445 are merged.

@chitrak7 also, it looks like the install rule is temporarily disabled in #531, so it would be nice if you can talk with @alpmestan about fixing that part as well :)

I don’t think we want to actually have “install” in hadrian. If you ship the binary distribution you want something portable that doesn’t require building hadrian (or shipping a fully static hadrian binary).

The linked PR puts everything in stage1/{bin,lib} and building the binary distribution ends up being a simple zip/tar archive of those two folders. By being relocatable for the major platforms you can just unpack it and move it where you like. For convenience a short configure script is supplied, such that configure --prefix and make install just work.

@angerman I see, sounds good!

commented

@chitrak7 We can talk (IRC, email?) about what comes after #531 (the binary distribution rule, wrapper generation, etc). What I have is quite simple and doesn't cover all cases, as I take advantage of the relocatability that comes with #531.

Maybe I should put together a patch on top of #531 that adds the binary distribution generation logic, and we can see where you can take it from there (in particular, you might want to look into generating wrapper scripts only when they're required, as well as maybe embedding some more content in the bindists -- I currently only pack bin, lib, docs and a couple of files necessary for running make install on the system where you want to install the bindist, IIRC).

Hi @snowleopard @alpmestan @izgzhen
PR #531 currently has an elementary version of binary distribution implemented. The current version ships the contents of the stage1 compiler's “bin” and “libs” and some other files sufficient for installation. Although this is sufficient for ghc to work, a few files may still be missing. If I am not wrong, the future work in this issue can be summarised as:

  1. Current “bindist” rule needs “phony” rules. We have to instead use the original files and executables that we have to ship.
  2. Add an additional rule to figure out the installation directory on its own depending on the platform on which it is being installed.
  3. Compare with the binary distribution supplied by ghc to check for files/executables that are still not supplied.
  4. Currently, most platforms do not need wrapper scripts anymore. I will implement rules that decide, depending on the platform, whether wrapper scripts need to be generated.
  5. Add more features to the makefile, to match those which are currently supplied with the make-based ghc binary distribution.

commented

Just a small comment: the implementation is not in #531 but will come in a separate PR. For now it's just in a branch (based on #531): alpmestan/hadrian@bye-ghc-cabal...alpmestan:alp/bindist

commented

Since #558 we have a binary distribution rule.

@alpmestan Yes, I think we should close this issue as soon as we fix Hadrian and test the binary distribution rule on all platforms.

commented

Yes, I didn't mean to comment to say "let's close this", just a status update. I also think it would perhaps make sense to leave it open until the bindist rule and the accompanying Makefile become a little more feature complete (compared to the current bindist rules in the make build system).

@alpmestan @snowleopard Can we close this issue now? Or is there still some work required?

commented

@chitrak7 Do we want to be a little smarter in the bindist code as well, like we're planning to do in the testsuite? And just need whatever ways and packages and what not are dictated by the --flavour that we pass to hadrian, instead of assuming just vanilla (like we do now, in Rules.BinaryDist.pkgTarget, for libraries) or any other arbitrary choice like that?

Maybe @bgamari has an opinion on this, as he's our main user for the bindist rule I think. :-)

@alpmestan I think binary distributions do not allow this level of customization. Having said that, I just checked with ghc-8.4.2 binary and saw that we need dynamic libraries too, as well as more rts ways. So maybe we build the binary distribution adding all these ways. We can make a bindist flavour and build that, but need not provide customization of flags.

commented

Yes we need vanilla, profiling and dynamic at the very least I think.

I think we need to solve #626 before we can address this.

We now have Rules.Bindist, so I think we can close this issue and open more specific ones if needed.