Thursday, May 27, 2010

A new user interface for LHC

The current user interface for LHC is pretty unwieldy: you have to invoke lhc twice, once to generate an external core file and once more to generate the executable with LHC itself.

There are a couple of problems with this:
  1. It requires -you- to keep track of the generated .hcr files, which is a PITA.
  2. It makes the test suite complicated, as we currently maintain our own regression tool to handle things like #1. I would like to use Simon Michael's excellent shelltestrunner instead, but the two-step compilation process would make the test files nastier than they need to be.
  3. It made some of LHC's code very gross: we basically copied GHC's "Main.hs" file and stuck it in our source tree with some modifications, because we need to be able to accept all GHC options, even "insert arbitrary ghc option here" (for general usage and for cabal install support). This was, as you could guess, incredibly fragile in terms of maintenance and forwards/backwards compatibility.

So now I've devised a new approach. We will instead run GHC in the background, twice. The first time, we will call GHC to compile your code with the options you provided, and we will generally stick something like '--make -fext-core -c' onto your command line to generate external core. The second time, we will call GHC with the '-M' command line flag, which tells it to generate a Makefile describing the dependencies between modules. Running it on Tom Hawkins's atom project, you get something like this:

# DO NOT DELETE: Beginning of Haskell dependencies
Language/Atom/Expressions.o : Language/Atom/Expressions.hs
Language/Atom/Elaboration.o : Language/Atom/Elaboration.hs
Language/Atom/Elaboration.o : Language/Atom/Expressions.hi
Language/Atom/Analysis.o : Language/Atom/Analysis.hs
Language/Atom/Analysis.o : Language/Atom/Expressions.hi
Language/Atom/Analysis.o : Language/Atom/Elaboration.hi
Language/Atom/Scheduling.o : Language/Atom/Scheduling.hs
Language/Atom/Scheduling.o : Language/Atom/Elaboration.hi
Language/Atom/Scheduling.o : Language/Atom/Analysis.hi
Language/Atom/Language.o : Language/Atom/Language.hs
Language/Atom/Language.o : Language/Atom/Expressions.hi
Language/Atom/Language.o : Language/Atom/Elaboration.hi
Language/Atom/Language.o : Language/Atom/Elaboration.hi
Language/Atom/Common.o : Language/Atom/Common.hs
Language/Atom/Common.o : Language/Atom/Language.hi
Language/Atom/Code.o : Language/Atom/Code.hs
Language/Atom/Code.o : Language/Atom/Scheduling.hi
Language/Atom/Code.o : Language/Atom/Expressions.hi
Language/Atom/Code.o : Language/Atom/Elaboration.hi
Language/Atom/Code.o : Language/Atom/Analysis.hi
Language/Atom/Compile.o : Language/Atom/Compile.hs
Language/Atom/Compile.o : Language/Atom/Language.hi
Language/Atom/Compile.o : Language/Atom/Elaboration.hi
Language/Atom/Compile.o : Language/Atom/Scheduling.hi
Language/Atom/Compile.o : Language/Atom/Code.hi
Language/Atom.o : Language/Atom.hs
Language/Atom.o : Language/Atom/Language.hi
Language/Atom.o : Language/Atom/Common.hi
Language/Atom.o : Language/Atom/Compile.hi
Language/Atom.o : Language/Atom/Code.hi
# DO NOT DELETE: End of Haskell dependencies

This tells us where all the generated object files are. GHC will put the external core files next to these object files (in all cases, as you cannot redirect the output location of external core files). So we can just parse this simple Makefile, take the '.o' targets, remove duplicates, and replace each '.o' extension with '.hcr'. LHC takes care of the rest.
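To make the parsing step concrete, here is a rough sketch of it. It is written in Python purely for brevity and is not LHC's actual code; the function name `core_files_from_makefile` is made up for illustration, and it assumes every dependency line has the `target : dependency` shape shown above.

```python
def core_files_from_makefile(makefile_text):
    """Return the '.hcr' paths implied by the '.o' targets, in first-seen order.

    Hypothetical helper: assumes each rule looks like 'Foo/Bar.o : Foo/Baz.hi'.
    """
    seen = set()
    core_files = []
    for line in makefile_text.splitlines():
        line = line.strip()
        # Skip blank lines and the '# DO NOT DELETE' marker comments.
        if not line or line.startswith("#"):
            continue
        target, sep, _dependency = line.partition(" : ")
        # Ignore anything that is not an object-file rule.
        if not sep or not target.endswith(".o"):
            continue
        # Swap the '.o' extension for '.hcr'.
        core = target[: -len(".o")] + ".hcr"
        # Each target appears once per dependency, so drop the repeats.
        if core not in seen:
            seen.add(core)
            core_files.append(core)
    return core_files


sample = """\
# DO NOT DELETE: Beginning of Haskell dependencies
Language/Atom/Expressions.o : Language/Atom/Expressions.hs
Language/Atom/Elaboration.o : Language/Atom/Elaboration.hs
Language/Atom/Elaboration.o : Language/Atom/Expressions.hi
# DO NOT DELETE: End of Haskell dependencies
"""

print(core_files_from_makefile(sample))
```

Keeping the first-seen order is a deliberate choice here: in the output above, each module is listed after the modules it depends on, so the resulting list comes out in dependency order without any extra sorting.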

This is of course in the event you want to compile an executable. If you want to compile a library, it's mostly the same, only when we parse the files we just store them for later.

But what about "obscure ghc option"? No fear! We'll just provide something like a --ghc-options flag whose contents get passed on to GHC's invocations. LHC can then have its own, more general command line interface to control various options in the whole-program stages (on this note, Neil Mitchell's cmdargs library is amazing for this stuff!)
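As a rough illustration of how the pass-through could work, the sketch below builds the argument lists for the two GHC invocations described earlier. It is Python for brevity, not LHC code; the `ghc_invocations` name and the exact shape of the interface are assumptions, and the only GHC flags used are the ones mentioned in this post. It only constructs the command lines and does not run GHC.

```python
import argparse
import shlex


def ghc_invocations(argv):
    """Build the two GHC command lines from a hypothetical lhc command line."""
    parser = argparse.ArgumentParser(prog="lhc")
    # Assumed flag, as proposed above: a single string of options for GHC.
    parser.add_argument("--ghc-options", default="",
                        help="extra options passed through to GHC")
    parser.add_argument("sources", nargs="+", help="Haskell source files")
    args = parser.parse_args(argv)

    # Split the option string the way a shell would, respecting quotes.
    extra = shlex.split(args.ghc_options)

    # First run: compile and emit external core.
    compile_cmd = ["ghc", "--make", "-fext-core", "-c"] + extra + args.sources
    # Second run: ask GHC for the module dependency Makefile.
    deps_cmd = ["ghc", "-M"] + extra + args.sources
    return compile_cmd, deps_cmd


compile_cmd, deps_cmd = ghc_invocations(["--ghc-options=-O2 -Wall", "Main.hs"])
print(compile_cmd)
print(deps_cmd)
```

The point of the design is that LHC never has to understand the GHC options themselves: it forwards them untouched to both invocations, so any new or obscure GHC flag keeps working without changes on our side.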

For default options to GHC, I think we should perhaps stick to the Haskell 2010 standard - that is, by default, LHC will run GHC with language options to enable compilation of compliant Haskell 2010 code without any OPTIONS_GHC or LANGUAGE pragmas. Optimization levels for GHC can be implied by LHC's supplied optimization level or explicitly via --ghc-options.

Comments are always welcome.
