This repository has been archived by the owner on Aug 2, 2020. It is now read-only.

Precompile interface file option (better parallelize build) #174

Closed
kgardas opened this issue Jan 15, 2016 · 57 comments

Comments

@kgardas
Collaborator

kgardas commented Jan 15, 2016

I've done a small experiment on a 6-core Xeon, comparing shake -j12 and gmake -j12 builds on Solaris 11.2 on the same code base.
Shake:

real    34m11.983s
user    138m41.855s
sys     40m42.177s

Gmake:

real    39m7.557s
user    167m17.491s
sys     45m34.928s

I've disabled haddock in the gmake build, but I'm still not sure that gmake and shake build the same things. I'm afraid gmake may build more, so the performance of the two builds is, let's say, about the same.
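To put rough numbers on "about the same", the timings above can be reduced to two ratios: the wall-clock ratio between the two builds, and the CPU time consumed per wall-clock second. A minimal sketch using only base; the helper names toSeconds and cpuPerWall are made up for illustration:

```haskell
-- Parse a `time`-style duration such as "34m11.983s" into seconds.
toSeconds :: String -> Double
toSeconds s =
  let (mins, rest) = break (== 'm') s
      secs         = takeWhile (/= 's') (drop 1 rest)
  in  fromIntegral (read mins :: Int) * 60 + read secs

-- CPU seconds consumed per wall-clock second: (user + sys) / real.
cpuPerWall :: String -> String -> String -> Double
cpuPerWall real user sys = (toSeconds user + toSeconds sys) / toSeconds real

main :: IO ()
main = do
  let shake = cpuPerWall "34m11.983s" "138m41.855s" "40m42.177s"
      gmake = cpuPerWall "39m7.557s"  "167m17.491s" "45m34.928s"
  putStrLn ("gmake/shake wall-clock ratio: "
            ++ show (toSeconds "39m7.557s" / toSeconds "34m11.983s"))
  putStrLn ("shake CPU per wall second: " ++ show shake)
  putStrLn ("gmake CPU per wall second: " ++ show gmake)
```

By this crude measure gmake takes about 14% longer in wall-clock time, and both builds keep only a little over 5 of the 12 requested jobs busy on average, so neither comes close to saturating -j12.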
In https://mail.haskell.org/pipermail/ghc-devs/2015-March/008474.html I provided some data about the performance of a highly parallel build. As you can see, performance was quite bad at that time, and it looks like the shake-based build performs about the same. My impression from watching builds scroll by on the console is that the build sometimes just waits on one or two files whose interface files must be generated first, because they are a dependency of a set of other files that then have to wait in a queue instead of being compiled right away. So my idea is: if we can somehow separate the actual compilation of a file from the generation of its interface file, the performance of the parallel build may be much higher. It looks like GHC supports this with the -fno-code -fwrite-interface command-line options. Now the question is how hard it would be to add that capability to the shake-based build. Thanks a lot for your consideration!

@snowleopard
Owner

First of all, let me list the current limitations to help you compare things in a fair way:

  • We only build vanilla way at the moment.
  • Split objects are disabled by default to save build time (this can be changed in Settings/User.hs).
  • Documentation is broken: inplace/bin/haddock expects threaded rts (HSrts_thr), see #98.
  • Some compilation flags may differ leading to different compilation times.

So my idea is if we somehow are able to divide actual file compilation and generation of the file's interface file, then the performance of the parallel build may be much higher. It looks like GHC supports this option with -fno-code -fwrite-interface command-line options.

Interesting idea, I never thought about this. I think we can add support to it in the new build system without much difficulty. At the moment we have the following rule:

    matchBuildResult buildPath "hi" ?> \hi ->
        need [ hi -<.> osuf (detectWay hi) ]

Basically it says: if you'd like to get a *.hi file, go build the corresponding *.o file. We could instead have a separate build rule here with -fno-code -fwrite-interface which produces only *.hi file. Is this what you have in mind? I think I like this, sounds like a simple change with a big benefit potentially.

@kgardas
Collaborator Author

kgardas commented Jan 15, 2016

@snowleopard yes, that's exactly what I had in mind. The only problem is that, for the interface files to stay intact and usable for the later *.o compilation, you probably need to use exactly the same compiler parameters as for that *.o compilation (my wild guess). Anyway, if it is that easy to build just the necessary hi file, then I'm ready to give it a try on my heavily threaded, slow SPARC box next week and see whether I get better performance with 32 threads. Thanks!

@snowleopard
Owner

@kgardas I can enforce absolutely the same build flags for *.hi as for *.o (apart from -fno-code -fwrite-interface) if that's what you mean.

One question: do I have to do -o file.hi instead of -o file.o?

@snowleopard
Owner

Looks like I don't need to use -o at all when producing *.hi files only.
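The two points above (same flags as the *.o compile, plus -fno-code -fwrite-interface, and no -o) can be sketched as a small pure function. This is only an illustration: hiArgs and the sample command line are hypothetical and are not taken from the build system's code.

```haskell
-- Flags that turn a normal compile into an interface-only compile.
interfaceOnlyFlags :: [String]
interfaceOnlyFlags = ["-fno-code", "-fwrite-interface"]

-- Derive the *.hi command line from the *.o command line:
-- drop any "-o <file>" pair and append the interface-only flags.
hiArgs :: [String] -> [String]
hiArgs objArgs = dropOutput objArgs ++ interfaceOnlyFlags
  where
    dropOutput ("-o" : _ : rest) = dropOutput rest
    dropOutput (x : rest)        = x : dropOutput rest
    dropOutput []                = []

main :: IO ()
main = putStrLn (unwords ("ghc" : hiArgs
  ["-O2", "-hidir", "build", "-c", "Foo.hs", "-o", "build/Foo.o"]))
```

Deriving one argument list from the other, rather than maintaining two lists, is one way to keep every other flag identical between the two compiles.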

@snowleopard
Owner

I've added initial support for this feature. To activate it set compileInterfaceFilesSeparately = True in Settings/User.hs. Let me know how it works for you.

Note, I sometimes experience the following errors that I cannot explain:

ghc [...] -fno-code -fwrite-interface -I.build/stage0/compiler
Exit code: 1
Stderr:
Unable to open .build/stage0/compiler/build\Vectorise\Utils\Poly.o

As you can see this happens in the -fno-code -fwrite-interface mode. I have no idea why GHC attempts to access any *.o files in this mode.

@kgardas
Collaborator Author

kgardas commented Jan 16, 2016

@snowleopard First of all thanks a lot for your fast reaction to this RFE. I'm testing the feature and twice I've seen this error:

Error when running Shake build system:
* utils/hpc/stage1/build/tmp/hpc-bin
* .build/stage1/utils/hpc/build/HpcOverlay.o
* .build/stage1/utils/hpc/build/HpcParser.hi
user error (Development.Shake.cmd, system command failed
Command: inplace/bin/ghc-stage1 -hisuf hi -osuf o -hcsuf hc -static -hide-all-packages -no-user-package-db -package-id array-0.5.1.0 -package-id base-4.9.0.0 -package-id containers-0.5.7.1 -package-id directory-1.2.5.0 -package-id filepath-1.4.1.0 -package-id hpc-0.6.0.3 -i -i.build/stage1/utils/hpc/build -i.build/stage1/utils/hpc/build/autogen -iutils/hpc -I.build/stage1/utils/hpc/build -I.build/stage1/utils/hpc/build/autogen -I/tmp/shake-test/libraries/directory/include -I/tmp/shake-test/libraries/unix/include -I/tmp/shake-test/libraries/time/lib/include -I/tmp/shake-test/libraries/containers/include -I/tmp/shake-test/libraries/bytestring/include -I/tmp/shake-test/libraries/base/include -I/usr/include/gmp/ -I/tmp/shake-test/libraries/integer-gmp/include -I/tmp/shake-test/.build/stage1/rts/build -I/tmp/shake-test/includes -I/tmp/shake-test/includes/dist-derivedconstants/header -optP-include -optP.build/stage1/utils/hpc/build/autogen/cabal_macros.h -XHaskell2010 -odir .build/stage1/utils/hpc/build -hidir .build/stage1/utils/hpc/build -stubdir .build/stage1/utils/hpc/build -rtsopts -H32m -O2 -c .build/stage1/utils/hpc/build/HpcParser.hs -fno-code -fwrite-interface
Exit code: 1
Stderr:
ghc-stage1: panic! (the 'impossible' happened)
  (GHC version 8.1.20160114 for i386-unknown-solaris2):
        lookupVers2 GHC.Stack.Types CallStack

Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

)

Perhaps we stretch GHC too much? What also worries me a little is that I sometimes see "compilation IS NOT required" messages while doing a fresh build. I wonder whether we confuse GHC when it sees the hi file and then refuses to compile the o file.
What's also interesting is that the in-tree GMP library is now always built:

/---------------------------------------------------------------\
| Copy file                                                     |
|      input: libraries/integer-gmp/gmp/tarball/gmp-5.0.4.patch |
|  => output: .build/stage0/gmp/gmp-5.0.4.patch                 |
\---------------------------------------------------------------/
| Apply patch .build/stage0/gmp/gmp-5.0.4.patch
| Run configure in .build/stage0/gmp/gmp-5.0.3...
| Run /usr/bin/gmake (MAKEFLAGS=) in .build/stage0/gmp/gmp-5.0.3...

although I've correctly configured system gmp...

@thomie

thomie commented Jan 16, 2016

@kgardas: see https://ghc.haskell.org/trac/ghc/ticket/11331 for that panic. They are working on it.

@kgardas
Collaborator Author

kgardas commented Jan 16, 2016

@snowleopard w.r.t. the gmp issue: after doing a proper gmake clean; rm -rf .build inplace; configure I see that gmp is detected correctly, so this was perhaps an error caused by a merge without running cleanup/configure properly. I'll keep an eye on this anyway.

@kgardas
Collaborator Author

kgardas commented Jan 16, 2016

@thomie thanks! This really helps to know this.

@snowleopard
Owner

I also see failures of GHC sometimes, especially when parallelism is high. Next time I see one, I'll check whether it is the same error.

@kgardas Other than the spurious GMP issue and the GHC panic, was the build successful? Note I've just pushed a fix to suppress another lint error related to GMP: f63e9db.

@snowleopard
Owner

Also what little bit worries me is sometimes messages compilation IS NOT required while doing fresh build. I'm curious if we do not confuse GHC when it sees hi file and then refuses to compile o file?

Yes, this is strange. Are you sure .build directory was removed before the build? At the moment we do not have a proper clean target: #131.

@kgardas
Collaborator Author

kgardas commented Jan 16, 2016

@snowleopard I'll double-check and will let you know. This was on -j12 build. Now I'm trying simple -j1

@kgardas
Collaborator Author

kgardas commented Jan 16, 2016

@snowleopard a complete side note, and I'm not sure whether this belongs in another issue. I usually hit the following error after rm -rf .build inplace when starting the build without the -B option:

user error (Development.Shake.cmd, system command failed
Command: /opt/ghc-7.10.1-i386/bin/ghc -hisuf hi -osuf o -hcsuf hc -static -no-user-package-db -package-db /tmp/shake-test/.build/stage0/bootstrapping.conf -i -i.build/stage0/utils/hp2ps/build -i.build/stage0/utils/hp2ps/build/autogen -I.build/stage0/utils/hp2ps/build -I.build/stage0/utils/hp2ps/build/autogen -odir .build/stage0/utils/hp2ps/build -hidir .build/stage0/utils/hp2ps/build -stubdir .build/stage0/utils/hp2ps/build -rtsopts -H32m -O -no-auto-link-packages -optl-lm -optl-lgmp .build/stage0/utils/hp2ps/build/AreaBelow.o .build/stage0/utils/hp2ps/build/Curves.o .build/stage0/utils/hp2ps/build/Error.o .build/stage0/utils/hp2ps/build/Main.o .build/stage0/utils/hp2ps/build/Reorder.o .build/stage0/utils/hp2ps/build/TopTwenty.o .build/stage0/utils/hp2ps/build/AuxFile.o .build/stage0/utils/hp2ps/build/Deviation.o .build/stage0/utils/hp2ps/build/HpFile.o .build/stage0/utils/hp2ps/build/Marks.o .build/stage0/utils/hp2ps/build/Scale.o .build/stage0/utils/hp2ps/build/TraceElement.o .build/stage0/utils/hp2ps/build/Axes.o .build/stage0/utils/hp2ps/build/Dimensions.o .build/stage0/utils/hp2ps/build/Key.o .build/stage0/utils/hp2ps/build/PsFile.o .build/stage0/utils/hp2ps/build/Shade.o .build/stage0/utils/hp2ps/build/Utilities.o -o inplace/bin/hp2ps -no-hs-main
Exit code: 1
Stderr:
ghc: can't find a package database at /tmp/shake-test/.build/stage0/bootstrapping.conf
)

This is very reproducible. Perhaps it also points to the need for a proper shake clean target...

@snowleopard
Owner

@kgardas There is inplace/lib/package.conf.d which you delete in this way. I presume you have to restart the build after that.

@snowleopard
Owner

@kgardas I've created a separate issue for inplace/lib/package.conf.d problem: #176.

@kgardas
Collaborator Author

kgardas commented Jan 16, 2016

@snowleopard it looks like the lookupVers2 GHC panic is a showstopper for me now. Anyway, on a completely clean build, which was done as

rm -rf .build inplace
gmake clean
./configure <params>
./shake-build/build.sh -B

so effectively -j1 and with compileInterfaceFilesSeparately = True, I see 109 "compilation IS NOT required" messages before the build fails on the GHC panic.

$ grep "IS NOT required" sb-2.0.log |wc -l
     109

the panic hits me while compiling .build/stage1/utils/hpc/build/HpcParser.hs
Thanks for #176

@snowleopard
Owner

@kgardas I'm afraid I don't have any insight into why we see compilation IS NOT required. Can you try to investigate why this happens? It makes sense not to use any -j flags at all during the investigation.

@snowleopard
Owner

The issue with inplace/lib/package.conf.d (#176) should now be fixed.

@kgardas
Collaborator Author

kgardas commented Jan 16, 2016

@snowleopard I've investigated a few cases and the behavioural pattern is always the same:

  1. hs is compiled into o (as a side effect, hi is generated too)
  2. hs is compiled into hi (the message is printed)

I'm not sure why the hs -> hi compilation is run at all when it's not needed. A bug somewhere?
E.g.

/----------------------------------------------------------------------------------------\
| Run Ghc Stage0 (package = Cabal)                                                       |
|      input: libraries/Cabal/Cabal/Distribution/Simple/GHC/IPI642.hs                    |
|  => output: .build/stage0/libraries/Cabal/Cabal/build/Distribution/Simple/GHC/IPI642.o |
\----------------------------------------------------------------------------------------/
/-----------------------------------------------------------------------------------------\
| Run Ghc Stage0 (package = Cabal)                                                        |
|      input: libraries/Cabal/Cabal/Distribution/Simple/GHC/IPI642.hs                     |
|  => output: .build/stage0/libraries/Cabal/Cabal/build/Distribution/Simple/GHC/IPI642.hi |
\-----------------------------------------------------------------------------------------/
compilation IS NOT required

@snowleopard
Owner

@kgardas Ah, I see! Now I understand what's going on.

I'm not sure why hs-> hi compilation is run at all when it's not needed.

I think this is because the *.o rule does not actually tell Shake that it also produces *.hi files.

We probably need to add (one of?) the following steps:

  • In the beginning of *.o rule add a need for the *.hi file, forcing the *.hi rule to fire first.
  • Tell Shake that the *.o rule also produces *.hi files and specify which rule has higher priority, so that when a *.hi file is needed Shake knows which rule to run. I think we need to give a higher priority to the *.hi rule.

Maybe we only need to do the second step.

I'll do a quick experiment and will commit a fix if it works for me.

@snowleopard
Owner

I confirm that I'm also hit by the lookupVers2 GHC panic.

This also happens when generating the interface for HpcParser.hs, so I temporarily disabled the optimisation for this particular file.

snowleopard added a commit that referenced this issue Jan 17, 2016
@snowleopard
Owner

I've committed some further work on this.

The build has finished now, improving average parallelism from 2.61 to 3.00 with -j4. However, I got a lot of lint errors.

@thomie

thomie commented Jan 17, 2016

@ezyang might have something to say about this ticket.

@snowleopard
Owner

Just to clarify the reason behind the lint errors in the current implementation: we first create *.hi files via -fno-code -fwrite-interface, but then we write them again when producing the corresponding *.o file. Shake thinks this is dodgy and gives a lint error.

Is there a way to disable writing of interface files in the normal mode of operation? While searching I came across -fno-write-interface, but not sure it's what we need.
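The double write described above can be pictured with a toy model: record which rule wrote which file, then report any file written by more than one rule. This is only an illustration of the idea and is not how Shake's lint is actually implemented; the rule names and the conflicts helper are made up.

```haskell
import Data.List (nub, sort)

-- One event: a rule name and a file that rule wrote.
type Write = (String, FilePath)

-- Files written by more than one distinct rule, which a lint pass would flag.
conflicts :: [Write] -> [FilePath]
conflicts ws =
  [ f | f <- nub (sort (map snd ws))
      , length (nub [ r | (r, f') <- ws, f' == f ]) > 1 ]

main :: IO ()
main = mapM_ putStrLn (conflicts
  [ ("hi-rule", "Foo.hi")   -- written via -fno-code -fwrite-interface
  , ("o-rule",  "Foo.o")
  , ("o-rule",  "Foo.hi")   -- rewritten as a by-product of the *.o compile
  ])
```

In this model the only way to silence the report is for exactly one rule to own Foo.hi, which is why the question of suppressing the second write comes up.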

@ggreif
Contributor

ggreif commented Jan 17, 2016

Shot in the dark: you can use -hisuf to divert them. What happens when specifying "-hisuf /"?

On Sunday, 17 January 2016, Andrey Mokhov wrote:

Just to clarify the reason behind the lint errors in the current
implementation: we first create *.hi files via -fno-code -fwrite-interface,
but then we write them again when producing the corresponding *.o file.
Is there a way to disable writing of interface files in the normal mode of
operation? While searching I came across -fno-write-interface, but not
sure it's what we need.


@ndmitchell
Collaborator

@kgardas ok, willing to believe in certain situations code gen could dominate. Could you measure on one module you think makes a difference? A useful number to guide us.

This technique trades additional cost for more parallelism, but I always want parallel for free where possible. I wonder if GHC could be persuaded to generate the C plus the command to compile, then Shake could do both pieces separately.

@ezyang

ezyang commented Jan 17, 2016

Obviously, typechecking can't be parallelized, unless you actually go and modify GHC a bit. The hope here is to parallelize optimization and code generation.

Unfortunately, the results you get optimizing here are not going to be as good as doing it the normal way. When you write out an interface for -fno-code -fwrite-interface, we write out the contents of the interface PRIOR to doing any optimization. Among other things, this means that unfoldings are not included. Put differently, your scheme is analogous (actually, it performs a bit worse, because you're typechecking the internal contents of functions too; but this means it works even if you're missing type signatures) to having written hs-boot files for every module in your project, and changing all your imports to {-# SOURCE #-} (existing cycles notwithstanding! You need two levels of hs-boot to deal with that.) Everyone knows that you can't inline an implementation when compiling against an hs-boot... because there is no implementation to inline.

In principle, such a system may still be useful for incremental recompilation, because avoiding unfoldings means that things recompile less when you make modifications; it also means that you can parallelize optimization and code generation (so this IS a little different from compiling with -O0; the point being that even if you can't inline across module boundaries, you still want to optimize within a module. It's unclear how useful this actually would be, because cross-module inlining matters a lot for optimization.)

In any case, the current implementation is a bit dodgy, because GHC does not guarantee that an -fno-code -fwrite-interface interface will let you successfully build code that can use the normal -O2 compilation; we strive for determinism but GHC is not fully deterministic, and in any case important names may change between the two compilations. You'll need more assistance from GHC.

If you still want to implement this, here's how GHC could be adjusted to make this possible:

  1. Resurrect my fat interface patch https://ghc.haskell.org/trac/ghc/ticket/10871 which lets you divide the compilation of a Haskell file into two steps: compilation to a "buildable" interface file, and then actual optimization and code generation of the buildable interface file. Type-checking is done with respect to fat interface files, whereas optimization is done with respect to "real" interface files. The benefit is that you get to typecheck things more quickly (since the partial results are popping out more quickly); the downside is that there's more serializing/deserializing (which could erase your performance gains) and the actual optimization/code generation will not perform/parallelize any better.
  2. Add a new mode to GHC which says "optimize inside a module, but maintain AS MUCH API stability at each module boundary as possible." So, in effect, something like -O0 but which still optimizes inside a module. Doing this will immediately improve recompilation times. Then, an interface file can be written out immediately after type-checking, and all you need is to somehow arrange for Shake to realize that intermediate results for hi files have come out (meaning that you can start type-checking other things.)

@ndmitchell
Collaborator

Brain wave! I think @ezyang's comments mean this isn't going to really give us what we were hoping for. But, there is an alternative which might be faster in all circumstances, never noticeably slower than a normal compile, and therefore could become the default. My scheme:

  • Compile as normal, but configure the C compiler to be at a different location (either by tweaking the PATH or with flags).
  • Substitute in a fake C compiler that just records the command line, and writes out an empty .o file, but does nothing more.
  • Then have a second rule that takes the C compiler command line and runs it.

You get the parallelism of compiling and codegen separately, without having to do anything funky to GHC.

There's the assumption that GHC won't break if we replace the .o file after (I think that's fine. Compilation checking is the only possible niggle, and I think compilation checking is timestamp only.) and that GHC doesn't use the .o file in that compile (I don't think that's true for stub files, but I think it's true most of the time).
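The fake C compiler from the scheme above could look roughly like the following. It is a minimal sketch, not part of any build system: the ".cmd" suffix for the recorded command line is an arbitrary choice made for illustration, and the program only looks at the -o argument.

```haskell
import System.Environment (getArgs)

-- Find the file named after "-o" on a compiler command line, if any.
outputFile :: [String] -> Maybe FilePath
outputFile ("-o" : f : _) = Just f
outputFile (_ : rest)     = outputFile rest
outputFile []             = Nothing

-- Stand-in for the C compiler: record the command line, emit an empty
-- object file, and do no real work. The ".cmd" suffix is an assumption.
main :: IO ()
main = do
  args <- getArgs
  case outputFile args of
    Nothing  -> putStrLn "fake-cc: no -o given, nothing recorded"
    Just obj -> do
      writeFile (obj ++ ".cmd") (unlines args)  -- replayed by a later rule
      writeFile obj ""                          -- placeholder object file
```

A second rule would then read the recorded file and run the real compiler with those arguments. As pointed out later in the thread, the generated C source is a GHC temporary file, so a real version would also have to copy it somewhere safe before GHC exits.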

@kgardas
Collaborator Author

kgardas commented Jan 17, 2016

@ezyang thanks a lot for the in-depth explanation. I was afraid that we might hit a wall here by exploring paths not directly tested or supported by GHC. From your description it looks like the -fwrite-interface -fno-code combination is simply buggy in GHC, as it writes the interface too early, probably at a different place than where the interface is written during normal compilation. Your reference to the GHC ticket leads to quite a lot of other information w.r.t. backpack etc., which is quite hard to distill.
Anyway, my conclusion from this is that GHC's interface is also kind of API-unstable depending on the actual optimization level? If this is true, then hic sunt dracones and this path is probably not passable (sigh, non-native English speaker here). So it was a bad assumption on my side that the "interface" file is API-stable...

@kgardas
Collaborator Author

kgardas commented Jan 17, 2016

@ndmitchell interesting idea, but what about something different? Let's test whether GHC already writes hi files as quickly as possible (but reliably, as @ezyang pointed out), or change it to do so. The write should be kind of transactional: write hi to hi_promise and, once done, rename hi_promise to hi (or something like that), and then continue with compilation to asm/llvm/C as usual. Then hack shake to not wait for ghc compiling hs -> hi,o but to check for hi only and, if it's there, fire off new compilations of dependent modules. Am I clear on this? IMHO this is what GNU make (or the current build system) is not able to do, and this is why it sucks at parallel compilation...
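The write-then-rename step proposed above can be sketched in a few lines. This is an illustration only, not GHC code: publishInterface and the "_promise" suffix are hypothetical names, and the sketch assumes both files live on the same filesystem, where renameFile from the directory package replaces the target atomically where the platform allows it.

```haskell
import System.Directory (doesFileExist, renameFile)

-- Write the interface to a temporary "promise" name, then rename it into
-- place, so a watcher never observes a half-written .hi file.
publishInterface :: FilePath -> String -> IO ()
publishInterface hi contents = do
  let promise = hi ++ "_promise"
  writeFile promise contents
  renameFile promise hi

main :: IO ()
main = do
  publishInterface "Demo.hi" "interface contents"
  ready <- doesFileExist "Demo.hi"
  putStrLn (if ready then "Demo.hi is ready; dependents may start"
                     else "still waiting")
```

The existence check in main stands in for whatever the build system would do to notice the file, whether by polling or by file-system notification.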

@ndmitchell
Collaborator

@kgardas - I think my hacking the C file is just an implementation of promises that would be simple and quite robust. Having a Shake rule that starts, produces something, but then continues and produces more without the possibility of pausing it, doesn't really fit with Shake. Polling for the .hi file or using notify techniques is a pain. I think the benefit over my promises technique would be small (you save computing the ASM, but little else), and the cost high (polling, watching, Shake side implementation).

@ndmitchell
Collaborator

I should say, while this "doesn't really fit" with Shake, I have no doubt it is somehow possible, so if that becomes the only reason holding this back, I'll have a think.

@kgardas
Collaborator Author

kgardas commented Jan 17, 2016

@ndmitchell your C compiler hack is interesting, but I fear the build system for unregisterised and registerised builds would then be different, which is something I would try to avoid. Or perhaps I just do not understand how hard it would be to put your idea into GHC's shake build machinery...

@kgardas
Collaborator Author

kgardas commented Jan 17, 2016

@ndmitchell yes, my idea is about polling for a kind of intermediate result, since the compilation of other modules depends on this intermediate result and not on the final result of compilation, i.e. hi files versus o files. Surely the final link depends on o files, but this is not important in a situation where, e.g., the build process compiles DynFlags.hs and you wait ~30 minutes for the C compiler to finish compiling the generated DynFlags C files while the hi is already on the drive and your machine is idle, since only one thread of the 32 available is working. ;-) (DynFlags is the classic example which I need to break here).

@ndmitchell
Collaborator

Hmm, true, I hadn't thought of registered vs unregistered - that may be the place where GHC mangles the .o file afterwards, which is mostly fatal to my technique. That said, maybe combining our techniques gives something better - when GHC calls out to the C compiler we know the .hi file is done, so we could use that as the trigger to allow everything else to continue.

@ndmitchell
Collaborator

Is it really 30 mins to build DynFlags C code? That seems like a bug - as though GHC is tickling some bad complexity in the C compiler. Is that using gcc, or some ancient system compiler?

@kgardas
Collaborator Author

kgardas commented Jan 17, 2016

@ndmitchell next week I will measure DynFlags for you, but please keep in mind this is an UltraSPARC T1, so basically a 1 GHz single-issue in-order 8-core machine with 4 threads per core. Old and tuned for highly parallel work indeed. Also, if you are curious, just test --enable-unregisterised on Linux and see how the build times differ. NCG is really a kind of speed here... Unfortunately I still have some bugs to fix in the SPARC NCG before the build can be sped up this way...

@kgardas
Collaborator Author

kgardas commented Jan 17, 2016

@ndmitchell a small correction to your fake C compiler idea. The problem to solve is that by the time the real C compiler is invoked, the original C source is long gone, since it is a GHC temporary file which is deleted on GHC process exit (more or less). So your idea also involves copying the C code to some other temporary location to prevent its deletion, and then compiling that copy.

@ezyang

ezyang commented Jan 17, 2016

Let me add, it's relatively straightforward using the GHC API to bail out before running the compiler/assembler. But I really don't think it's C's fault. For example, on DynFlags, I'm pretty sure the bottleneck is due to instance deriving on the giant data structure.

@kgardas
Collaborator Author

kgardas commented Jan 17, 2016

@ezyang by your last note about DynFlags, do you suggest that DynFlags.hi will not be generated as soon as I hope? Well, I will need to test this for real for sure...

@ezyang

ezyang commented Jan 17, 2016

I don't know. Here's the GHC bug tracking: https://ghc.haskell.org/trac/ghc/ticket/7258

Whether or not it generates quickly enough depends on whether or not the type checker is slow (if so, nothing will help), or if the optimizer is slow (if so, -O0 or -fno-code -fwrite-interface will help), or if the code generator is slow (if so, eagerly writing the interface will help.)

@snowleopard
Owner

Thanks all for your input. Very interesting discussion and I hope we'll eventually find a way forward.

In the meanwhile, shall I remove my experimental implementation from the codebase? It looks like it won't bring us to a right solution. Or will anyone still like to play with it? The compileInterfaceFilesSeparately flag is off by default, so we can keep it (at the cost of less readable code).

@kgardas
Collaborator Author

kgardas commented Jan 17, 2016

@snowleopard could you be so kind as to keep it, at least for benchmarking purposes? IMHO if it's done correctly then it may represent the parallel build times we can get either with shake polling for the hi file as an intermediate result or with a fixed ghc emitting a correct hi file... At least it's good for reference, isn't it? I think @ndmitchell's idea of a fake C compiler may run only a little slower, due to the wait for the actual C code to be generated...

@snowleopard
Owner

@kgardas OK, let's keep it for now. Let us know if you get any interesting benchmarking results.

@kgardas
Collaborator Author

kgardas commented Jan 20, 2016

Hi,
first of all, @ezyang was right that the current -fno-code -fwrite-interface machinery does not work as it should and generates wrong interface file(s). I've done the promised testing on the SPARC T1 and the results are:

  • DynFlags.o compilation (of GHC 8.0.1 RC1, done by GHC 7.10.1) takes ~30 minutes.
  • DynFlags.hi compilation done by -fno-code -fwrite-interface takes ~20 seconds; the resulting file is 88 kB.
  • DynFlags.hi as a by-product of the DynFlags.o compilation (I ran the DynFlags.o compile with -v to see the writeBinIface messages) took ~13 minutes. The result is obviously the correct hi file.
  • After the hi file is saved, C code generation took ~6 minutes and GNU C took ~11 minutes.

Now, the question is how hard it is to fix GHC's -fno-code -fwrite-interface to behave correctly...
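The DynFlags measurements above translate directly into how early dependent modules could start under the strategies discussed in this thread. A small sketch of that arithmetic; the figures are the approximate ones reported above and the names are made up for illustration:

```haskell
-- Minutes, taken from the DynFlags measurements above (all approximate).
hiWritten, cCodegen, gnuC :: Double
hiWritten = 13   -- until writeBinIface during the normal *.o compile
cCodegen  = 6    -- generating C after the interface is saved
gnuC      = 11   -- GNU C compiling the generated code

-- Earliest minute at which modules importing DynFlags could start.
waitForObject, waitForCCompilerCall, waitForInterface :: Double
waitForObject        = hiWritten + cCodegen + gnuC  -- current behaviour
waitForCCompilerCall = hiWritten + cCodegen         -- C compiler call as trigger
waitForInterface     = hiWritten                    -- react to the .hi directly

main :: IO ()
main = mapM_ print [waitForObject, waitForCCompilerCall, waitForInterface]
```

On these numbers, reacting to the correct hi file as soon as it is written would release dependents about 17 minutes earlier than waiting for DynFlags.o, while the roughly 13 minutes before the interface is written remain on the critical path either way.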

@kgardas
Collaborator Author

kgardas commented Jan 20, 2016

A note about correct and incorrect hi files for DynFlags:

$ ls -la compiler/stage1/build/DynFlags.hi*
-rw-r--r--   1 karel    karel    1691874 Jan 20 10:12 compiler/stage1/build/DynFlags.hi
-rw-r--r--   1 karel    karel      88797 Jan 20 09:39 compiler/stage1/build/DynFlags.hi-no-code-write-interface

so hi-no-code-write-interface is generated by -fno-code -fwrite-interface, which took those 20 seconds. The other one is from the ordinary compilation, which took ~13 minutes to get to its write.

@olsner

olsner commented Jan 20, 2016

I've experimented some with this too, though in the makefile system (most of it messing about just to get the rules to run properly for .hi and .o files). The current state of that mess is at https://github.com/olsner/ghc/commits/separate_hi_2

That included some code to write the "full" interface with -fno-code, attempting to get the same interface generated as when running without -fno-code: olsner/ghc@b762186

An issue with this is that the interface sometimes changes when building the .o file, and then you end up with e.g. DynFlags.o compiled against a new interface and dependent modules built against an old mismatching interface, and the dependent modules sometimes end up referencing symbols that aren't there or got different names in the actual object file...

To detect that, I added some code to panic instead of updating the interface when subsequently compiling the .o file:
olsner/ghc@3741e25
I had some debugging code to print the old/new interface too, and iirc the diffs seemed mostly to boil down to non-determinism in ghc, like different unique names.
Fat interfaces or properly dependable determinism seem like what we'd need to make it work.

@snowleopard
Owner

There hasn't been much activity here and it looks like the approach we were exploring got stuck anyway, so I'm inclined to close this issue for now. I will also remove the associated experimental code, as it often makes it more difficult for me to work with the build system. We can always bring it back if need be.

@olsner

olsner commented Sep 6, 2016

https://ghc.haskell.org/trac/ghc/ticket/4012 seems to have made good progress since my last experiment, so it might be worthwhile to make an attempt at picking this up again. It would probably also catch #216 as a bonus, by having the .hi files as actual targets rather than by-products of .o compiles.

@snowleopard
Owner

@olsner Note: issue #216 can be solved by using Shake's multiple-output rules, which I plan to implement soon (currently progress on Hadrian is slow due to various other commitments). This is not too difficult, but requires some refactoring of Hadrian. Still much simpler, I think, than solving this issue.

However, if you wish to come back to this issue and optimise the build that would be great! If you'd like to do so, I'd suggest to open a new issue with an outline of the proposed approach, because this thread got a bit too long to make sense of all the discussions.
