As you might know, Coq Platform has a parallel build (call opam install for all packages at once) and a sequential build (call opam install for one package after another). The sequential build obviously depends on the order. I took some care that packages are listed in dependency order, although this is not perfect. If I restructure the grouping, this is likely to break widely.
I wonder if there is a better way. The ocaml builds are probably not that memory hungry as the Coq builds, so one would have to take care that the number of coqc calls is limited. Does someone have an idea how one could achieve this?
The sequential build is mostly required on machines with low memory. The installer script tells the user:
With 4 GB+1GB swap a sequential package build with 2 make jobs is recommended.
With less RAM, you might have to remove failing packages, e.g. VST.
Doesn't opam work out the dependencies when you try to build all at once? Perhaps you can use this info to build one at a time. Also can't you just do a parallel build and limit it to 1 job?
From the docs, it seems that -j1 should indeed be a sequential build: https://opam.ocaml.org/doc/FAQ.html#What-does-the-jobs-option-do-It-doesn-39-t-seem-to-enable-parallel-builds
OTOH, if all packages pass it to make by calling make "-j%{jobs}%"
, it seems -jN
will use up to N * N
jobs? Do Coq packages do this?
but that’s a different question, and -j1
should solve the main problem.
Yes, I can do -j1, but it gets really slow then. Most machines have at least 4 cores (or threads) but some don't have enough RAM to run 4 coqc. The issue with opam is, as you say, that the argument to -j is in essence squared, because on the one hand it is the number of parallel jobs opam runs, but then many opam packages pass it as -j argument to sub make jobs. In essence I would need a -j sqrt(2).
With -j1 the build time of the full platform might be 6 hours on some machines. With -j2 sequential it usually not worse than 3. With -j2 parallel build, it doesn't run through with 4 GB RAM + 1 GB swap.
Half of me is thinking about make’s jobserver protocol (but I don’t think that’s easy), the other half about sed ‘s/-j”%{jobs}”/-j”%{other_var}”/‘
or just sed ‘s/-j”%{jobs}”/-j2/‘
(over find . -name ‘*.opam’
) and then using opam -j1
to get your “1 package, 2 threads” build
patching the packages might be of separate interest, since this squaring is not a perfect solution anyway.
of course such a sed
is very error-prone — you can use sed
, review the changes and iterate, or do them entirely by hand, or anything in between.
Make's jobserver is the correct solution and it isn't even that hard. (Opam would just need to open a pipe, fill it with nb-jobs bytes, define an environment variable with the name of the pipe, and forget about it.)
Indeed, @Janno argued the same (I should have tagged him earlier); I was a bit skeptical, but https://www.gnu.org/software/make/manual/html_node/Job-Slots.html#Job-Slots makes it sound more plausible.
The difficult part would be if Opam itself wanted to act accordingly to the job server, but that is a separate issue.
@Guillaume Melquiond : can you elaborate a bit on your last point?
What I wonder is how this works together with the different make systems packages use. There are dune, make, remake, cmake and maybe a few more plus the system package managers which sometimes build from sources.
I guess for the time being I will stick to the package sequential build with -j=2. In the Windows installer generator I have some scripts to sort packages by their depth in the dependency tree - maybe I use this. But I guess before making things complicated I should just try if the sequential build still works on low memory VMs. There are only few packages which need > 1GB RAM per coqc.
@Guillaume Melquiond : would it be possible to create the pipe and environment variable on the level of the script which calls opam?
My last point is related to the way the job server works. The idea is that, whenever a build system wants to start a job, it needs to consume a byte (and whenever it ends, it needs to put one back). More precisely, when a build system (e.g., Make) starts, it assumes that a byte was already consumed (otherwise it would not have started), so it will not consume another byte for its first job. So, if there are three bytes in the pipe, if two instances of Make are run, you will end up with up to five jobs running in parallel. If you really wanted only three jobs in total, Opam should have consumed one byte before starting each instance of Make, so that only one byte is left in the pipe. That is the part of the implementation that is a bit more invasive.
Anyway, yes, setting up the pipe in the script should work. But what I just said still holds. If you put three bytes in pipe and call Opam with -j3
, you might still end up with six jobs in total, since Opam will not consume any byte.
As for the build systems that support the job server, Make and Cmake do. Remake does not, but that could be the incentive for me to finally implement it. Dune does not, but there is an open pull request for it. And it seems there are plans for Opam too: https://github.com/ocaml/opam/wiki/Spec-for-GNU-make-jobserver-support
Last updated: Jun 03 2023 at 05:01 UTC