I wonder if we should have a Coq Universe based Coq Platform build - that is that users can choose if they want to use opam or dune as top level build manager. Although the two projects seem way parallel at first glance, this might not be so. I see Coq Platform mostly as a release management and social engineering project and Coq Universe mostly as a build technology project. But there is of course overlap, like providing a cross project CI.
@Michael Soegtrop it shouldn't matter much IMHO, as at some point many opam packages will use Dune. Coq universe is more important where you need incremental builds, then, it is just much faster (and cached) than the other alternatives. For example CI, working on several projects at the same time, etc...
I definitely see the advantages of dune, and I expect that e.g. plugin developers will switch to use dune as top level build manager. What I want to avoid is that developers and users and/or release managers drift apart by using different build systems.
As far as I understand Coq Universe is - as is now - for building master branch code. It might be of advantage if developers could easily build on master and the latest / upcoming / older Coq Platform release with the same level of convenience for developers.
@Michael Soegtrop I started Coq universe building with 8.15 , in principle there is no problem having versioned branchs and we expect to do so
We are just waiting to have our submodules tooling a bit more advanced so we can automate that
But indeed if anyone wants to do a 8.15/8.16 branch for Coq universe absolutely go ahead and submit a pull request
what to add to the Coq universe build is mainly controlled by the top level (dirs ...)
stanza in the dune
file
Indeed, there should be no drifting apart because opam packages will gradually all start relying on Dune internally (so there should eventually be virtually just one build system). Using coq-universe directly instead of through opam can bring faster builds though, so this may be a good thing to (eventually) use it for building the binary installers. As for providing a coq-universe option for users of the interactive scripts, if you do not provide them an opam root, then their installation is not customizable, so it's not really useful to go through the interactive scripts.
What I am after is an easy way to build exactly the same thing as Coq Platform XYZ with dune as top level build manager.
Michael Soegtrop said:
What I am after is an easy way to build exactly the same thing as Coq Platform XYZ with dune as top level build manager.
Would it be possible to just point coq-platform at a particular commit hash of coq-universe as the source for coq packages and coq itself? It seems like coq-platform builds a lot of stuff using adhoc scripts, and it wouldn't be very advantageous to port those to dune.
No, Coq Platform doesn't build that much with ad hoc scripts. It just wraps around opam.
And the packages in the Coq Platform are curated. So it wouldn't be appropriate to point to a set of packages that can evolve arbitrarily with time.
A commit hash in coq-universe would fully specify every single package in it I would think
Yes. But I am talking about the package set here, not the package versions.
To be in the Platform, maintainers have to agree to abide by a charter, so not all packages that build at a certain point in time should be included.
I see. Similar to stackage.
Yes.
Furthermore, depending on the way the versions in the coq-universe are updated, it might be never the case that a certain coq-universe commit contains all the released versions that were blessed by package maintainers.
In any case, what I meant that it would be possible for coq platform to use dune instead of opam to do its build work if the set of curated packages is only ported to dune.
The platform gives you a switch, you can the add extra stuff if you like. OPAM gives you search, downloads, and build. It also lets you apply patches, list the installed files to build binary installers.
The platform is a distribution. Opam is like apt. Dune is like make. IMO this thread is comparing oranges with apples.
@Rudi Grinberg Coq Platform is mostly a web of mutual benefit social agreements to maintain a largish collection os software which results in a collection of packages and package versions which are tested to work with each other and this way define a standard platform for 6 months or so. Really > 90% of the work is organizing a release.
How one actually builds this very specific set of software is not something Coq Platform is religious about - we try to support various methods for people with different backgrounds / objectives.
Coq Universe is a new and exciting build/installation method that would make sense to support.
The only requirement besides that this build method is of advantage for a non negligible fraction of the Coq users (which I believe is the case) would be that one can control the collection of packages and package versions (that is git tags) used.
In any case, what I meant that it would be possible for coq platform to use dune instead of opam to do its build work if the set of curated packages is only ported to dune.
Yes that's also what I meant: let's start with the subset which already uses dune, let's use the mentioned "social web" to get more things into dune and let's write the - likely not very complicated - scripts which explain to Coq Universe what to build exactly.
There could be many ways of doing this, e.g. a script which automatically creates a set of Coq Universe tags matching Coq Platform releases and picks or a script which sets up some local files which instruct dune. My dune knowledge is very limited yet, so I can't tell what makes sense.
if I get it right, you should be able to for pkg in COQ_PACKAGES; do opam source $pkg
and then just run dune build
from the root directory. This would use opam as the pinning mechanism, rather than git submodules, and build a little faster since you would get parallelism/scheduling at the file level rather than package level.
Still this only works if the build instructions in the opam packages are patch free and just call dune. If they do more stuff it won't work.
I think opam would do the patching when opam source is called ...
opam source
should supply the sources as opam builds them.
This seems to require local patches, but they can be upstreamed once you can rely on composition support in dune, I presume?
@Paolo Giarrusso : can you please elaborate on that? I can't say I understand what you mean.
All packaging systems I know of let the package maintainer add patches. Patches are also typically sent upstream, but you can't possibly hope that ref TAG+1 is the released tag you want to package plus your packaging patch. Of course you could craft a git commit with what you want and have the git submodule point to that...
@Michael Soegtrop this takes a moment to explan, but "composition support in dune" is https://github.com/ocaml/dune/pull/5784 so maybe it'll be merged soon?
With that, if A depends on B, A's dune config can mention B, and dune will do the right thing both when B is installed and when A and B are "built together"; Coq Universe relies on the latter, AFAICT.
Today, instead, dependencies only work when A and B are "built together", so upstream might omit dependencies on libraries that are already installed.
Ah OK, I was assuming that this works already.
But I don't quite understand how this is connected to the patching question.
Ah, if upstream omits some dependencies, Coq Universe would have to patch them in
I think Michael and myself are talking about the patches opam packages can apply on top of the upstream tarballs
These are the ones one can upstream, etc, as I was talking about
It sometimes came handy in the past to patch upstream sources so that they work fine in the platform
Yes, it e.g. happened that we did a minimal patch up of the latest release to make it compatible with the latest Coq in case the package release was not yet ready - typically a cherry pick from dev on top of the latest release.
Anyway, @Michael Soegtrop I think you are right, opam source would apply the patches, so dune would compile the patched code.
ah. I only meant that for now Coq universe seems likely to require extra patches to dune configs, anyway. But as you say, that'll work.
Yes, that is another story. On mathcomp it seems to require a patch to dune itself, so I guess we are not ready for prime time
OK, so this looks like something I can try - it looks simple enough ...
mathcomp has no dune file, but there is a PR adding it, but apparently it also requires a branch of dune to run CI... so all that part of the platform is not going to just work I'm afraid.
Well, I am not discussing next week here. My nearest time frame would be Coq Platform 2022.09. Maybe it can be supported for a good fraction of the packages by then. If so I will include it.
But I will try how far I get and report back here.
My main question here is what does the platform gain. I mean, if you machine is beefy, parallelism at the granularity of opam packages seems a lot already.
This huge composed build gives you parallelism at the file level... it is going to change that much?
Reducing the number of "universe" builds has value in itself for users
What do you mean?
at least IMHO, because of extra simplicity
but more concretely, opam build -j N
isn't great, since it will run between 1 and N^2 processes at any time — while dune can run a constant N
@Enrico Tassi : The big issue with Opam parallelism is that most opam files use the same parameter for opam project parallelism and make level parallelism, so that the effect of the parameter is square. With the memory requirements of Coq this can kill most machines.
I see. Does dune take memory pressure into account (or just CPU count)?
For sure not going above N coq processes on N core is going to help
I think on most machines memory matches the CPU count well (assuming 0.5...1GB per Coq thread). But with opam you can temporarily end up with 40 or so Coq threads, and that is a problem.
Btw.: I think you get best speed with a few more threads than you have cores - this way CPU can be used if a thread is waiting for disk IO.
Another thing is that in an opam Coq Platform build always 2 projects remain in the end with little parallelism used (VST and UniMath). I can imagine that dune can handle this better - to be seen.
Building the platform in batch with dune doesn’t sound that appealing, the advantages are evident when you want an incremental build. For example, you make a change to mathcomp and want to see how it affects the rest of the platform. That’s when dune will become far better than opam
There’s probably an order of magnitude of a speed up here.
Then there’s the cache, which is another order of magnitude. If the coq rules are ready for it of course
I've been using the dune cache with coq for years :-)
hyperbolically speaking 99% of the time the dune cache is flawless, 0.5% I've screwed up and added overlapping logical paths (but they don't always fail), and the rest is covered by bug reports
We're starting to push the cache more aggressively on users. For example, we'd like to cache all sandboxed rules by default in the next version of dune.
Rudi Grinberg said:
Building the platform in batch with dune doesn’t sound that appealing
Why? I described the substantial issues with parallel opam builds above. I would think that dune can solve these in a satisfactory way.
Sure, but I would expect a speed up like 2-5x from doing this. Not all that impressive
It seems we agree there are multiple advantages of a dune build :-)
I think even 2x is worthwhile, and higher speedups are even better reasons
Btw, from my benchmarks the overhead from opam builds is largely all the additional copying that needs to be done as part of the install step. Not necessarily the bad threads/process usage
I hear filesystems on windows are also particularly slow
For my mac, the instructions recommend -j16
and produce load averages around 200-300 — while I think that number is inflated, it can’t be the optimal use of a computer, and I would not try on lesser machines
Yeah, those things are of course not independent. All the waiting for the copying step prevents opam from running dune build as fast it should
Either way, I am not sure how helpful this subthread is :-), if we agree the various advantages are worthwhile
Both incremental and cached builds would be very useful, and we would get the batch build for free
Indeed. We need to think about how to get the next order of magnitude speed up next :)
Rudi Grinberg said:
Sure, but I would expect a speed up like 2-5x from doing this. Not all that impressive
Well a speed up of 2x in CI would help me quite a bit in preparing Coq Platform releases ... Currently a CI run takes 6 hours or so - assuming I get Mac and Windows runners immediately. This makes proper continuous integration procedures (wait for CI before merge) infeasible in the hot phase of a release.
I don't think the workers we get from GH actions have many CPUs, unfortunately.
I totally agree that a 2x speedup would be awesome (but I really doubt that we can get that)
That suggests it’s worth using dune caching in CI?
That is, asking CI to cache ~/.cache/dune
should already have nontrivial benefits. Such caching will not do the right thing if you have parallel jobs, but it should still be vastly better than nothing
I know that on Travis one can easily keep such folders between CI runs, but I wonder how one would do this on GitHub.
google says: https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows
- name: Cache Gradle packages
uses: actions/cache@v3
with:
path: |
~/.gradle/caches
~/.gradle/wrapper
Preserving a dune cache would help even if you build with opam I guess (that calls dune behind the scenes on each package, assuming the package uses dune).
You could even preserve the opam switch with what was successfully installed before (and pass the platform script the option to not wipe it). That is orthogonal to dune cache.
In particular, it’s very useful to share the downloaded sources between opam runs. I think the opam GitHub action already does that
I don’t remember the details, but opam’s sandboxing interferes with dune’s caching. I’m sure there’s workarounds though
https://github.com/coq/coq/issues/15138 is (one of) the issues
OCaml github action already supports Dune cache
tho if you bump Coq it is not very useful
it helps great when you bump some more terminal lib
(BTW, we’re also experimenting with using rsync to “merge” caches, as I think Emilio and Rudi had suggested ages ago; that’s probably harder to do on cloud CI)
Last updated: Sep 15 2024 at 12:01 UTC