Stream: Coq Universe

Topic: Should we have a Coq Universe based Coq Platform build?


view this post on Zulip Michael Soegtrop (Jun 07 2022 at 07:15):

I wonder if we should have a Coq Universe based Coq Platform build - that is that users can choose if they want to use opam or dune as top level build manager. Although the two projects seem way parallel at first glance, this might not be so. I see Coq Platform mostly as a release management and social engineering project and Coq Universe mostly as a build technology project. But there is of course overlap, like providing a cross project CI.

view this post on Zulip Emilio Jesús Gallego Arias (Jun 07 2022 at 07:26):

@Michael Soegtrop it shouldn't matter much IMHO, as at some point many opam packages will use Dune. Coq universe is more important where you need incremental builds, then, it is just much faster (and cached) than the other alternatives. For example CI, working on several projects at the same time, etc...

view this post on Zulip Michael Soegtrop (Jun 07 2022 at 07:54):

I definitely see the advantages of dune, and I expect that e.g. plugin developers will switch to use dune as top level build manager. What I want to avoid is that developers and users and/or release managers drift apart by using different build systems.
As far as I understand Coq Universe is - as is now - for building master branch code. It might be of advantage if developers could easily build on master and the latest / upcoming / older Coq Platform release with the same level of convenience for developers.

view this post on Zulip Emilio Jesús Gallego Arias (Jun 07 2022 at 07:57):

@Michael Soegtrop I started Coq universe building with 8.15 , in principle there is no problem having versioned branchs and we expect to do so

view this post on Zulip Emilio Jesús Gallego Arias (Jun 07 2022 at 07:57):

We are just waiting to have our submodules tooling a bit more advanced so we can automate that

view this post on Zulip Emilio Jesús Gallego Arias (Jun 07 2022 at 07:58):

But indeed if anyone wants to do a 8.15/8.16 branch for Coq universe absolutely go ahead and submit a pull request

view this post on Zulip Emilio Jesús Gallego Arias (Jun 07 2022 at 07:59):

what to add to the Coq universe build is mainly controlled by the top level (dirs ...) stanza in the dune file

view this post on Zulip Théo Zimmermann (Jun 07 2022 at 08:24):

Indeed, there should be no drifting apart because opam packages will gradually all start relying on Dune internally (so there should eventually be virtually just one build system). Using coq-universe directly instead of through opam can bring faster builds though, so this may be a good thing to (eventually) use it for building the binary installers. As for providing a coq-universe option for users of the interactive scripts, if you do not provide them an opam root, then their installation is not customizable, so it's not really useful to go through the interactive scripts.

view this post on Zulip Michael Soegtrop (Jun 07 2022 at 08:33):

What I am after is an easy way to build exactly the same thing as Coq Platform XYZ with dune as top level build manager.

view this post on Zulip Rudi Grinberg (Jun 07 2022 at 17:25):

Michael Soegtrop said:

What I am after is an easy way to build exactly the same thing as Coq Platform XYZ with dune as top level build manager.

Would it be possible to just point coq-platform at a particular commit hash of coq-universe as the source for coq packages and coq itself? It seems like coq-platform builds a lot of stuff using adhoc scripts, and it wouldn't be very advantageous to port those to dune.

view this post on Zulip Théo Zimmermann (Jun 07 2022 at 17:27):

No, Coq Platform doesn't build that much with ad hoc scripts. It just wraps around opam.

view this post on Zulip Théo Zimmermann (Jun 07 2022 at 17:27):

And the packages in the Coq Platform are curated. So it wouldn't be appropriate to point to a set of packages that can evolve arbitrarily with time.

view this post on Zulip Rudi Grinberg (Jun 07 2022 at 17:28):

A commit hash in coq-universe would fully specify every single package in it I would think

view this post on Zulip Théo Zimmermann (Jun 07 2022 at 17:29):

Yes. But I am talking about the package set here, not the package versions.

view this post on Zulip Théo Zimmermann (Jun 07 2022 at 17:30):

To be in the Platform, maintainers have to agree to abide by a charter, so not all packages that build at a certain point in time should be included.

view this post on Zulip Rudi Grinberg (Jun 07 2022 at 17:30):

I see. Similar to stackage.

view this post on Zulip Théo Zimmermann (Jun 07 2022 at 17:30):

Yes.

view this post on Zulip Théo Zimmermann (Jun 07 2022 at 17:31):

Furthermore, depending on the way the versions in the coq-universe are updated, it might be never the case that a certain coq-universe commit contains all the released versions that were blessed by package maintainers.

view this post on Zulip Rudi Grinberg (Jun 07 2022 at 17:31):

In any case, what I meant that it would be possible for coq platform to use dune instead of opam to do its build work if the set of curated packages is only ported to dune.

view this post on Zulip Enrico Tassi (Jun 08 2022 at 05:02):

The platform gives you a switch, you can the add extra stuff if you like. OPAM gives you search, downloads, and build. It also lets you apply patches, list the installed files to build binary installers.

The platform is a distribution. Opam is like apt. Dune is like make. IMO this thread is comparing oranges with apples.

view this post on Zulip Michael Soegtrop (Jun 08 2022 at 07:33):

@Rudi Grinberg Coq Platform is mostly a web of mutual benefit social agreements to maintain a largish collection os software which results in a collection of packages and package versions which are tested to work with each other and this way define a standard platform for 6 months or so. Really > 90% of the work is organizing a release.

How one actually builds this very specific set of software is not something Coq Platform is religious about - we try to support various methods for people with different backgrounds / objectives.

Coq Universe is a new and exciting build/installation method that would make sense to support.
The only requirement besides that this build method is of advantage for a non negligible fraction of the Coq users (which I believe is the case) would be that one can control the collection of packages and package versions (that is git tags) used.

In any case, what I meant that it would be possible for coq platform to use dune instead of opam to do its build work if the set of curated packages is only ported to dune.

Yes that's also what I meant: let's start with the subset which already uses dune, let's use the mentioned "social web" to get more things into dune and let's write the - likely not very complicated - scripts which explain to Coq Universe what to build exactly.

There could be many ways of doing this, e.g. a script which automatically creates a set of Coq Universe tags matching Coq Platform releases and picks or a script which sets up some local files which instruct dune. My dune knowledge is very limited yet, so I can't tell what makes sense.

view this post on Zulip Enrico Tassi (Jun 08 2022 at 07:57):

if I get it right, you should be able to for pkg in COQ_PACKAGES; do opam source $pkg and then just run dune build from the root directory. This would use opam as the pinning mechanism, rather than git submodules, and build a little faster since you would get parallelism/scheduling at the file level rather than package level.

Still this only works if the build instructions in the opam packages are patch free and just call dune. If they do more stuff it won't work.

view this post on Zulip Michael Soegtrop (Jun 08 2022 at 08:16):

I think opam would do the patching when opam source is called ...

view this post on Zulip Michael Soegtrop (Jun 08 2022 at 08:17):

opam source should supply the sources as opam builds them.

view this post on Zulip Paolo Giarrusso (Jun 08 2022 at 08:55):

This seems to require local patches, but they can be upstreamed once you can rely on composition support in dune, I presume?

view this post on Zulip Michael Soegtrop (Jun 08 2022 at 08:56):

@Paolo Giarrusso : can you please elaborate on that? I can't say I understand what you mean.

view this post on Zulip Enrico Tassi (Jun 08 2022 at 09:00):

All packaging systems I know of let the package maintainer add patches. Patches are also typically sent upstream, but you can't possibly hope that ref TAG+1 is the released tag you want to package plus your packaging patch. Of course you could craft a git commit with what you want and have the git submodule point to that...

view this post on Zulip Paolo Giarrusso (Jun 08 2022 at 09:03):

@Michael Soegtrop this takes a moment to explan, but "composition support in dune" is https://github.com/ocaml/dune/pull/5784 so maybe it'll be merged soon?

view this post on Zulip Paolo Giarrusso (Jun 08 2022 at 09:05):

With that, if A depends on B, A's dune config can mention B, and dune will do the right thing both when B is installed and when A and B are "built together"; Coq Universe relies on the latter, AFAICT.

view this post on Zulip Paolo Giarrusso (Jun 08 2022 at 09:07):

Today, instead, dependencies only work when A and B are "built together", so upstream might omit dependencies on libraries that are already installed.

view this post on Zulip Michael Soegtrop (Jun 08 2022 at 09:07):

Ah OK, I was assuming that this works already.

view this post on Zulip Michael Soegtrop (Jun 08 2022 at 09:08):

But I don't quite understand how this is connected to the patching question.

view this post on Zulip Paolo Giarrusso (Jun 08 2022 at 09:08):

Ah, if upstream omits some dependencies, Coq Universe would have to patch them in

view this post on Zulip Enrico Tassi (Jun 08 2022 at 09:09):

I think Michael and myself are talking about the patches opam packages can apply on top of the upstream tarballs

view this post on Zulip Enrico Tassi (Jun 08 2022 at 09:09):

These are the ones one can upstream, etc, as I was talking about

view this post on Zulip Enrico Tassi (Jun 08 2022 at 09:10):

It sometimes came handy in the past to patch upstream sources so that they work fine in the platform

view this post on Zulip Michael Soegtrop (Jun 08 2022 at 09:11):

Yes, it e.g. happened that we did a minimal patch up of the latest release to make it compatible with the latest Coq in case the package release was not yet ready - typically a cherry pick from dev on top of the latest release.

view this post on Zulip Enrico Tassi (Jun 08 2022 at 09:12):

Anyway, @Michael Soegtrop I think you are right, opam source would apply the patches, so dune would compile the patched code.

view this post on Zulip Paolo Giarrusso (Jun 08 2022 at 09:12):

ah. I only meant that for now Coq universe seems likely to require extra patches to dune configs, anyway. But as you say, that'll work.

view this post on Zulip Enrico Tassi (Jun 08 2022 at 09:13):

Yes, that is another story. On mathcomp it seems to require a patch to dune itself, so I guess we are not ready for prime time

view this post on Zulip Michael Soegtrop (Jun 08 2022 at 09:14):

OK, so this looks like something I can try - it looks simple enough ...

view this post on Zulip Enrico Tassi (Jun 08 2022 at 09:15):

mathcomp has no dune file, but there is a PR adding it, but apparently it also requires a branch of dune to run CI... so all that part of the platform is not going to just work I'm afraid.

view this post on Zulip Michael Soegtrop (Jun 08 2022 at 09:16):

Well, I am not discussing next week here. My nearest time frame would be Coq Platform 2022.09. Maybe it can be supported for a good fraction of the packages by then. If so I will include it.

view this post on Zulip Michael Soegtrop (Jun 08 2022 at 09:17):

But I will try how far I get and report back here.

view this post on Zulip Enrico Tassi (Jun 08 2022 at 09:17):

My main question here is what does the platform gain. I mean, if you machine is beefy, parallelism at the granularity of opam packages seems a lot already.

view this post on Zulip Enrico Tassi (Jun 08 2022 at 09:18):

This huge composed build gives you parallelism at the file level... it is going to change that much?

view this post on Zulip Paolo Giarrusso (Jun 08 2022 at 09:19):

Reducing the number of "universe" builds has value in itself for users

view this post on Zulip Enrico Tassi (Jun 08 2022 at 09:19):

What do you mean?

view this post on Zulip Paolo Giarrusso (Jun 08 2022 at 09:20):

at least IMHO, because of extra simplicity

view this post on Zulip Paolo Giarrusso (Jun 08 2022 at 09:22):

but more concretely, opam build -j N isn't great, since it will run between 1 and N^2 processes at any time — while dune can run a constant N

view this post on Zulip Michael Soegtrop (Jun 08 2022 at 09:22):

@Enrico Tassi : The big issue with Opam parallelism is that most opam files use the same parameter for opam project parallelism and make level parallelism, so that the effect of the parameter is square. With the memory requirements of Coq this can kill most machines.

view this post on Zulip Enrico Tassi (Jun 08 2022 at 09:23):

I see. Does dune take memory pressure into account (or just CPU count)?

view this post on Zulip Enrico Tassi (Jun 08 2022 at 09:24):

For sure not going above N coq processes on N core is going to help

view this post on Zulip Michael Soegtrop (Jun 08 2022 at 09:24):

I think on most machines memory matches the CPU count well (assuming 0.5...1GB per Coq thread). But with opam you can temporarily end up with 40 or so Coq threads, and that is a problem.

view this post on Zulip Michael Soegtrop (Jun 08 2022 at 09:27):

Btw.: I think you get best speed with a few more threads than you have cores - this way CPU can be used if a thread is waiting for disk IO.

view this post on Zulip Michael Soegtrop (Jun 08 2022 at 09:29):

Another thing is that in an opam Coq Platform build always 2 projects remain in the end with little parallelism used (VST and UniMath). I can imagine that dune can handle this better - to be seen.

view this post on Zulip Rudi Grinberg (Jun 08 2022 at 15:44):

Building the platform in batch with dune doesn’t sound that appealing, the advantages are evident when you want an incremental build. For example, you make a change to mathcomp and want to see how it affects the rest of the platform. That’s when dune will become far better than opam

view this post on Zulip Rudi Grinberg (Jun 08 2022 at 15:45):

There’s probably an order of magnitude of a speed up here.

view this post on Zulip Rudi Grinberg (Jun 08 2022 at 15:46):

Then there’s the cache, which is another order of magnitude. If the coq rules are ready for it of course

view this post on Zulip Paolo Giarrusso (Jun 08 2022 at 16:44):

I've been using the dune cache with coq for years :-)

view this post on Zulip Paolo Giarrusso (Jun 08 2022 at 16:46):

hyperbolically speaking 99% of the time the dune cache is flawless, 0.5% I've screwed up and added overlapping logical paths (but they don't always fail), and the rest is covered by bug reports

view this post on Zulip Rudi Grinberg (Jun 08 2022 at 17:05):

We're starting to push the cache more aggressively on users. For example, we'd like to cache all sandboxed rules by default in the next version of dune.

view this post on Zulip Michael Soegtrop (Jun 08 2022 at 19:24):

Rudi Grinberg said:

Building the platform in batch with dune doesn’t sound that appealing

Why? I described the substantial issues with parallel opam builds above. I would think that dune can solve these in a satisfactory way.

view this post on Zulip Rudi Grinberg (Jun 08 2022 at 19:29):

Sure, but I would expect a speed up like 2-5x from doing this. Not all that impressive

view this post on Zulip Paolo Giarrusso (Jun 08 2022 at 19:29):

It seems we agree there are multiple advantages of a dune build :-)

view this post on Zulip Paolo Giarrusso (Jun 08 2022 at 19:30):

I think even 2x is worthwhile, and higher speedups are even better reasons

view this post on Zulip Rudi Grinberg (Jun 08 2022 at 19:30):

Btw, from my benchmarks the overhead from opam builds is largely all the additional copying that needs to be done as part of the install step. Not necessarily the bad threads/process usage

view this post on Zulip Rudi Grinberg (Jun 08 2022 at 19:31):

I hear filesystems on windows are also particularly slow

view this post on Zulip Paolo Giarrusso (Jun 08 2022 at 19:33):

For my mac, the instructions recommend -j16 and produce load averages around 200-300 — while I think that number is inflated, it can’t be the optimal use of a computer, and I would not try on lesser machines

view this post on Zulip Rudi Grinberg (Jun 08 2022 at 19:35):

Yeah, those things are of course not independent. All the waiting for the copying step prevents opam from running dune build as fast it should

view this post on Zulip Paolo Giarrusso (Jun 08 2022 at 19:35):

Either way, I am not sure how helpful this subthread is :-), if we agree the various advantages are worthwhile

view this post on Zulip Paolo Giarrusso (Jun 08 2022 at 19:36):

Both incremental and cached builds would be very useful, and we would get the batch build for free

view this post on Zulip Rudi Grinberg (Jun 08 2022 at 19:47):

Indeed. We need to think about how to get the next order of magnitude speed up next :)

view this post on Zulip Michael Soegtrop (Jun 09 2022 at 09:04):

Rudi Grinberg said:

Sure, but I would expect a speed up like 2-5x from doing this. Not all that impressive

Well a speed up of 2x in CI would help me quite a bit in preparing Coq Platform releases ... Currently a CI run takes 6 hours or so - assuming I get Mac and Windows runners immediately. This makes proper continuous integration procedures (wait for CI before merge) infeasible in the hot phase of a release.

view this post on Zulip Enrico Tassi (Jun 09 2022 at 09:19):

I don't think the workers we get from GH actions have many CPUs, unfortunately.

view this post on Zulip Enrico Tassi (Jun 09 2022 at 09:20):

I totally agree that a 2x speedup would be awesome (but I really doubt that we can get that)

view this post on Zulip Paolo Giarrusso (Jun 09 2022 at 10:11):

That suggests it’s worth using dune caching in CI?

view this post on Zulip Paolo Giarrusso (Jun 09 2022 at 10:15):

That is, asking CI to cache ~/.cache/dune should already have nontrivial benefits. Such caching will not do the right thing if you have parallel jobs, but it should still be vastly better than nothing

view this post on Zulip Michael Soegtrop (Jun 09 2022 at 10:22):

I know that on Travis one can easily keep such folders between CI runs, but I wonder how one would do this on GitHub.

view this post on Zulip Enrico Tassi (Jun 09 2022 at 11:07):

google says: https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows

view this post on Zulip Enrico Tassi (Jun 09 2022 at 11:07):

- name: Cache Gradle packages
  uses: actions/cache@v3
  with:
    path: |
      ~/.gradle/caches
      ~/.gradle/wrapper

view this post on Zulip Enrico Tassi (Jun 09 2022 at 11:09):

Preserving a dune cache would help even if you build with opam I guess (that calls dune behind the scenes on each package, assuming the package uses dune).

view this post on Zulip Enrico Tassi (Jun 09 2022 at 11:10):

You could even preserve the opam switch with what was successfully installed before (and pass the platform script the option to not wipe it). That is orthogonal to dune cache.

view this post on Zulip Rudi Grinberg (Jun 09 2022 at 14:06):

In particular, it’s very useful to share the downloaded sources between opam runs. I think the opam GitHub action already does that

view this post on Zulip Rudi Grinberg (Jun 09 2022 at 14:08):

I don’t remember the details, but opam’s sandboxing interferes with dune’s caching. I’m sure there’s workarounds though

view this post on Zulip Paolo Giarrusso (Jun 09 2022 at 18:44):

https://github.com/coq/coq/issues/15138 is (one of) the issues

view this post on Zulip Emilio Jesús Gallego Arias (Jun 09 2022 at 19:29):

OCaml github action already supports Dune cache

view this post on Zulip Emilio Jesús Gallego Arias (Jun 09 2022 at 19:29):

tho if you bump Coq it is not very useful

view this post on Zulip Emilio Jesús Gallego Arias (Jun 09 2022 at 19:29):

it helps great when you bump some more terminal lib

view this post on Zulip Paolo Giarrusso (Jun 09 2022 at 19:33):

(BTW, we’re also experimenting with using rsync to “merge” caches, as I think Emilio and Rudi had suggested ages ago; that’s probably harder to do on cloud CI)


Last updated: Feb 06 2023 at 05:03 UTC