recent changes to the native compiler? · Coq devs & plugin devs

fiat-parsers stack overflows on compiling files like src/Parsers/GenericRecognizerOptimized because the ocaml native compiler stack overflows when generating the native files for the .vo file. Was there a recent change to Coq master (within the past month or so) which would explain this change? (For example, does it force early native compilation by default now where it used to not?)

Gaëtan Gilbert (Dec 07 2020 at 22:34):

Jason Gross (Dec 08 2020 at 01:08):

Christian Doczkal (Dec 08 2020 at 18:48):

- COQC theories/open_confluence.v
- Fatal error: exception Stack overflow
- Error: Native compiler exited with status 2

However, I had successful builds less than a week ago and this PR was merged 18 days ago.

Théo Zimmermann (Dec 09 2020 at 09:50):

If native-compute doesn't work for your project, you should disable it explicitly.

Théo Zimmermann (Dec 09 2020 at 09:51):

Théo Zimmermann (Dec 09 2020 at 09:52):

Basically, this should amount to adding an explicit -arg -native-compiler -arg no to your _CoqProject file.

Karl Palmskog (Dec 09 2020 at 11:20):

this seems like a doubtful way (overall approach) to handle incompatibility with native compilation. If I just have a regular Coq library, why would I even risk having users complain about features I can't control? It seems like all library devs will want to add this preemptively.

Karl Palmskog (Dec 09 2020 at 11:21):

my view is something like: everyone who installs coq-native is completely on their own, and can't expect any support from any library devs

Karl Palmskog (Dec 09 2020 at 11:22):

in absence of a message to this effect when installing coq-native, it seems prudent to just disallow native compilation by all library maintainers

Guillaume Melquiond (Dec 09 2020 at 12:05):

I am note sure to understand which is your point of view. Either we consider users who install coq-native are on their own and then library maintainers do not have anything to do a priori (and not even a posteriori). Or we consider that all the libraries need to support this use case (is there a CI job to that effect?), possibly by disabling native compilation.

Guillaume Melquiond (Dec 09 2020 at 12:06):

Anyway, coq-native is not installed by default; it requires an explicit action from the user. So, hopefully they know what to expect. (Am I too optimistic? Are there users who install packages they do not understand?)

Pierre Roux (Dec 09 2020 at 12:27):

The confusion seems to come from the fact that coq-native has been added to the docker images used by many people for CI.

Pierre Roux (Dec 09 2020 at 12:29):

I guess, this is a good opportunity to discover that some libraries do not handle native-compute. This information will be needed anyway on the opam repo with the future split approach.

Karl Palmskog (Dec 09 2020 at 12:37):

Here is how it works: user X installs coq and coq-native, then tries to install a bunch of libraries, then opens lots of issues complaining about failing native compilation. This is a net negative for library maintainers, who have to deal with the complaints. Why would any library maintainer who is not using some fancy reflection tactic ever want their library to be compiled with native? The announcement and CEP give zero evidence that the general Coq library gets any benefits whatsoever.

Karl Palmskog (Dec 09 2020 at 12:39):

personally, I'm going to add -arg -native-compiler -arg no to _CoqProject in all my projects, and also advise everyone else to do the same.

Pierre Roux (Dec 09 2020 at 12:43):

So there is no point in disabling it if this works (most cases in the CI of Coq).

Karl Palmskog (Dec 09 2020 at 12:44):

the point is exactly as I described: we don't want users to report issues or bother maintainers when native-compute breaks everything, which it can be expected to do a lot per above

Pierre Roux (Dec 09 2020 at 12:44):

By comparison to OCaml, you don't want to forbid compilation with ocamlopt except if you know there is a specific problem (even if yourself are only using ocamlc).

Karl Palmskog (Dec 09 2020 at 12:45):

the difference here is that ocamlopt at least seems to work on a majority of projects, and the benefits are known (speedups have been measured and are substantial)

Pierre Roux (Dec 09 2020 at 12:45):

This is also the case, although I agree, the majority is smaller (rather a large majority than almost all).

Karl Palmskog (Dec 09 2020 at 13:10):

Pierre-Marie Pédrot (Dec 09 2020 at 13:12):

Pierre-Marie Pédrot (Dec 09 2020 at 13:13):

Pierre Roux (Dec 09 2020 at 13:13):

Pierre-Marie Pédrot (Dec 09 2020 at 13:13):

Yes, but you pay for it when using native_compute anyways since it incurs a small compilation phase.

Pierre-Marie Pédrot (Dec 09 2020 at 13:14):

Karl Palmskog (Dec 09 2020 at 13:16):

then, the "selling proposition" of allowing native-compute for an average Coq library seems terrible as of right now

Pierre-Marie Pédrot (Dec 09 2020 at 13:20):

I am convinced that with the current infrastructure we're bound to displease everybody.

Pierre-Marie Pédrot (Dec 09 2020 at 13:21):

That's why I think that we should have separate native compilation as soon as possible, so that if you're a user of native compute, things should be transparent for you without incurring a cost on non-users.

Karl Palmskog (Dec 09 2020 at 13:23):

my view: allowing native-compute to happen for a Coq library should be completely opt-in by the maintainer regardless of the version of Coq and regardless of how Coq was installed (possibly with some way for expert users to override)

Théo Zimmermann (Dec 09 2020 at 13:30):

Our point was precisely that by including coq-native in Docker images, maintainers would discover that their projects are not compatible with native-compute and could disable it (which will also avoid complaints from users if they try the same later on).

Théo Zimmermann (Dec 09 2020 at 13:31):

I don't see the point of disabling native-compute if your CI tells you that it works with it (which it tells you if you are testing with 8.13 or dev right now and it doesn't fail).

Théo Zimmermann (Dec 09 2020 at 13:31):

Karl Palmskog (Dec 09 2020 at 13:32):

Théo Zimmermann (Dec 09 2020 at 13:35):

I don't understand what is so bad in having the default CI setting automatically check compatibility with a standard Coq feature until you explicitly disable it.

Théo Zimmermann (Dec 09 2020 at 13:36):

I mean the only complaint that I do understand (but it hasn't really been voiced) is that it is not clear enough how to deal with these new failures right now.

Karl Palmskog (Dec 09 2020 at 13:36):

per Pierre-Marie above, it has substantial overhead, and nobody has shown it has any benefits for the average project

Karl Palmskog (Dec 09 2020 at 13:36):

Karl Palmskog (Dec 09 2020 at 13:38):

I wonder how OCaml project maintainers would react if -flambda was suddenly the default in all their compilers

Pierre-Marie Pédrot (Dec 09 2020 at 13:42):

I don't know what's the plan for multicore but I am slightly afraid it'll indeed be shoved down our OCaml-user throat.

Emilio Jesús Gallego Arias (Dec 09 2020 at 13:58):

The downsides of the current approach were discussed and indeed they do imply that docker images have to be provided both with native and non native support enabled by default.

As a quick fix tho I would have the default docker image to increase stack size using ulimit , that should greatly reduce the amount of StackOverflows users find.

Emilio Jesús Gallego Arias (Dec 09 2020 at 13:59):

Most compatible would indeed to have the default docker image setup with -native no and provide a coq+native image for those wanting it.

Guillaume Melquiond (Dec 09 2020 at 14:03):

There is no relation. For example, if you forbid native compilation for Coquelicot (which does not use any native-worth reflection), then you forbid it for CoqInterval (for which it is certainly worth it).

Guillaume Melquiond (Dec 09 2020 at 14:06):

Keep in mind that a user who wants to use native_compute has a use case for it. For example, I am willing to pay the compilation overhead (a few seconds at most) for the interval tactic, because I am then running it on some goals for several minutes, if not longer. So, the x6 speedup is certainly worth it.

Christian Doczkal (Dec 09 2020 at 14:14):

This is precisely the issue for me here. I get a "random" stack overflow from one day to the next when trying to build my library using the coq images. Without any further information, I tried to come here to see what's going wrong. Now, even after reading the post on discourse I still have no idea. I'm seriously tempted to just to what @Karl Palmskog suggests and disable it.

But how does such a setting affect users of ones library that have coq-native installed? Will they still be able to use the library, even though it has not been built for native comutation?

Théo Zimmermann (Dec 09 2020 at 14:15):

Théo Zimmermann (Dec 09 2020 at 14:16):

Yes, users will still be able to use the library even if they have coq-native installed.

Théo Zimmermann (Dec 09 2020 at 14:17):

By disabling you prevent a compilation failure while they try to install the library via opam BTW (the whole point of making CI test this by default).

Guillaume Melquiond (Dec 09 2020 at 14:18):

However, it forces libraries that depend on yours to also disable native compilation, and so on transitively.

Théo Zimmermann (Dec 09 2020 at 14:21):

Karl Palmskog (Dec 09 2020 at 14:21):

so the whole Coq ecosystem is supposed to run into these stack overflows, then figure out for themselves that they need to add the disable command into _CoqProject?

Karl Palmskog (Dec 09 2020 at 14:23):

this is to me terrible distribution of knowledge, most people (nearly everyone) have no idea about the intricate tradeoffs discussed by Guillaume, and are not in a position to make choices for themselves on coq-native

Enrico Tassi (Dec 09 2020 at 14:24):

Karl Palmskog (Dec 09 2020 at 14:25):

Enrico Tassi (Dec 09 2020 at 14:25):

Christian Doczkal (Dec 09 2020 at 14:26):

Yes, that would be my expectation as well. Something that breaks stuff for many people out there with gains for few should not be the default IMHO.

Guillaume Melquiond (Dec 09 2020 at 14:28):

Can we have an actual estimation of what "many people" means here? For now, I have seen only three complaints, but maybe I missed some.

Karl Palmskog (Dec 09 2020 at 14:29):

maybe someone can then start to go through the released coq opam archive and see how coq-native works out?

Christian Doczkal (Dec 09 2020 at 14:30):

Indeed, currently this only hits people who regularly build against the coqorg/coq:dev image.

Karl Palmskog (Dec 09 2020 at 14:31):

for a new default not known to many people outside the dev team, it seems strange to draw conclusions so early from existence/non-existence of complaints

Karl Palmskog (Dec 09 2020 at 14:32):

I think a new proposed default should have the burden of showing that the "average project" is not hindered/bothered by it

Guillaume Melquiond (Dec 09 2020 at 14:33):

Yes, so let us stop discussing about it until we can actually draw some conclusion. This is just pointless.

Emilio Jesús Gallego Arias (Dec 09 2020 at 14:41):

No this is not pointless, some of us have been working having native supported more widely for a while [including fixing lambda which provides another 20% in some cases] and we had a good understanding; you don't need to have a lot of adivinatory powers as just inspecting all the breakage that this default caused in Coq's CI in enough to predict the wider problems for a large set of libraries if the default was set to native=yes. That seems the wrong choice for Docker.

In fact, all these problems were predicted in the discussion, the choice was to ignore them, but that doesn't mean they don't exists. Karl's point is much valid and this new setup is a disaster for users, and even a lack of respect given the time they will have to spend to figure things out. Many of them won't even know what "Coq native" is.

Going back to Docker, I'd try my ulimit tweak as first step, if that still creates problems I'm afraid that Docker setup should be similar to OPAM, default image should not force native compilation. In fact, there is very little reason with current coq_makefile support for native to enable native when working on a library, except in a handful of very specific cases; for the rest you just incur in a huge pointless overhead. Dune's native support will already disable native anyways for development mode unless you are producing an package for release.

Théo Zimmermann (Dec 09 2020 at 14:46):

But don't you want CI (we're talking about CI here) to test what you will ship in your release before you do actually release?

Karl Palmskog (Dec 09 2020 at 14:46):

Guillaume Melquiond (Dec 09 2020 at 14:54):

I do not know which discussion you are referring. In the CEP for native compilation, there have been 122 comments. Which of those comments are predicting anything about this Docker issue? As a matter of fact, the word Docker does not even appear in the whole discussion until the very last comment. Nobody predicted there would be an issue with the Docker image. And if you did, you forgot to voice it out.

Guillaume Melquiond (Dec 09 2020 at 14:59):

In fact, Erik even asked if someone would object to changing the Docker image. Why did you not object since you knew it would be such a mess?

Christian Doczkal (Dec 09 2020 at 15:04):

My reason for CI testing against coq.dev that I want to see compatibility problems one-by-one as they are introduced by the development process and (hopefully) find backwards compatible fixes for them one at a time. That this code will be packaged and shipped is a an afterthought at this point.

Erik Martin-Dorel (Dec 09 2020 at 15:15):

Note that the new behavior (the fact that the -native-compiler option is now propagated to coqc) only deals with Coq 8.13+beta1 and Coq dev.

As mentioned by @Guillaume Melquiond, the crux of this issue is that even if a library is fully orthogonal with native_compute, it still needs to be precompiled for native as soon as it appears as a dependency of another library that wants to benefit from native_compute.

Regarding @Enrico Tassi's suggestion:
On the one hand, distributing twice more images for each coq(>=8.13) version does not seem to scale well (since this week, there are now 4 different ocaml flavors for each coq version, each in its own image, as it has been discussed in https://github.com/coq-community/docker-coq/issues/12 and as I need to announce it today-or-tomorrow BTW).
On the other hand, assuming we did that way (having 8 images per coq version), it wouldn't be very satisfactory as a solution, because it would just amount to say: if a given library has an issue with native-compiler, the maintainer just ignores the issue rather than "fixing" it by disabling the option; so in the very end, if users do have the coq-native package installed in their switch, the issue will still be here… and will still need to be fixed "upstream".

Erik Martin-Dorel (Dec 09 2020 at 15:16):

Erik Martin-Dorel (Dec 09 2020 at 15:17):

Sorry I'm busy with grading assignments this afternoon, so if some of you have the time to post below the Discourse post (URL updated) a summary or so regarding advice for maintainers to tweak the -native-compiler flag for native-incompatible libraries, feel free to go ahead!

Erik Martin-Dorel (Dec 09 2020 at 15:24):

Erik Martin-Dorel (Dec 09 2020 at 15:30):

(and there's also @Emilio Jesús Gallego Arias's suggestion regarding ulimit that could be explored)

Jason Gross (Dec 09 2020 at 16:21):

Jason Gross (Dec 09 2020 at 16:23):

Regarding the benefits of native_compute, fiat-crypto has one example where the input program is small, the output is small, but there's a lot of computation being done, and here the native compiler shines giving at least 2x speedup over the VM IIRC. In all other cases I've encountered it's worse than the VM due to compilation or readback overhead.

Christian Doczkal (Dec 09 2020 at 17:15):

Assuming I actually want to learn about the native-compute feature, i.e. what it can and can't (yet) do. Where would I start? Also, my impression from this discussion is that while coq-native is to be "unleashed" onto everyone per default, there seems to be no reasonable expectation for it to actually work even on projects that do not mention native anywhere. (Nobody has responded that hitting a native-compute stack overflow in an a "pure .v" development is a problem that should be fixed...)

Théo Zimmermann (Dec 09 2020 at 17:16):

Théo Zimmermann (Dec 09 2020 at 17:17):

It was expected that these problems would happen, just as it is expected that coqchk will run into issues on some projects. This doesn't mean issues cannot be open and problems cannot be fixed.

Christian Doczkal (Dec 09 2020 at 17:26):

The question is, should I expect to be able to compile my "pure .v" development without tiptoeing around deficiencies of native-compute and without locking all (potential) users out of using native-compute before 8.13 is considered release ready. (Presumably, nobody would release a new version of gcc wich in the default configuration stack-overflows on 1% of all projects using it).

Pierre Roux (Dec 09 2020 at 18:28):

In both cases, no cmxs will be produced. The only difference is when your code contains native_compute calls, in the first case (no), coq will fallback to vm_compute, in the second case (ondemand), it will precompute the missing cmxs on the fly.

Pierre Roux (Dec 09 2020 at 18:36):

For what we know now : in Coq's CI, about one third of the tested projects required explicit disabling of native (so two third compile with native). It is likely that projects tested in the CI are larger than average so the figure in the wild could be much smaller.

Christian Doczkal (Dec 09 2020 at 20:18):

Does that mean that when I set -native-compiler ondemand and people want to use native_compute on things from those parts of the library that do work with the the native compiler, they can still do so?

Pierre Roux (Dec 09 2020 at 20:23):

Pierre Roux (Dec 09 2020 at 20:25):

Erik Martin-Dorel (Dec 10 2020 at 00:35):

FYI, the difference between coqc -native-compiler no and coqc -native-compiler ondemand is summarized in this array taken from the CEP.

Erik Martin-Dorel (Dec 10 2020 at 00:38):

so to sum up, if a library does not support (in a coq-native switch) a whole native precompiling:

Paolo Giarrusso (Dec 10 2020 at 02:16):

native-compiler's extra expressivity is likely to be only useful in limited scenarios

Paolo Giarrusso (Dec 10 2020 at 02:18):

and I was also a bit surprised when coq-native remained enabled on the Docker images... I was honestly hoping that now we'd get _flambda_ enabled instead, giving everybody 15-20% speedups

Paolo Giarrusso (Dec 10 2020 at 02:19):

Paolo Giarrusso (Dec 10 2020 at 02:20):

I'm happy enough building my own images with my own opam package, but I'm sure many Coq users prefer otherwise.

Paolo Giarrusso (Dec 10 2020 at 02:29):

@Théo Zimmermann I respect you deeply but I happen to disagree; native-compute should be labeled as experimental. ondemand has extremely limited use-cases, and was IMHO never a good default; native-compute=yes has specific use-cases and the recent changes should make them easier to experiment with. native-compute=no works.

Paolo Giarrusso (Dec 10 2020 at 02:31):

and to be clear, I'm interested in anything that can help performance, including native-compute! And I do hope more people will now be able to try it out!

Théo Zimmermann (Dec 10 2020 at 08:23):

All the Docker images that support flambda are indeed (already) flambda enabled and the default OCaml version is soon going to switch to 4.07, which is one of those versions.

Théo Zimmermann (Dec 10 2020 at 08:32):

@Erik Martin-Dorel It seems to me that we have received enough blacklash from users already to backtrack on the initial choice to inculde coq-native in the default image starting with 8.13

Théo Zimmermann (Dec 10 2020 at 08:33):

It was a bit unfortunate that this was not discussed in the initial CEP as well with the "experimental status" of native-compute, but nobody thought about it at the time.

Karl Palmskog (Dec 10 2020 at 09:26):

I also think @Erik Martin-Dorel should consider adding a post-installation warning of some kind for the coq-native opam package, like "Experimental: compilation of Coq files may now fail unexpectedly."

Ideally, something like the following as well: "Coq file compilation failures after installation of this package should be reported to Coq developers, not to Coq project maintainers"

Théo Zimmermann (Dec 10 2020 at 09:36):

Well, they should actually be reported to the opam-coq-archive so that native-compute can be disabled by default to those packages.

Karl Palmskog (Dec 10 2020 at 09:42):

this to me sounds like you're saying opam archive maintainers should shoulder the burden of fixing problems with native-compute. How are we supposed to do this if they are not using coq_makefile? There is no guarantee that just because we have a package, we can control arbitrary options sent to coqc.

Enrico Tassi (Dec 10 2020 at 09:51):

Can't we have 1 more image (for one compiler with has no known shortcoming with native_compute) with coq-native and have it off in all other images (to cause no breakage)?

Enrico Tassi (Dec 10 2020 at 09:53):

If we want to lint any set of packages w.r.t. a feature, I think it would be more polite to do it ourselves, running jobs with coq-native installed in our CI, not the user's one.

Karl Palmskog (Dec 10 2020 at 09:58):

what I would like to be the case w.r.t. the coq opam archive: a maintainer of a package should have to explicitly opt-in to have his package be built with native-compute. If he doesn't do this, native-compute should not be possible to use with the package for regular opam users (expert users could still override)

Karl Palmskog (Dec 10 2020 at 09:58):

this to me is the the only reasonable way to handle a feature that can arbitrarily break compilation/build

Enrico Tassi (Dec 10 2020 at 10:08):

I think @Théo Zimmermann still has a point. If a user installs both A and B, and B requires coq-native, then A is recompiled, and it may be the case that A breaks with coq-native around. It is may hence be nice to explicitly turn off native in A's package, even if A is.

We can add a job to the opam archive to test if the package happens to break with native (even if it does not require it). @Karl Palmskog if you agree I can try to add this job. If the package uses coq_makefile, a fix in _CoqProjects may be sufficient. At worse, one can declare that the package conflicts with coq-native (this works for any build system).

Karl Palmskog (Dec 10 2020 at 10:09):

@Enrico Tassi it's fine if you add an opam archive job with native-compute if this job does not affect the green/red status of CI

Enrico Tassi (Dec 10 2020 at 10:14):

I can make the job "allowed to fail", but then it would be unnoticed, unless a bot posts a message in the discussion informing about the thing and suggesting the PR author what he can do. You would still see the green light, and could decide to merge even if the package is not "clean" w.r.t. native.

Karl Palmskog (Dec 10 2020 at 10:26):

let's say that 1/3 of projects break with native-compile, this means we would have CI red 1/3 of the time when it should be green, which almost makes the green/red signal meaningless. I think this would be a big loss.

Enrico Tassi (Dec 10 2020 at 10:34):

if the job is "allowed to fail" then it does not show up in the CI box and does not contribute to "being red". It is run, and ignored.
the bot could still do something useful out of its result.

Enrico Tassi (Dec 10 2020 at 10:35):

I was proposing the bot to do something like an automatic review/comment based on the result of that specific job.

Enrico Tassi (Dec 10 2020 at 10:36):

Or even propose a change. But I don't know at which extent this can be done, Theo knows.

Enrico Tassi (Dec 10 2020 at 10:40):

Apparently there are API to put a review comment on a line (eg the bottom of the opam file). The bot could suggest to add conflicts: [ "coq-native" ] when that job fails. WDYT?

Théo Zimmermann (Dec 10 2020 at 10:46):

If this was the objective, then the changes to Coq were not needed and the CEP could have been implemented with the strategy proposed for older versions of Coq. It's pretty sad that this position was not expressed during the CEP discussion.

Théo Zimmermann (Dec 10 2020 at 10:47):

@Enrico Tassi You can already add allow failure jobs. They will be reported with the neutral check (gray square). We are using them (and it works well) for async jobs on the main repo but it is also supported in the opam-coq-archive repo.

Théo Zimmermann (Dec 10 2020 at 10:48):

I can implement whatever you want in addition (such as suggested changes) but I need a clear feature request for that (tell me exactly what you want).

Enrico Tassi (Dec 10 2020 at 10:49):

great, I will do that then.
if you give me some hints, I may give a try to patching coqbot. It's like the message for jason about the failure job, but it should be reported as review, rather than as a regular comment. I'd like to test if one can suggest "a patch".

Karl Palmskog (Dec 10 2020 at 10:58):

@Théo Zimmermann can you elaborate on where the coq opam archive is mentioned in the CEP as the place (dumping ground) to handle native compilation issues by users?

Théo Zimmermann (Dec 10 2020 at 11:00):

Nowhere, but it is written in it that native-compute will become available for all packages without any action from the maintainer, and now you are saying it shouldn't be the case. It would have saved much trouble if you had expressed this opinion earlier.

Karl Palmskog (Dec 10 2020 at 11:09):

I really don't care if native-compute becomes somehow available (it would have been available with the Coq opam compiler apprach as well). What I care about is who is going to take the fall and apply fixes when things arbitrarily fail. For example, it only now became apparent that 1/3 of the Coq CI doesn't support native-compute (and fails with an inscrutable error message like "stack overflow"). One would think this had been pertinent information when discussing the CEP

Théo Zimmermann (Dec 10 2020 at 11:12):

Disabling native-compute is supposed to be an easy step. Patching the opam archive to do it when we know a package fails is just a way of sharing effort between native-compute users. Just because you are a maintainer of the archive and you personally don't want to have to care about it doesn't mean that it's not useful or worthwhile.

Karl Palmskog (Dec 10 2020 at 11:16):

but if it is worthwhile, where is the evidence? in this thread, we have had maybe 3-4 projects report some anecdotal benefit, and we have the 1/3 failed figure from Coq CI. Given this state of knowledge, I would still recommend everyone who is a Coq project maintainer to turn it off locally in their projects

Théo Zimmermann (Dec 10 2020 at 11:20):

I'm not a user of native-compute myself, so I cannot answer this, but it looks to me like there are a number of users in this thread that have repeatedly insisted on how important this feature is. (And they can probably point out to actual evidence of this.)
That it is not working well on all projects is certainly in part due to inherent limitations but also to the lack of sufficient testing so far.

Théo Zimmermann (Dec 10 2020 at 11:21):

Finally, why again would you still recommend every maintainer to disable it by default since the whole point of the CEP was to make it available by default?! If the objective was different, then it could have been achieved differently!

Karl Palmskog (Dec 10 2020 at 11:25):

I think it's very strange that Coq project maintainers would be expected to welcome features that do not come with any clear upside, and a large possible downside.

Karl Palmskog (Dec 10 2020 at 11:35):

but with that said, this recommendation is based on the current state of knowledge, and so could change if benefits are actually shown in benchmarks, and there is no flood of error reports

Karl Palmskog (Dec 10 2020 at 11:45):

but just to be clear, I'm not really interested in seeing the exact benefits for one or two specific projects, I'm interested in seeing figures for many (100+?) relevant projects, i.e., averages.

Pierre-Marie Pédrot (Dec 10 2020 at 11:47):

Given that native_compute is a tactic you have to use explicitly, I don't see how you would magically see anything for a random project.

Théo Zimmermann (Dec 10 2020 at 12:01):

It seems that you are completely confused because you expect benefits on average like with flambda when this has nothing to do with this.

Théo Zimmermann (Dec 10 2020 at 12:02):

The benefits are only for the users who wish to rely on native-compute (and will thus install coq-native in their switch).

Théo Zimmermann (Dec 10 2020 at 12:03):

There are certainly no expected benefits on the build speed of average projects. On the contrary, it makes compilation a bit longer!

Théo Zimmermann (Dec 10 2020 at 12:04):

So yes, maybe (probably) it was a mistake to install coq-native by default in the Docker images.

Théo Zimmermann (Dec 10 2020 at 12:05):

But this is beyond the current discussion which has shifted to a complete utter nonsense.

Pierre-Marie Pédrot (Dec 10 2020 at 12:05):

Well, let's see the good side: at least now many people (including us, coq devs) are aware of the breakage introduced by native compilation.

Théo Zimmermann (Dec 10 2020 at 12:05):

Yes, that was one of my point above (until now, it was not sufficiently tested).

Karl Palmskog (Dec 10 2020 at 12:20):

I didn't confuse anything. We are talking about why or why not I recommend people to use certain options. If the benefits of including an option outweigh its costs, I recommend it. In this case, I judge that adding -arg -native-compiler -arg no to _CoqProject should be recommended, since it means you will not ever have to pay the costs of having failing native-compute builds, and your users will not complain about stack overflows when installing your projects.

Karl Palmskog (Dec 10 2020 at 12:24):

most users will not be able to tell what coq-native means, so they may install it by accident or use a Docker image where it is pre-installed (and I guess Nix Coq builds will have it as well). The option I recommend completely rules out any problems from this.

Pierre Roux (Dec 10 2020 at 12:25):

Don't forget that up to now native_compute was mostly unused because it was terribly inpractical to use (basically, everything had to be recompiled manually with specific options). So you cannot conclude to the uselessness of native_compute from that observation.

Karl Palmskog (Dec 10 2020 at 12:30):

@Pierre Roux this is not about concluding the uselessness of coq-native. It's about the situation for regular Coq project maintainers - I'm arguing that for most of them the new coq-native infrastructure will be a tax imposed from above without any personal benefit. Most will have no idea what coq-native is all about, yet they could see their projects fail in Docker and users claim the project fails compilation. The _CoqProject option will allow them to avoid all these potential problems.

Pierre Roux (Dec 10 2020 at 12:34):

But then, they may experience complains from coq-native users that they disabled it, possibly for no reason. There seems to be no perfect solution.

Karl Palmskog (Dec 10 2020 at 12:36):

it seems to me that coq-native users are generally savvy and highly sophisticated and can figure out workarounds, so it seems far more likely that user complaints will come about coq-native failures rather than absence of coq-native.

Karl Palmskog (Dec 10 2020 at 12:37):

Pierre Roux (Dec 10 2020 at 12:39):

You may also consider that, looking at Coq CI, the projects that seems to fail with native compute are not the random proof project but projects doing complicated things, with for instance very large non opaque terms, those users woul probably also know what they are doing and know at least a bit about native compute. Again, the average user doing regular proofs is very likely not to be impacted (but its proof could be of use for others).

Karl Palmskog (Dec 10 2020 at 12:44):

on average, it seems unlikely that the "transitive coq-native compilation benefit" will be realized for a project. The opam coq archive dependency graph is quite shallow - not many projects are used as dependencies. Moreover, I think this is like saying you should take one for the team - I should allow this failure vector so someone else I don't know about will gain.

Pierre Roux (Dec 10 2020 at 12:47):

We already had examples so it is definitely possible, how likely is impossible to know today, as already said. "unlikely" is only your personal opinion.

Erik Martin-Dorel (Dec 10 2020 at 12:57):

Hi all, I've just seen your comments − I guess the usefulness of the coq-native packaging was discussed in detail in the CEP, and for both use cases:

But after installing the package by default in the Docker packaging, I wasn't expecting to get as much backlash, following these failures with the Coq.8.13+beta1 or coq.dev builds… Anyway I guess what suggested Théo and Enrico looks a very sensible trade-off, very easy to setup and to document:

So I plan to start rebuild these images ≥ 8.13 from now on, but could you please comment on which strategy you'd prefer for older images?
(A) do the same for older versions (8.7 ≤ Coq ≤ 8.12) as for Coq 8.13+ (also leading to five different images, one with OCaml 4.07.1 + coq-native (albeit the "automatic propagation of the -native-compiler yes flag" especially concerns Coq 8.13+))
(B) keep the current four images as is, i.e., with coq-native, with the stdlib native-precompiled only, but allowing to native-precompile selected libraries later on, when the item 3 of the CEP is gradually taken into account.

Karl Palmskog (Dec 10 2020 at 13:00):

the most depended-on package in the opam archive is coq-mathcomp-ssreflect. Yet, it is only depended on by 32 packages. coq-ext-lib, a general library, is only used by 6 other packages. The upper bound on number of dependencies (and thus potential for transitive coq-native benefits) look terrible compared to the Java ecosystem, where packages regularly depend on 40+ packages (see this paper: https://www.cs.cornell.edu/~legunsen/pubs/GyoriETAL18EcosystemRTS.pdf)

Enrico Tassi (Dec 10 2020 at 13:00):

Karl Palmskog (Dec 10 2020 at 13:03):

Paolo Giarrusso (Dec 10 2020 at 13:22):

+1 for (A): I had no crystal ball, but I also expected native-compute to be 100% opt-in. native-compute support for package foo should probably be a feature request from users of foo who are actually interested.

Théo Zimmermann (Dec 10 2020 at 13:30):

Really? Then once again, the part about changing the default value of the coqc -native-compiler option in Coq would not have been necessary. You are the second participant to this CEP to express this opinion now.

Théo Zimmermann (Dec 10 2020 at 13:30):

BTW, option (B) is the most backward compatible for older Coq versions: since the stdlib used to be pre-compiled.

Enrico Tassi (Dec 10 2020 at 13:35):

How can it be less backward compatible? If you have job for 8.9, then A says that that job will continue to download the same image as before, while B does not imply that. IIUC

Erik Martin-Dorel (Dec 10 2020 at 13:37):

Théo is right regarding backward compatibility (apart from macOS users). See the opam-repository PR that implements the CEP.
I will repost the URL.

Enrico Tassi (Dec 10 2020 at 13:38):

OK, maybe I'm lost, no problem. Then I vote for "the option that has the least chance of breaking things for users not opting in".

Erik Martin-Dorel (Dec 10 2020 at 13:48):

    "-native-compiler" {os = "macos"} "no" {os = "macos"}

so beforehand, native-compiler was unconditionally disabled for macos (which was maybe a too abrupt choice BTW, but this is another story),
and enabled otherwise (just for the stdlib for Coq < 8.13).

     "-native-compiler" "yes" {coq-native:installed} "no" {!coq-native:installed}

so if coq-native is not installed, the stdlib won't be precompiled for Coq < 8.13.
Thus strategy (A) in my previous comment would indeed lead to a non-backward-compatible change for the Docker images of these old Coq versions.

Personally, I'm very OK with both strategies (A) and (B).
But maybe if we have not a very strong opinion on this last choice, we could also defer this choice for, say, one week (keeping the strategy (B) for the moment and only remove coq-native from Coq 8.13+), then do the announcement that I was planning for the single-switch Docker migration, and maybe applying (A) next week depending on the feedback we might get in the meantime? WDYT?

Emilio Jesús Gallego Arias (Dec 10 2020 at 13:49):

Non-locality of the depext setup was indeed discussed in the CEP, and it is a problem. Unfortunately if you disable native in a package locally you get a broken switch in the moment some other package depends on it. The only solution I can see with opam is to actually add a conflicts: coq-native constraint.

Note that all the breakage we are seeing is due to either a) lack of memory or b) small default stack size, [which can be fixed with ulimit], so indeed I would expect coq-native users to bump ulimit by quite a bit if they want to use it. I'd conjecture that for a large enough ulimit the whole archive should compile.

Note that native + flambda will give nice speedups but will eat even more stack, so IMHO it is reasonable to keep that in mind. coqc tries to workaround this by unconditionally passing -O classic to OCaml, this effectively removes many benefits from flambda but provides a safer setup. Super-specialized users would like to tweak Coq so -O 3 is passed instead, as they could likely reap on the benefits.

By the way flambda provides speedups on the range of 2%-18% perfect, just to mention that not everyone will greatly benefit from it, but indeed quite a few users see a speedup.

Erik Martin-Dorel (Dec 10 2020 at 13:51):

Hi Emilio, just to be sure, could you elaborate a bit on what you mean by "Non-locality of the depext setup" ?

Emilio Jesús Gallego Arias (Dec 10 2020 at 13:58):

Non-locality means that a package A may pull coq-native , forcing recompilation of unrelated package B with different options that B is not aware.

Erik Martin-Dorel (Dec 10 2020 at 14:02):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:02):

Anyways I'd be pragmatic here as I think we'll have a better setup for 8.14; as @Erik Martin-Dorel is discovering touching anything related to build setup is a hairy issue as I've lived on my on flesh :) sending courage as this is hard and exhausting work :)

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:02):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:04):

This way coq-native seems pretty opt-in as not to cause a lot of harm in non-expert users, and we would have a good understanding for the future setup with split packages: basically, packages that have the conflicts would not be a candidate to provide coq-pkg-native in the split setup.

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:04):

The split setup would signal a bit better what packages cannot be compiled yet with native.

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:04):

Erik Martin-Dorel (Dec 10 2020 at 14:05):

but if the split-native approach is ready for 8.14, note anyway that the coq-opam-archive packaging maintenance overhead that is induced by that approach should be solved beforehand…

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:05):

coq_makefile should not be a lot of problem to handle, but indeed the split setup will still have some hurdle as maintainers will have to enable support explicitly, but I much prefer that as IMHO it is reasonable that package maintainers do test native before enabling it in their packages.

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:06):

Not sure I fully understand, please feel free to correct me. Ideally, the migration to a split setup would be opt-in

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:07):

in the sense that before coq-native , after you implement the split setup, no package will provide a native setup

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:07):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:08):

the thing is that indeed I think that automatically having all the packages compile with native seems a bit too risky for me

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:08):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:09):

but indeed how to make the life of packagers easier I dunno, for dune that will be automatic, but other people with opam, I guess they should update from a template? [That's how it works today right?]

Erik Martin-Dorel (Dec 10 2020 at 14:09):

I also think that the split-native approach is very nice, but from the opam-coq-archive perspective, this adds many packages to maintain!
(Karl already mentioned that tweaking the existing opam specs for non-native-compatible packages might be a concern, so it will be a yet more complex task if all native-compatible packages have to be duplicated for the split-native approach!)
This could be solved IMHO just if the generation of the coq-lib-native packages can be automated.

Enrico Tassi (Dec 10 2020 at 14:09):

I'm trying to add a job to the opam archive so that we can test native when people submit packages https://github.com/coq/opam-coq-archive/pull/1504
can someone remind me of a package that fails? (to test the infrastructure)

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:11):

Yes indeed, it should be automated. I don't think it would be very hard to update the opam templates to do so, but I am not very familiar. For the dune-based packages they'll know what to do.

Erik Martin-Dorel (Dec 10 2020 at 14:13):

I guess there is one of the packages mentioned by @Christian Doczkal?
(but I've just seen you already ping'ed him on GitHub :)

Erik Martin-Dorel (Dec 10 2020 at 14:20):

@Emilio Jesús Gallego Arias Regarding the flambda topic and to follow-up a remark by @Paolo Giarrusso, it is true that it's only partly supported in docker-coq (and there is an opened issue here: https://github.com/coq-community/docker-coq/issues/10): we could indeed pass the dedicated option -flambda-opts '-O3 -unbox-closures' to coq's configure,
but I believe setting these options is not a specific issue to docker-coq but to the opam packaging, and could be promoted as a default… (as on non-flambda swiches, this option would be a no-op!)

Actually note that I had first proposed to do the two changes (coq-native and flambda-opts) in one go in the following PR:
https://github.com/ocaml/opam-repository/pull/16887 (now-closed); but I decided to look after these two enhancements one at a time, all the more as Théo had suggested to open the coq-native CEP to foster the discussion…

Erik Martin-Dorel (Dec 10 2020 at 14:23):

So regarding -flambda-opts do you confirm @Emilio Jesús Gallego Arias that proposing another PR in opam-repository soon doing this, would be the way to go?
cf. https://github.com/ocaml/opam-repository/pull/16887/files#diff-655d9f2dcb2ffb68e9326406cdc3a01a43ba35d4e496a87f624f2063e61ab7dd
finally I know that @Paolo Giarrusso and you have done many benchmarks on "with or without flambda"; but would you advise me to have a look in a particular place if you gathered some of your benchmark infrastructure on that flambda topic, in a given repo? anyway, getting as Emilio mentioned, a speedup of "up to 18%" in the default setting would be great…

Paolo Giarrusso (Dec 10 2020 at 14:26):

@Erik Martin-Dorel thanks for reminding me; enabling flambda in opam packages was indeed the plan, which the coq-native change now unblocks.

Paolo Giarrusso (Dec 10 2020 at 14:27):

FWIW we’ve been saying no to native and yes to flambda for around 6 months throughout our company and CI with 4.07.1-flambda

Paolo Giarrusso (Dec 10 2020 at 14:28):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:30):

These flags are already the default on Coq dune builds, and indeed make sense to pass always, I believed it was the case actually.

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:30):

Note that these flags are for compilation of Coq itself, and they have no relation to OCaml flags used in native-compute

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:30):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:31):

But yeah, flambda is a big pain to use, and with native we have a bit of the same setup now so we will find the same problems as noted in the CEP

Erik Martin-Dorel (Dec 10 2020 at 14:32):

yes Emilio, I know that the flambda flags are orthogonal to native-compute (modulo the issue you recall in your last comment); my only question was about the need to update the coq packaging in opam-repository (which does not rely yet on dune)

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:33):

Erik Martin-Dorel (Dec 10 2020 at 14:33):

Paolo Giarrusso (Dec 10 2020 at 14:33):

No, because despite "no relation", flambda + native is measurably slower than the default for some reason.

Paolo Giarrusso (Dec 10 2020 at 14:34):

We've been working around the relevant issues for ~one year, the goal here is to work around them better :-)

Erik Martin-Dorel (Dec 10 2020 at 14:35):

Paolo Giarrusso (Dec 10 2020 at 14:35):

Erik Martin-Dorel (Dec 10 2020 at 14:36):

because if we focus on native_compute, you know there are two times that can be "benched" (the compilation time and the running time)

Paolo Giarrusso (Dec 10 2020 at 14:36):

ah; I have some benchmark data on github coq issues, I can dig into it again if we adjusted the question.

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:36):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:37):

Paolo Giarrusso (Dec 10 2020 at 14:37):

Good point Erik, my benchmark were just about "I never write native_compute, why do I pay for it"

Paolo Giarrusso (Dec 10 2020 at 14:38):

(or rather, I just enabled flambda, got slowdowns, and investigated. That's the yak we're still shaving)

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:38):

or of course if I was doing some native_compute proofs, and Dune will explicitly disable it when in dev mode

Erik Martin-Dorel (Dec 10 2020 at 14:40):

So given the current state of the discussion, I guess we could just do this in the coq opam-packaging?

pass the -flambda-opts "-O3 -unbox-closures" option only if OCaml >= 4.07 and coq-native is not installed…

(having ideally up-to-date benchmarks for Coq 8.13+ and Coq < 8.13 that confim again this is the way to go :)

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:41):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:42):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:43):

Paolo Giarrusso (Dec 10 2020 at 14:43):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:43):

Erik Martin-Dorel (Dec 10 2020 at 14:43):

I see what you mean. That's also the reason why there is no ocaml < 4.07 with flambda in docker-coq

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:43):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:44):

the cost of coq-native + flambda is quite small when compared to the cost of coq-native alone

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:45):

but still not insignificant, in particular there are some problems with flambda middle-end that impact Coq

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:45):

Paolo Giarrusso (Dec 10 2020 at 14:45):

@Erik Martin-Dorel in any case, this is the other reason we’d want +flambda-native images as default: best performance

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:45):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:46):

so I don't care too much, it is like dune will only build OCaml byte objects when you call @check

Erik Martin-Dorel (Dec 10 2020 at 14:46):

Paolo Giarrusso (Dec 10 2020 at 14:46):

Erik Martin-Dorel (Dec 10 2020 at 14:46):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:46):

regarding opam indeed you have no control over what compiler the user installs, so coq-native should conflict with broken compilers

Enrico Tassi (Dec 10 2020 at 14:47):

I looked at the logs of coq-graph-theory: it fails with 4.05, but with 4.07 it apparently works, so the failure is not (only) related to coq-native.

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:47):

with coq makefile you could hack this by having a make dev call, that would explictly add -native off

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:47):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:48):

Enrico Tassi (Dec 10 2020 at 14:48):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:49):

Enrico Tassi (Dec 10 2020 at 14:49):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:49):

@Enrico Tassi have a look at Coq's gitlab file, we already bump ulimit there otherwise flambda wouldn't work

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:50):

@Enrico Tassi failure is relative here as it usually comes from a) not enough memory b) low ulimit

Erik Martin-Dorel (Dec 10 2020 at 14:50):

@Emilio Jesús Gallego Arias to sum up what you said, we should mark coq-native conflicts with +flambda < 4.07 (but keep it available for 4.05 I guess)

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:50):

Enrico Tassi (Dec 10 2020 at 14:50):

This may be a suggestion for the docker images or the actions, we can't ask users of coq community templates to bump the stack

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:51):

@Erik Martin-Dorel yes, in principle it should be available in 4.05, however if you ask me, and based on my memories of discussions about OCaml bugs, I'd just require 4.07.1 for coq-native as the minimum version

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:51):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:52):

Erik Martin-Dorel (Dec 10 2020 at 14:52):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:52):

stack is a big problem for users, as even different distros carry different defaults :S

Enrico Tassi (Dec 10 2020 at 14:52):

Maybe we could be "bold" and say that Coq 8.13 recommends 4.07? (and drop 4.05 images, and co)

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:52):

4.05 images without coq native could be very useful to test against lower bounds

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:53):

Enrico Tassi (Dec 10 2020 at 14:53):

have the configure script accept 4.05..4.07, but warn, and be happy with >= 4.07 ?

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:53):

Erik Martin-Dorel (Dec 10 2020 at 14:54):

but if we disable coq-native for 4.05.0, we won't be able to have native_compute for coq 8.5 and Coq 8.6 anymore:

* Coq dev: 4.05, 4.07-flambda, 4.10-flambda, 4.11-flambda
* Coq 8.13: 4.05, 4.07-flambda, 4.10-flambda, 4.11-flambda
* Coq 8.12: 4.05, 4.07-flambda, 4.10-flambda, 4.11-flambda
* Coq 8.11: 4.05, 4.07-flambda, 4.10-flambda, 4.11-flambda
* Coq 8.10: 4.05, 4.07-flambda, 4.08-flambda, 4.09-flambda
* Coq 8.9: 4.05, 4.07-flambda, 4.08-flambda, 4.09-flambda
* Coq 8.8: 4.05, 4.07-flambda, 4.08-flambda, 4.09-flambda
* Coq 8.7: 4.05, 4.07-flambda, 4.08-flambda, 4.09-flambda
* Coq 8.6: 4.02
* Coq 8.5: 4.02
* Coq 8.4: 4.02

Erik Martin-Dorel (Dec 10 2020 at 14:54):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:54):

Yeah if you want to support that versions it's fine, just be aware that you will be testing previously untested stuff

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:54):

Erik Martin-Dorel (Dec 10 2020 at 14:54):

I don't think so: as said previously, Coq 8.5 and Coq 8.6 were already precompiled with native

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:54):

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:55):

Erik Martin-Dorel (Dec 10 2020 at 14:55):

I know, but this is precisely the question @Théo Zimmermann raised, regarding backward compatibility

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:55):

Anyways we should put some pressure to OCaml devs so they fix the bugs that are making OCaml work bad in our use case, or fix our compilation scheme :)

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:56):

Actually fixing these stack overflows in the OCaml compiler is tedious, but not so complex

Paolo Giarrusso (Dec 10 2020 at 14:56):

(a) is 4.07.0 tested? Emilio always says 4.07.1 (b) how much should we be evolving 8.5 and 8.6?

Emilio Jesús Gallego Arias (Dec 10 2020 at 14:57):

Enrico Tassi (Dec 10 2020 at 15:03):

I added to the next Coq Call the item: "Coq 8.13 require 4.07". From this discussion, we don't lose much by dropping 4.05.

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:03):

Enrico Tassi (Dec 10 2020 at 15:04):

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:04):

Enrico Tassi (Dec 10 2020 at 15:04):

The problem by @Christian Doczkal is that the default image uses 4.05 + native. If you are on 4.07 his package works even with native

Erik Martin-Dorel (Dec 10 2020 at 15:05):

So to put all things together, regarding the docker-coq flambda/native packaging, let me recap the plan:

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:05):

Enrico Tassi (Dec 10 2020 at 15:05):

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:05):

if Docker maintainers chose to use a untested setup as the default, what can we do, 4.07.1 is only tested with native in my own workstation, so that's not a so great improvement

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:06):

It may work, it may not, for example it could work on 4.07 and fail in 4.08 as the compiler did regress on some internal functions AFAICT

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:07):

Enrico Tassi (Dec 10 2020 at 15:07):

OK, I agree we should test things. But we should decide on what. I'm adding a CI job on the opam archive so that we can detect packages that don't compile with coq-native. Which compiler should I pick?

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:07):

Enrico Tassi (Dec 10 2020 at 15:07):

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:07):

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:08):

Enrico Tassi (Dec 10 2020 at 15:08):

so, to my understanding, we know that 4.05 has many problems. Hence my proposal to ditch it.

Erik Martin-Dorel (Dec 10 2020 at 15:08):

can you be more specific? was it related to my latest comment recapitulating the plan?

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:08):

Erik Martin-Dorel (Dec 10 2020 at 15:08):

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:09):

@Enrico Tassi for example 4.08.0 may have [and indeed I think it does] the same problem of excessive memory use

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:09):

Enrico Tassi (Dec 10 2020 at 15:09):

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:10):

Enrico Tassi (Dec 10 2020 at 15:10):

Erik Martin-Dorel (Dec 10 2020 at 15:10):

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:10):

and OCaml upstream answer was "we don't care, that's not an use case we support"

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:11):

Seems a bit weak criteria, in the sense that OCaml version A requires M amount of memory to compile a quite random file, then OCaml version B require M+1 memory to compile that random file, we ditch A ?

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:12):

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:13):

My view is simple, and in agreement with OCaml devs position on this one: if you wanna use native, get ready to increase memory limits and total memory .

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:13):

Enrico Tassi (Dec 10 2020 at 15:16):

This is not my point. There are Coq users that don't care which version of OCaml they use, they just want Coq to work. For them, we should chose a good version, and test that version as much as we can. If you wan to compile Coq with a version of OCaml of your choice you can, but in the docker images we provide for testing, and in the binary installers we provide, and in the installation scripts we provide (the platform) we should pick sensible versions.

Karl Palmskog (Dec 10 2020 at 15:17):

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:17):

I agree, but how this is related to us supporting some versions? We support 4.08 but we don't recommend it.

Enrico Tassi (Dec 10 2020 at 15:18):

./configure could accept but warn. And starting from that we could stop providing images for versions we know don't work well for us to avoid users discover it themselves.

Karl Palmskog (Dec 10 2020 at 15:19):

the obvious drumbeat for OCaml version choice is the Coq Platform. Michael settled on 4.07.1 for good reasons.

Enrico Tassi (Dec 10 2020 at 15:19):

I personally was not aware of so many (performance/memory) differences in these ocaml compilers. I can only imagine Coq users which don't even know how to code in OCaml.

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:20):

The example you provide is pretty fragile, you cannot generalize on memory behavior just from one particular file. There are files where 4.05.0 will use less memory. This whole thing is very misguided IMHO.

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:20):

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:21):

Enrico Tassi (Dec 10 2020 at 15:21):

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:21):

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:22):

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:24):

I think PMP and I were pretty clear [and we have been involved in the upstream discussion]

It is great we can try to test a large amount of devs in the CI , I fully support that, and I supported the CEP in the end because even if far from perfect we need to make progress. But unfortunately , unless there is significant work on the OCaml compiler or in our compilation strategy, coq-native will still be quite expensive for the regular user.

Also, we have a way more critical bug which is that the compilation scheme may stop working with newer OCaml versions.

Enrico Tassi (Dec 10 2020 at 15:30):

I think it was clear that there were problems. It was not clear to me that 1) docker images would have opt-in by default 2) problems were varying depending on the compiler version and custom settings (like the stack).

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:44):

Yup, that's clear, would be great if we could understand which version is best for native

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:44):

Emilio Jesús Gallego Arias (Dec 10 2020 at 15:46):

Christian Doczkal (Dec 10 2020 at 20:08):

Théo Zimmermann (Dec 10 2020 at 20:09):

Erik Martin-Dorel (Dec 10 2020 at 20:34):

Hi @Christian Doczkal that's explained because I had only propagated the removal of coq-native in coqorg/coq:dev and mathcomp/mathcomp:latest-coq-dev (manual rebuild), but not mathcomp/mathcomp-dev:coq-dev yet, sorry… I'll take care of this just now. Thanks for your comment!
(Note: this can be checked thanks to this line:
https://github.com/coq-community/graph-theory/runs/1532521296#step:4:295 )

Christian Doczkal (Dec 10 2020 at 20:37):

@Erik Martin-Dorel , the point was that @Enrico Tassi had claimed that the development builds correctly, even with coq-native enabled, on ocaml-4.07.

Erik Martin-Dorel (Dec 10 2020 at 20:38):

Yes but IIUC, the only difference between the build that fails or not seems related to the default stack size, which could differ in opam-coq-archive CI, and GitHub Container Actions…

Erik Martin-Dorel (Dec 10 2020 at 20:39):

As recalled by Emilio with the issue he mentioned from the ocaml bug tracker, this is a typically known bug related to the use of ocamlopt w.r.t. large codes, automatically generated or so.

Erik Martin-Dorel (Dec 10 2020 at 20:46):

@Emilio Jesús Gallego Arias BTW do you think it would be useful (without modifying anything for now, but at least for informative purpose) to add the following command somewhere in the default CI scripts?

ulimit -a

Erik Martin-Dorel (Dec 10 2020 at 20:59):

Enrico Tassi (Dec 10 2020 at 22:38):

I guess the number of variables made it hard to identify the culprit with reasonable certainty. I think we should try to find a way to reduce that, at least the number of variables visible by users of standard stuff.

Emilio Jesús Gallego Arias (Dec 10 2020 at 22:53):

@Erik Martin-Dorel sure, we didn't care so much as of today given that SO where rare, but it cannot hurt. We could even do a check in the native compiler for current stack size, but these issues are kinda tricky in general as there is no agreed upon recipe w.r.t. what the OCaml compiler needs.

Paolo Giarrusso (Dec 11 2020 at 04:30):

good question, and I've thought about it. Back then, the plan was that native-compute would be opt-in, but we meant _for users_: this was an experimental feature that impacted you even when you didn't use it. We estimated _some_ pain for maintainers, but it only was creating a second package, and the goal of the discussion was limiting that pain.

Paolo Giarrusso (Dec 11 2020 at 04:32):

my main goal in the design was to get native-compute completely disabled for everybody, since today that's still the correct default (IMNSHO)

Paolo Giarrusso (Dec 11 2020 at 04:36):

However, I/we anticipated this would make it easier to _enable_ native-compute for interested people, so more people could try it out.

Paolo Giarrusso (Dec 11 2020 at 04:42):

at least I did not anticipate anything like this, but it should absolutely not be surprising that a not-fully-tested feature is not production-quality when tested at scale on millions of lines of code.

Paolo Giarrusso (Dec 11 2020 at 04:45):

TLDR: I suspect this pushback, while unfortunate, is in fact a _success_ of the CEP: it's much easier to try out native-compute and discover if it helps.

Paolo Giarrusso (Dec 11 2020 at 04:48):

I think it's a cool feature, and it's cool "the first 90%" (or the POPL 90%) of the work is in, but the engineering/integration is not trivial, and it has been undersestimated for long. Hopefully now it'll be easier to find and address the bugs that pop up, and exploit it where it already works. It might even count as motivation.

Christian Doczkal (Dec 11 2020 at 09:57):

Coming to think of it, not testing (and hence not failing due to) coq-native in the CI images may actually help people who want to use native compilation: It means that library maintainers will not be encouraged to disable native compilation and then forget about it, locking their users out of native compilation. And people who really want to use native compute with some library can install coq-native and then tinker around with specific versions of OCaml, custom ulimit, and whatnot. So thanks @Erik Martin-Dorel for the the quick reaction. CI is green again :smile:

Pierre Roux (Dec 11 2020 at 10:31):

Do we need to do something about the last point? I see a few potential improvements here:

Pierre Roux (Dec 11 2020 at 10:35):

About the last point, a potential implementation would be to have the CI of the OPam archive complain if any package doesn't contain either a conflict clause with coq-native or some meta-information telling the package is somewhat expected to be native-able. The CI could suggest to put the conflict clause if the maintainer doesn't know/want to bother about native. It seems that something similar will be needed anyway for the split approach to know which *-native packages to generate or not.

Enrico Tassi (Dec 11 2020 at 10:44):

My plan to mitigate 3 is https://github.com/coq/opam-coq-archive/pull/1504, that is to know in advance which packages are problematic for coq-native and tell their upstream authors to either make these package conflict with coq-native or put, upstream, a native-compiler off option.

Of course it would make sense to coordinate the tests we run on the opam archive with the docker images and the platform installers, so that we ship/advertise what we actually test. I did put an item about this at the next Coq call.

Christian Doczkal (Dec 11 2020 at 10:45):

I am against anything that would require package maintainers to deal with coq-native or make any decisions about it. Package maintainers can choose to do so, in order to forestall complaints, but as long as users who install coq-native are informed of the experimental nature of their setup, I see no point in bothering package maintainers, in particular given that changing the ocaml version and various settings may make a library work with coq-native even if it doesn't work reliably with different settings.

Enrico Tassi (Dec 11 2020 at 10:49):

The plan (to be executed) is to have a bot make a review that suggests a little patch + conflicts: [ "coq-native" ] when the package you upload is found to not work with coq-native. I'm aware that "work" is not very well defined, hence the coordination I hope to obtain.

Is this plan "too much" from the perspective of a package maintainer publishing on the coq opam archive?

Enrico Tassi (Dec 11 2020 at 10:49):

I'm of course not against warning users that install coq-native... on the contrary

Christian Doczkal (Dec 11 2020 at 10:58):

Well, I'm relatively relaxed about the bot suggesting a conflicts clause if the library fails to build with coq-native on all versions of ocaml (or a representative subset suggested by the coq developers) with settings that already try to make things work (i.e., by increasing the stack size). What's less clear to me is whether one should add a conflict clause if the package fails to build on some and succeeds to build on other compilers. In any case, it's good if maintainers won't have to change anything in their repositories, so that compatibility with coq-native is always tested when releasing to the archive. Also, would suggest to only add conflicts for the released repository. In summary, I suggest the meaning of conflicts: [ "coq-native"] to be is known to (relibably) fail rather than is not known to reliably work.

Enrico Tassi (Dec 11 2020 at 11:22):

Let me add that one really willing to use native with a package that conflicts with coq-native can tell opam to ignore that constraint.
So I'm not super convinced of the meaning you suggest, but well, without deciding what we consider as "supported platforms" it not really obvious to give a meaning to "reliable" either.

Btw, did someone figure out if docker has a constrained stack? (on my PC it's 8K, and it has been like that since I could remember)

Théo Zimmermann (Dec 11 2020 at 11:24):

It might be due to the GitHub Actions default settings. I haven't looked into that.

Théo Zimmermann (Dec 11 2020 at 11:25):

Christian Doczkal (Dec 11 2020 at 11:49):

@Enrico Tassi As I said, as long as maintainers are not forced to disable (once and for all) native compilation by putting some option into their _CoqProject file, I'm okay.

Karl Palmskog (Dec 11 2020 at 12:11):

@Théo Zimmermann how will native-compile be handled in Nix and the Nix CI? if native conflicts is handled at the opam level, Nix packages will require something separate but analogous, right?

Emilio Jesús Gallego Arias (Dec 11 2020 at 12:54):

Not sure the conflict is a good idea per-se, in the sense that 99% of failures of packages with coq-native are going to be due to low stack size.

Emilio Jesús Gallego Arias (Dec 11 2020 at 12:54):

So IMHO we need something that makes people installing coq-native to be aware that they needs lots of RAM and stack

Emilio Jesús Gallego Arias (Dec 11 2020 at 12:54):

Emilio Jesús Gallego Arias (Dec 11 2020 at 12:56):

This is not so different from some packages already requiring huge amounts of RAM even without native

Théo Zimmermann (Dec 11 2020 at 12:56):

@Karl Palmskog In Nix, we have never set the -native-compiler flag explicitly. Since ondemand is the new default, it means that we preserve backward-compatibility. If there are users that want something similar to coq-native, it will have to be implemented (it's not the case today).

Enrico Tassi (Dec 11 2020 at 13:02):

Enrico Tassi (Dec 11 2020 at 13:04):

Pierre-Marie Pédrot (Dec 11 2020 at 13:05):

Emilio Jesús Gallego Arias (Dec 11 2020 at 13:08):

There is no clear definition of "a lot", as it depends on the internals of the OCaml compiler, so it is not even stable by versions. For example some code will stress the register allocation, some other will stress the constant propagation. Large modules tend to be a problem too. I'm not sure why Unimath should be bad for coq-native, likely the opposite.

Emilio Jesús Gallego Arias (Dec 11 2020 at 13:08):

But indeed code with lots of modules for example tend to make the OCaml compiler go hungry.

Emilio Jesús Gallego Arias (Dec 11 2020 at 13:09):

But that could change next version; anyways I'm not trying to figure out the heuristic, but my point is that packages that don't work, they fail because of the memory setup, so that is not an intrinsic conflict, but more of a circumstantial one.

Emilio Jesús Gallego Arias (Dec 11 2020 at 13:09):

Pierre-Marie Pédrot (Dec 11 2020 at 13:15):

Pierre-Marie Pédrot (Dec 11 2020 at 13:17):

I can't find it again but there was a bench with coq-native on, and unimath was the worst slowdown IIRC.

Emilio Jesús Gallego Arias (Dec 11 2020 at 13:17):

tho, I'm gonna finish the coqnative patch for Dune, this way Dune will track ocamlc invocations including memory and time

Emilio Jesús Gallego Arias (Dec 11 2020 at 13:18):

Pierre-Marie Pédrot (Dec 11 2020 at 13:18):

UniMath is a very good outlier on benches because it's literally the one development that consistently goes against all recommended good practices in Coq dev.

Enrico Tassi (Dec 11 2020 at 19:59):

In the past 2 days I've been trying to reproduce these native failures users are experienciong and we could foresee... and I'm failing miserably (which is a good news to me, but I'm frustrated).

I tried to run Coq master on 4.07 and 4.09 with and without flambda with coq-native installed on unimath and fiat-parsers... and it works on a worker with 4G ram and 8K stack. What am I doing wrong? (I'm not using docker, which may be a relevant, and I did not test 4.05).
https://github.com/coq/opam-coq-archive/pull/1504

Jason Gross (Dec 16 2020 at 16:07):

I've since disabled the native compiler on fiat-parsers, I can point you at a commit before that if you like.

Stream: Coq devs & plugin devs

Topic: recent changes to the native compiler?

Jason Gross (Dec 07 2020 at 21:37):

Gaëtan Gilbert (Dec 07 2020 at 22:34):

Jason Gross (Dec 08 2020 at 01:08):

Christian Doczkal (Dec 08 2020 at 18:48):

Théo Zimmermann (Dec 09 2020 at 09:50):

Théo Zimmermann (Dec 09 2020 at 09:50):

Théo Zimmermann (Dec 09 2020 at 09:51):

Théo Zimmermann (Dec 09 2020 at 09:52):

Karl Palmskog (Dec 09 2020 at 11:20):

Karl Palmskog (Dec 09 2020 at 11:21):

Karl Palmskog (Dec 09 2020 at 11:22):

Guillaume Melquiond (Dec 09 2020 at 12:05):

Guillaume Melquiond (Dec 09 2020 at 12:06):

Pierre Roux (Dec 09 2020 at 12:27):

Pierre Roux (Dec 09 2020 at 12:29):

Karl Palmskog (Dec 09 2020 at 12:37):

Karl Palmskog (Dec 09 2020 at 12:39):

Pierre Roux (Dec 09 2020 at 12:43):

Pierre Roux (Dec 09 2020 at 12:43):

Karl Palmskog (Dec 09 2020 at 12:44):

Pierre Roux (Dec 09 2020 at 12:44):

Karl Palmskog (Dec 09 2020 at 12:45):

Pierre Roux (Dec 09 2020 at 12:45):

Karl Palmskog (Dec 09 2020 at 13:10):

Pierre-Marie Pédrot (Dec 09 2020 at 13:12):

Pierre-Marie Pédrot (Dec 09 2020 at 13:13):

Pierre Roux (Dec 09 2020 at 13:13):

Pierre-Marie Pédrot (Dec 09 2020 at 13:13):

Pierre-Marie Pédrot (Dec 09 2020 at 13:14):

Karl Palmskog (Dec 09 2020 at 13:16):

Pierre-Marie Pédrot (Dec 09 2020 at 13:20):

Pierre-Marie Pédrot (Dec 09 2020 at 13:21):

Karl Palmskog (Dec 09 2020 at 13:23):

Théo Zimmermann (Dec 09 2020 at 13:30):

Théo Zimmermann (Dec 09 2020 at 13:31):

Théo Zimmermann (Dec 09 2020 at 13:31):

Karl Palmskog (Dec 09 2020 at 13:32):

Karl Palmskog (Dec 09 2020 at 13:32):

Théo Zimmermann (Dec 09 2020 at 13:35):

Théo Zimmermann (Dec 09 2020 at 13:36):

Karl Palmskog (Dec 09 2020 at 13:36):

Karl Palmskog (Dec 09 2020 at 13:36):

Karl Palmskog (Dec 09 2020 at 13:38):

Pierre-Marie Pédrot (Dec 09 2020 at 13:42):

Emilio Jesús Gallego Arias (Dec 09 2020 at 13:58):

Emilio Jesús Gallego Arias (Dec 09 2020 at 13:59):

Guillaume Melquiond (Dec 09 2020 at 14:03):

Guillaume Melquiond (Dec 09 2020 at 14:06):

Christian Doczkal (Dec 09 2020 at 14:14):

Théo Zimmermann (Dec 09 2020 at 14:15):

Théo Zimmermann (Dec 09 2020 at 14:15):

Théo Zimmermann (Dec 09 2020 at 14:16):

Théo Zimmermann (Dec 09 2020 at 14:17):

Guillaume Melquiond (Dec 09 2020 at 14:18):

Théo Zimmermann (Dec 09 2020 at 14:21):

Karl Palmskog (Dec 09 2020 at 14:21):

Karl Palmskog (Dec 09 2020 at 14:23):

Enrico Tassi (Dec 09 2020 at 14:24):

Karl Palmskog (Dec 09 2020 at 14:25):

Enrico Tassi (Dec 09 2020 at 14:25):

Christian Doczkal (Dec 09 2020 at 14:26):

Guillaume Melquiond (Dec 09 2020 at 14:28):

Karl Palmskog (Dec 09 2020 at 14:29):

Christian Doczkal (Dec 09 2020 at 14:30):

Karl Palmskog (Dec 09 2020 at 14:31):

Karl Palmskog (Dec 09 2020 at 14:32):

Guillaume Melquiond (Dec 09 2020 at 14:33):

Emilio Jesús Gallego Arias (Dec 09 2020 at 14:41):

Théo Zimmermann (Dec 09 2020 at 14:46):

Karl Palmskog (Dec 09 2020 at 14:46):

Guillaume Melquiond (Dec 09 2020 at 14:54):

Guillaume Melquiond (Dec 09 2020 at 14:59):

Christian Doczkal (Dec 09 2020 at 15:04):

Erik Martin-Dorel (Dec 09 2020 at 15:15):

Erik Martin-Dorel (Dec 09 2020 at 15:16):

Erik Martin-Dorel (Dec 09 2020 at 15:17):

Erik Martin-Dorel (Dec 09 2020 at 15:24):

Erik Martin-Dorel (Dec 09 2020 at 15:30):