Stream: Coq devs & plugin devs

Topic: recent changes to the native compiler?


view this post on Zulip Jason Gross (Dec 07 2020 at 21:37):

fiat-parsers stack overflows on compiling files like src/Parsers/GenericRecognizerOptimized because the ocaml native compiler stack overflows when generating the native files for the .vo file. Was there a recent change to Coq master (within the past month or so) which would explain this change? (For example, does it force early native compilation by default now where it used to not?)

view this post on Zulip Gaëtan Gilbert (Dec 07 2020 at 22:34):

https://github.com/coq/coq/pull/13352 ?

view this post on Zulip Jason Gross (Dec 08 2020 at 01:08):

Seems likely, thanks!

view this post on Zulip Christian Doczkal (Dec 08 2020 at 18:48):

I may be hitting the same issue trying to compile graph-theory on the latest coqorg/coq:dev image:
https://github.com/coq-community/graph-theory/pull/12/checks?check_run_id=1518902073
I have never (consciously) fiddled with any "native-" settings. Is there an issue tracking this?

- COQC theories/open_confluence.v
- Fatal error: exception Stack overflow
- Error: Native compiler exited with status 2

However, I had successful builds less than a week ago and this PR was merged 18 days ago.

view this post on Zulip Théo Zimmermann (Dec 09 2020 at 09:50):

There's no issue since it's a conscious change. There was an announcement made here: https://coq.discourse.group/t/about-the-native-compiler-flag-and-the-coq-native-opam-package-to-be-released-before-coq-8-13-beta1/1147

view this post on Zulip Théo Zimmermann (Dec 09 2020 at 09:50):

If native-compute doesn't work for your project, you should disable it explicitly.

view this post on Zulip Théo Zimmermann (Dec 09 2020 at 09:51):

Maybe the steps to do so, should be added to the above announcement.

view this post on Zulip Théo Zimmermann (Dec 09 2020 at 09:52):

Basically, this should amount to adding an explicit -arg -native-compiler -arg no to your _CoqProject file.

view this post on Zulip Karl Palmskog (Dec 09 2020 at 11:20):

this seems like a doubtful way (overall approach) to handle incompatibility with native compilation. If I just have a regular Coq library, why would I even risk having users complain about features I can't control? It seems like all library devs will want to add this preemptively.

view this post on Zulip Karl Palmskog (Dec 09 2020 at 11:21):

my view is something like: everyone who installs coq-native is completely on their own, and can't expect any support from any library devs

view this post on Zulip Karl Palmskog (Dec 09 2020 at 11:22):

in absence of a message to this effect when installing coq-native, it seems prudent to just disallow native compilation by all library maintainers

view this post on Zulip Guillaume Melquiond (Dec 09 2020 at 12:05):

I am note sure to understand which is your point of view. Either we consider users who install coq-native are on their own and then library maintainers do not have anything to do a priori (and not even a posteriori). Or we consider that all the libraries need to support this use case (is there a CI job to that effect?), possibly by disabling native compilation.

view this post on Zulip Guillaume Melquiond (Dec 09 2020 at 12:06):

Anyway, coq-native is not installed by default; it requires an explicit action from the user. So, hopefully they know what to expect. (Am I too optimistic? Are there users who install packages they do not understand?)

view this post on Zulip Pierre Roux (Dec 09 2020 at 12:27):

The confusion seems to come from the fact that coq-native has been added to the docker images used by many people for CI.

view this post on Zulip Pierre Roux (Dec 09 2020 at 12:29):

I guess, this is a good opportunity to discover that some libraries do not handle native-compute. This information will be needed anyway on the opam repo with the future split approach.

view this post on Zulip Karl Palmskog (Dec 09 2020 at 12:37):

Here is how it works: user X installs coq and coq-native, then tries to install a bunch of libraries, then opens lots of issues complaining about failing native compilation. This is a net negative for library maintainers, who have to deal with the complaints. Why would any library maintainer who is not using some fancy reflection tactic ever want their library to be compiled with native? The announcement and CEP give zero evidence that the general Coq library gets any benefits whatsoever.

view this post on Zulip Karl Palmskog (Dec 09 2020 at 12:39):

personally, I'm going to add -arg -native-compiler -arg no to _CoqProject in all my projects, and also advise everyone else to do the same.

view this post on Zulip Pierre Roux (Dec 09 2020 at 12:43):

Many library can be used by people that do want to use naitve-compute.

view this post on Zulip Pierre Roux (Dec 09 2020 at 12:43):

So there is no point in disabling it if this works (most cases in the CI of Coq).

view this post on Zulip Karl Palmskog (Dec 09 2020 at 12:44):

the point is exactly as I described: we don't want users to report issues or bother maintainers when native-compute breaks everything, which it can be expected to do a lot per above

view this post on Zulip Pierre Roux (Dec 09 2020 at 12:44):

By comparison to OCaml, you don't want to forbid compilation with ocamlopt except if you know there is a specific problem (even if yourself are only using ocamlc).

view this post on Zulip Karl Palmskog (Dec 09 2020 at 12:45):

the difference here is that ocamlopt at least seems to work on a majority of projects, and the benefits are known (speedups have been measured and are substantial)

view this post on Zulip Pierre Roux (Dec 09 2020 at 12:45):

This is also the case, although I agree, the majority is smaller (rather a large majority than almost all).

view this post on Zulip Karl Palmskog (Dec 09 2020 at 13:10):

example selling point of ocamlopt:

O'Caml programs compiled with the native-code compiler run 6 times faster (on an average) than the same programs compiled with the O'Caml bytecode compiler.

can anything similar be said for coq-native on average?

view this post on Zulip Pierre-Marie Pédrot (Dec 09 2020 at 13:12):

Nope, there is a huge overhead due to native compilation.

view this post on Zulip Pierre-Marie Pédrot (Dec 09 2020 at 13:13):

@Jason Gross has probably more experience in this area btw.

view this post on Zulip Pierre Roux (Dec 09 2020 at 13:13):

The overhead is at compile time, not runtime though.

view this post on Zulip Pierre-Marie Pédrot (Dec 09 2020 at 13:13):

Yes, but you pay for it when using native_compute anyways since it incurs a small compilation phase.

view this post on Zulip Pierre-Marie Pédrot (Dec 09 2020 at 13:14):

You need to link stuff, which has a definite cost.

view this post on Zulip Karl Palmskog (Dec 09 2020 at 13:16):

then, the "selling proposition" of allowing native-compute for an average Coq library seems terrible as of right now

view this post on Zulip Pierre-Marie Pédrot (Dec 09 2020 at 13:20):

I am convinced that with the current infrastructure we're bound to displease everybody.

view this post on Zulip Pierre-Marie Pédrot (Dec 09 2020 at 13:21):

That's why I think that we should have separate native compilation as soon as possible, so that if you're a user of native compute, things should be transparent for you without incurring a cost on non-users.

view this post on Zulip Karl Palmskog (Dec 09 2020 at 13:23):

my view: allowing native-compute to happen for a Coq library should be completely opt-in by the maintainer regardless of the version of Coq and regardless of how Coq was installed (possibly with some way for expert users to override)

view this post on Zulip Théo Zimmermann (Dec 09 2020 at 13:30):

Our point was precisely that by including coq-native in Docker images, maintainers would discover that their projects are not compatible with native-compute and could disable it (which will also avoid complaints from users if they try the same later on).

view this post on Zulip Théo Zimmermann (Dec 09 2020 at 13:31):

I don't see the point of disabling native-compute if your CI tells you that it works with it (which it tells you if you are testing with 8.13 or dev right now and it doesn't fail).

view this post on Zulip Théo Zimmermann (Dec 09 2020 at 13:31):

cc @Erik Martin-Dorel BTW

view this post on Zulip Karl Palmskog (Dec 09 2020 at 13:32):

so in other words we become guinea-pigs for native-compute

view this post on Zulip Karl Palmskog (Dec 09 2020 at 13:32):

the reasonable thing to have is a way of opting-in to become a guinea-pig

view this post on Zulip Théo Zimmermann (Dec 09 2020 at 13:35):

I don't understand what is so bad in having the default CI setting automatically check compatibility with a standard Coq feature until you explicitly disable it.

view this post on Zulip Théo Zimmermann (Dec 09 2020 at 13:36):

I mean the only complaint that I do understand (but it hasn't really been voiced) is that it is not clear enough how to deal with these new failures right now.

view this post on Zulip Karl Palmskog (Dec 09 2020 at 13:36):

per Pierre-Marie above, it has substantial overhead, and nobody has shown it has any benefits for the average project

view this post on Zulip Karl Palmskog (Dec 09 2020 at 13:36):

lots of downsides, no demonstrated upside

view this post on Zulip Karl Palmskog (Dec 09 2020 at 13:38):

I wonder how OCaml project maintainers would react if -flambda was suddenly the default in all their compilers

view this post on Zulip Pierre-Marie Pédrot (Dec 09 2020 at 13:42):

I don't know what's the plan for multicore but I am slightly afraid it'll indeed be shoved down our OCaml-user throat.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 09 2020 at 13:58):

The downsides of the current approach were discussed and indeed they do imply that docker images have to be provided both with native and non native support enabled by default.

As a quick fix tho I would have the default docker image to increase stack size using ulimit , that should greatly reduce the amount of StackOverflows users find.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 09 2020 at 13:59):

Most compatible would indeed to have the default docker image setup with -native no and provide a coq+native image for those wanting it.

view this post on Zulip Guillaume Melquiond (Dec 09 2020 at 14:03):

Karl Palmskog said:

Why would any library maintainer who is not using some fancy reflection tactic ever want their library to be compiled with native?

There is no relation. For example, if you forbid native compilation for Coquelicot (which does not use any native-worth reflection), then you forbid it for CoqInterval (for which it is certainly worth it).

view this post on Zulip Guillaume Melquiond (Dec 09 2020 at 14:06):

Pierre-Marie Pédrot said:

Yes, but you pay for it when using native_compute anyways since it incurs a small compilation phase.

Keep in mind that a user who wants to use native_compute has a use case for it. For example, I am willing to pay the compilation overhead (a few seconds at most) for the interval tactic, because I am then running it on some goals for several minutes, if not longer. So, the x6 speedup is certainly worth it.

view this post on Zulip Christian Doczkal (Dec 09 2020 at 14:14):

Pierre Roux said:

The confusion seems to come from the fact that coq-native has been added to the docker images used by many people for CI.

This is precisely the issue for me here. I get a "random" stack overflow from one day to the next when trying to build my library using the coq images. Without any further information, I tried to come here to see what's going wrong. Now, even after reading the post on discourse I still have no idea. I'm seriously tempted to just to what @Karl Palmskog suggests and disable it.

But how does such a setting affect users of ones library that have coq-native installed? Will they still be able to use the library, even though it has not been built for native comutation?

view this post on Zulip Théo Zimmermann (Dec 09 2020 at 14:15):

Disabling it was precisely my suggestion in your case.

view this post on Zulip Théo Zimmermann (Dec 09 2020 at 14:15):

My point is that it doesn't have to be done if CI doesn't break.

view this post on Zulip Théo Zimmermann (Dec 09 2020 at 14:16):

Yes, users will still be able to use the library even if they have coq-native installed.

view this post on Zulip Théo Zimmermann (Dec 09 2020 at 14:17):

By disabling you prevent a compilation failure while they try to install the library via opam BTW (the whole point of making CI test this by default).

view this post on Zulip Guillaume Melquiond (Dec 09 2020 at 14:18):

However, it forces libraries that depend on yours to also disable native compilation, and so on transitively.

view this post on Zulip Théo Zimmermann (Dec 09 2020 at 14:21):

Anyway, my takeaway is this:

view this post on Zulip Karl Palmskog (Dec 09 2020 at 14:21):

so the whole Coq ecosystem is supposed to run into these stack overflows, then figure out for themselves that they need to add the disable command into _CoqProject?

view this post on Zulip Karl Palmskog (Dec 09 2020 at 14:23):

this is to me terrible distribution of knowledge, most people (nearly everyone) have no idea about the intricate tradeoffs discussed by Guillaume, and are not in a position to make choices for themselves on coq-native

view this post on Zulip Enrico Tassi (Dec 09 2020 at 14:24):

What about providing 2 CI images (one with, and one without coq-native)?

view this post on Zulip Karl Palmskog (Dec 09 2020 at 14:25):

this would not help unless the one without coq-native is the default

view this post on Zulip Enrico Tassi (Dec 09 2020 at 14:25):

(which I would be fine with)

view this post on Zulip Christian Doczkal (Dec 09 2020 at 14:26):

Yes, that would be my expectation as well. Something that breaks stuff for many people out there with gains for few should not be the default IMHO.

view this post on Zulip Guillaume Melquiond (Dec 09 2020 at 14:28):

Can we have an actual estimation of what "many people" means here? For now, I have seen only three complaints, but maybe I missed some.

view this post on Zulip Karl Palmskog (Dec 09 2020 at 14:29):

maybe someone can then start to go through the released coq opam archive and see how coq-native works out?

view this post on Zulip Christian Doczkal (Dec 09 2020 at 14:30):

Indeed, currently this only hits people who regularly build against the coqorg/coq:dev image.

view this post on Zulip Karl Palmskog (Dec 09 2020 at 14:31):

for a new default not known to many people outside the dev team, it seems strange to draw conclusions so early from existence/non-existence of complaints

view this post on Zulip Karl Palmskog (Dec 09 2020 at 14:32):

I think a new proposed default should have the burden of showing that the "average project" is not hindered/bothered by it

view this post on Zulip Guillaume Melquiond (Dec 09 2020 at 14:33):

Yes, so let us stop discussing about it until we can actually draw some conclusion. This is just pointless.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 09 2020 at 14:41):

Guillaume Melquiond said:

Yes, so let us stop discussing about it until we can actually draw some conclusion. This is just pointless.

No this is not pointless, some of us have been working having native supported more widely for a while [including fixing lambda which provides another 20% in some cases] and we had a good understanding; you don't need to have a lot of adivinatory powers as just inspecting all the breakage that this default caused in Coq's CI in enough to predict the wider problems for a large set of libraries if the default was set to native=yes. That seems the wrong choice for Docker.

In fact, all these problems were predicted in the discussion, the choice was to ignore them, but that doesn't mean they don't exists. Karl's point is much valid and this new setup is a disaster for users, and even a lack of respect given the time they will have to spend to figure things out. Many of them won't even know what "Coq native" is.

Going back to Docker, I'd try my ulimit tweak as first step, if that still creates problems I'm afraid that Docker setup should be similar to OPAM, default image should not force native compilation. In fact, there is very little reason with current coq_makefile support for native to enable native when working on a library, except in a handful of very specific cases; for the rest you just incur in a huge pointless overhead. Dune's native support will already disable native anyways for development mode unless you are producing an package for release.

view this post on Zulip Théo Zimmermann (Dec 09 2020 at 14:46):

But don't you want CI (we're talking about CI here) to test what you will ship in your release before you do actually release?

view this post on Zulip Karl Palmskog (Dec 09 2020 at 14:46):

thanks for the summary Emilio, I agree with your analysis and proposed plan

view this post on Zulip Guillaume Melquiond (Dec 09 2020 at 14:54):

Emilio Jesús Gallego Arias said:

In fact, all these problems were predicted in the discussion

I do not know which discussion you are referring. In the CEP for native compilation, there have been 122 comments. Which of those comments are predicting anything about this Docker issue? As a matter of fact, the word Docker does not even appear in the whole discussion until the very last comment. Nobody predicted there would be an issue with the Docker image. And if you did, you forgot to voice it out.

view this post on Zulip Guillaume Melquiond (Dec 09 2020 at 14:59):

In fact, Erik even asked if someone would object to changing the Docker image. Why did you not object since you knew it would be such a mess?

view this post on Zulip Christian Doczkal (Dec 09 2020 at 15:04):

Théo Zimmermann said:

But don't you want CI (we're talking about CI here) to test what you will ship in your release before you do actually release?

My reason for CI testing against coq.dev that I want to see compatibility problems one-by-one as they are introduced by the development process and (hopefully) find backwards compatible fixes for them one at a time. That this code will be packaged and shipped is a an afterthought at this point.

view this post on Zulip Erik Martin-Dorel (Dec 09 2020 at 15:15):

Note that the new behavior (the fact that the -native-compiler option is now propagated to coqc) only deals with Coq 8.13+beta1 and Coq dev.

As mentioned by @Guillaume Melquiond, the crux of this issue is that even if a library is fully orthogonal with native_compute, it still needs to be precompiled for native as soon as it appears as a dependency of another library that wants to benefit from native_compute.

Regarding @Enrico Tassi's suggestion:
On the one hand, distributing twice more images for each coq(>=8.13) version does not seem to scale well (since this week, there are now 4 different ocaml flavors for each coq version, each in its own image, as it has been discussed in https://github.com/coq-community/docker-coq/issues/12 and as I need to announce it today-or-tomorrow BTW).
On the other hand, assuming we did that way (having 8 images per coq version), it wouldn't be very satisfactory as a solution, because it would just amount to say: if a given library has an issue with native-compiler, the maintainer just ignores the issue rather than "fixing" it by disabling the option; so in the very end, if users do have the coq-native package installed in their switch, the issue will still be here… and will still need to be fixed "upstream".

view this post on Zulip Erik Martin-Dorel (Dec 09 2020 at 15:16):

All in all, I fully agree with @Théo Zimmermann's remark:

Théo Zimmermann said:

Anyway, my takeaway is this:

view this post on Zulip Erik Martin-Dorel (Dec 09 2020 at 15:17):

Sorry I'm busy with grading assignments this afternoon, so if some of you have the time to post below the Discourse post (URL updated) a summary or so regarding advice for maintainers to tweak the -native-compiler flag for native-incompatible libraries, feel free to go ahead!

view this post on Zulip Erik Martin-Dorel (Dec 09 2020 at 15:24):

Two pointers just in case:

view this post on Zulip Erik Martin-Dorel (Dec 09 2020 at 15:30):

(and there's also @Emilio Jesús Gallego Arias's suggestion regarding ulimit that could be explored)

view this post on Zulip Jason Gross (Dec 09 2020 at 16:21):

Should we be using -native-compiler no or -native-compiler ondemand?

view this post on Zulip Jason Gross (Dec 09 2020 at 16:23):

Regarding the benefits of native_compute, fiat-crypto has one example where the input program is small, the output is small, but there's a lot of computation being done, and here the native compiler shines giving at least 2x speedup over the VM IIRC. In all other cases I've encountered it's worse than the VM due to compilation or readback overhead.

view this post on Zulip Christian Doczkal (Dec 09 2020 at 17:15):

Assuming I actually want to learn about the native-compute feature, i.e. what it can and can't (yet) do. Where would I start? Also, my impression from this discussion is that while coq-native is to be "unleashed" onto everyone per default, there seems to be no reasonable expectation for it to actually work even on projects that do not mention native anywhere. (Nobody has responded that hitting a native-compute stack overflow in an a "pure .v" development is a problem that should be fixed...)

view this post on Zulip Théo Zimmermann (Dec 09 2020 at 17:16):

I've split up the discussion on OCaml variants to a separate topic.

view this post on Zulip Théo Zimmermann (Dec 09 2020 at 17:17):

Nobody has responded that hitting a native-compute stack overflow in an a "pure .v" development is a problem that should be fixed...

It was expected that these problems would happen, just as it is expected that coqchk will run into issues on some projects. This doesn't mean issues cannot be open and problems cannot be fixed.

view this post on Zulip Christian Doczkal (Dec 09 2020 at 17:26):

The question is, should I expect to be able to compile my "pure .v" development without tiptoeing around deficiencies of native-compute and without locking all (potential) users out of using native-compute before 8.13 is considered release ready. (Presumably, nobody would release a new version of gcc wich in the default configuration stack-overflows on 1% of all projects using it).

view this post on Zulip Pierre Roux (Dec 09 2020 at 18:28):

Jason Gross said:

Should we be using -native-compiler no or -native-compiler ondemand?

In both cases, no cmxs will be produced. The only difference is when your code contains native_compute calls, in the first case (no), coq will fallback to vm_compute, in the second case (ondemand), it will precompute the missing cmxs on the fly.

view this post on Zulip Pierre Roux (Dec 09 2020 at 18:36):

Guillaume Melquiond said:

Can we have an actual estimation of what "many people" means here? For now, I have seen only three complaints, but maybe I missed some.

For what we know now : in Coq's CI, about one third of the tested projects required explicit disabling of native (so two third compile with native). It is likely that projects tested in the CI are larger than average so the figure in the wild could be much smaller.

view this post on Zulip Christian Doczkal (Dec 09 2020 at 20:18):

Does that mean that when I set -native-compiler ondemand and people want to use native_compute on things from those parts of the library that do work with the the native compiler, they can still do so?

view this post on Zulip Pierre Roux (Dec 09 2020 at 20:23):

nope, the difference between no and ondemand is only for you

view this post on Zulip Pierre Roux (Dec 09 2020 at 20:25):

(if by "people" you mean "user of your library")

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 00:35):

FYI, the difference between coqc -native-compiler no and coqc -native-compiler ondemand is summarized in this array taken from the CEP.

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 00:38):

so to sum up, if a library does not support (in a coq-native switch) a whole native precompiling:

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 02:16):

native-compiler's extra expressivity is likely to be only useful in limited scenarios

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 02:18):

and I was also a bit surprised when coq-native remained enabled on the Docker images... I was honestly hoping that now we'd get _flambda_ enabled instead, giving everybody 15-20% speedups

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 02:19):

(probably that'd need discussion too for potential regressions)

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 02:20):

I'm happy enough building my own images with my own opam package, but I'm sure many Coq users prefer otherwise.

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 02:29):

https://coq.zulipchat.com/#narrow/stream/237656-Coq-devs.20.26.20plugin.20devs/topic/recent.20changes.20to.20the.20native.20compiler.3F/near/219335930

standard Coq feature

@Théo Zimmermann I respect you deeply but I happen to disagree; native-compute should be labeled as experimental. ondemand has extremely limited use-cases, and was IMHO never a good default; native-compute=yes has specific use-cases and the recent changes should make them easier to experiment with. native-compute=no works.

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 02:31):

and to be clear, I'm interested in anything that can help performance, including native-compute! And I do hope more people will now be able to try it out!

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 08:23):

Paolo Giarrusso said:

and I was also a bit surprised when coq-native remained enabled on the Docker images... I was honestly hoping that now we'd get _flambda_ enabled instead, giving everybody 15-20% speedups

All the Docker images that support flambda are indeed (already) flambda enabled and the default OCaml version is soon going to switch to 4.07, which is one of those versions.

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 08:32):

@Erik Martin-Dorel It seems to me that we have received enough blacklash from users already to backtrack on the initial choice to inculde coq-native in the default image starting with 8.13

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 08:33):

It was a bit unfortunate that this was not discussed in the initial CEP as well with the "experimental status" of native-compute, but nobody thought about it at the time.

view this post on Zulip Karl Palmskog (Dec 10 2020 at 09:26):

I also think @Erik Martin-Dorel should consider adding a post-installation warning of some kind for the coq-native opam package, like "Experimental: compilation of Coq files may now fail unexpectedly."

Ideally, something like the following as well: "Coq file compilation failures after installation of this package should be reported to Coq developers, not to Coq project maintainers"

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 09:36):

Well, they should actually be reported to the opam-coq-archive so that native-compute can be disabled by default to those packages.

view this post on Zulip Karl Palmskog (Dec 10 2020 at 09:42):

this to me sounds like you're saying opam archive maintainers should shoulder the burden of fixing problems with native-compute. How are we supposed to do this if they are not using coq_makefile? There is no guarantee that just because we have a package, we can control arbitrary options sent to coqc.

view this post on Zulip Enrico Tassi (Dec 10 2020 at 09:51):

Can't we have 1 more image (for one compiler with has no known shortcoming with native_compute) with coq-native and have it off in all other images (to cause no breakage)?

view this post on Zulip Enrico Tassi (Dec 10 2020 at 09:53):

If we want to lint any set of packages w.r.t. a feature, I think it would be more polite to do it ourselves, running jobs with coq-native installed in our CI, not the user's one.

view this post on Zulip Karl Palmskog (Dec 10 2020 at 09:58):

what I would like to be the case w.r.t. the coq opam archive: a maintainer of a package should have to explicitly opt-in to have his package be built with native-compute. If he doesn't do this, native-compute should not be possible to use with the package for regular opam users (expert users could still override)

view this post on Zulip Karl Palmskog (Dec 10 2020 at 09:58):

this to me is the the only reasonable way to handle a feature that can arbitrarily break compilation/build

view this post on Zulip Enrico Tassi (Dec 10 2020 at 10:08):

I think @Théo Zimmermann still has a point. If a user installs both A and B, and B requires coq-native, then A is recompiled, and it may be the case that A breaks with coq-native around. It is may hence be nice to explicitly turn off native in A's package, even if A is.

We can add a job to the opam archive to test if the package happens to break with native (even if it does not require it). @Karl Palmskog if you agree I can try to add this job. If the package uses coq_makefile, a fix in _CoqProjects may be sufficient. At worse, one can declare that the package conflicts with coq-native (this works for any build system).

view this post on Zulip Karl Palmskog (Dec 10 2020 at 10:09):

@Enrico Tassi it's fine if you add an opam archive job with native-compute if this job does not affect the green/red status of CI

view this post on Zulip Enrico Tassi (Dec 10 2020 at 10:14):

With just my hands, I can't. But maybe @Théo Zimmermann maybe can help.

I can make the job "allowed to fail", but then it would be unnoticed, unless a bot posts a message in the discussion informing about the thing and suggesting the PR author what he can do. You would still see the green light, and could decide to merge even if the package is not "clean" w.r.t. native.

Maybe there are better ways.

view this post on Zulip Karl Palmskog (Dec 10 2020 at 10:26):

let's say that 1/3 of projects break with native-compile, this means we would have CI red 1/3 of the time when it should be green, which almost makes the green/red signal meaningless. I think this would be a big loss.

view this post on Zulip Enrico Tassi (Dec 10 2020 at 10:34):

if the job is "allowed to fail" then it does not show up in the CI box and does not contribute to "being red". It is run, and ignored.
the bot could still do something useful out of its result.

view this post on Zulip Enrico Tassi (Dec 10 2020 at 10:35):

I was proposing the bot to do something like an automatic review/comment based on the result of that specific job.

view this post on Zulip Enrico Tassi (Dec 10 2020 at 10:36):

Or even propose a change. But I don't know at which extent this can be done, Theo knows.

view this post on Zulip Enrico Tassi (Dec 10 2020 at 10:40):

Apparently there are API to put a review comment on a line (eg the bottom of the opam file). The bot could suggest to add conflicts: [ "coq-native" ] when that job fails. WDYT?

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 10:46):

Karl Palmskog said:

what I would like to be the case w.r.t. the coq opam archive: a maintainer of a package should have to explicitly opt-in to have his package be built with native-compute. If he doesn't do this, native-compute should not be possible to use with the package for regular opam users (expert users could still override)

If this was the objective, then the changes to Coq were not needed and the CEP could have been implemented with the strategy proposed for older versions of Coq. It's pretty sad that this position was not expressed during the CEP discussion.

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 10:47):

@Enrico Tassi You can already add allow failure jobs. They will be reported with the neutral check (gray square). We are using them (and it works well) for async jobs on the main repo but it is also supported in the opam-coq-archive repo.

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 10:48):

I can implement whatever you want in addition (such as suggested changes) but I need a clear feature request for that (tell me exactly what you want).

view this post on Zulip Enrico Tassi (Dec 10 2020 at 10:49):

great, I will do that then.
if you give me some hints, I may give a try to patching coqbot. It's like the message for jason about the failure job, but it should be reported as review, rather than as a regular comment. I'd like to test if one can suggest "a patch".

view this post on Zulip Karl Palmskog (Dec 10 2020 at 10:58):

@Théo Zimmermann can you elaborate on where the coq opam archive is mentioned in the CEP as the place (dumping ground) to handle native compilation issues by users?

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 11:00):

Nowhere, but it is written in it that native-compute will become available for all packages without any action from the maintainer, and now you are saying it shouldn't be the case. It would have saved much trouble if you had expressed this opinion earlier.

view this post on Zulip Karl Palmskog (Dec 10 2020 at 11:09):

I really don't care if native-compute becomes somehow available (it would have been available with the Coq opam compiler apprach as well). What I care about is who is going to take the fall and apply fixes when things arbitrarily fail. For example, it only now became apparent that 1/3 of the Coq CI doesn't support native-compute (and fails with an inscrutable error message like "stack overflow"). One would think this had been pertinent information when discussing the CEP

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 11:12):

Disabling native-compute is supposed to be an easy step. Patching the opam archive to do it when we know a package fails is just a way of sharing effort between native-compute users. Just because you are a maintainer of the archive and you personally don't want to have to care about it doesn't mean that it's not useful or worthwhile.

view this post on Zulip Karl Palmskog (Dec 10 2020 at 11:16):

but if it is worthwhile, where is the evidence? in this thread, we have had maybe 3-4 projects report some anecdotal benefit, and we have the 1/3 failed figure from Coq CI. Given this state of knowledge, I would still recommend everyone who is a Coq project maintainer to turn it off locally in their projects

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 11:20):

I'm not a user of native-compute myself, so I cannot answer this, but it looks to me like there are a number of users in this thread that have repeatedly insisted on how important this feature is. (And they can probably point out to actual evidence of this.)
That it is not working well on all projects is certainly in part due to inherent limitations but also to the lack of sufficient testing so far.

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 11:21):

Finally, why again would you still recommend every maintainer to disable it by default since the whole point of the CEP was to make it available by default?! If the objective was different, then it could have been achieved differently!

view this post on Zulip Karl Palmskog (Dec 10 2020 at 11:25):

the recommendation would be based on:

I think it's very strange that Coq project maintainers would be expected to welcome features that do not come with any clear upside, and a large possible downside.

view this post on Zulip Karl Palmskog (Dec 10 2020 at 11:35):

but with that said, this recommendation is based on the current state of knowledge, and so could change if benefits are actually shown in benchmarks, and there is no flood of error reports

view this post on Zulip Karl Palmskog (Dec 10 2020 at 11:45):

but just to be clear, I'm not really interested in seeing the exact benefits for one or two specific projects, I'm interested in seeing figures for many (100+?) relevant projects, i.e., averages.

view this post on Zulip Pierre-Marie Pédrot (Dec 10 2020 at 11:47):

Given that native_compute is a tactic you have to use explicitly, I don't see how you would magically see anything for a random project.

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 12:01):

It seems that you are completely confused because you expect benefits on average like with flambda when this has nothing to do with this.

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 12:02):

The benefits are only for the users who wish to rely on native-compute (and will thus install coq-native in their switch).

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 12:03):

There are certainly no expected benefits on the build speed of average projects. On the contrary, it makes compilation a bit longer!

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 12:04):

But nobody suffers from this if they do not install coq-native!

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 12:04):

That was the whole point of the CEP: let the user decide!

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 12:04):

So yes, maybe (probably) it was a mistake to install coq-native by default in the Docker images.

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 12:05):

But this is beyond the current discussion which has shifted to a complete utter nonsense.

view this post on Zulip Pierre-Marie Pédrot (Dec 10 2020 at 12:05):

Well, let's see the good side: at least now many people (including us, coq devs) are aware of the breakage introduced by native compilation.

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 12:05):

Yes, that was one of my point above (until now, it was not sufficiently tested).

view this post on Zulip Karl Palmskog (Dec 10 2020 at 12:20):

I didn't confuse anything. We are talking about why or why not I recommend people to use certain options. If the benefits of including an option outweigh its costs, I recommend it. In this case, I judge that adding -arg -native-compiler -arg no to _CoqProject should be recommended, since it means you will not ever have to pay the costs of having failing native-compute builds, and your users will not complain about stack overflows when installing your projects.

view this post on Zulip Karl Palmskog (Dec 10 2020 at 12:24):

most users will not be able to tell what coq-native means, so they may install it by accident or use a Docker image where it is pre-installed (and I guess Nix Coq builds will have it as well). The option I recommend completely rules out any problems from this.

view this post on Zulip Pierre Roux (Dec 10 2020 at 12:25):

Karl Palmskog said:

we have had maybe 3-4 projects report some anecdotal benefit

Don't forget that up to now native_compute was mostly unused because it was terribly inpractical to use (basically, everything had to be recompiled manually with specific options). So you cannot conclude to the uselessness of native_compute from that observation.

view this post on Zulip Karl Palmskog (Dec 10 2020 at 12:30):

@Pierre Roux this is not about concluding the uselessness of coq-native. It's about the situation for regular Coq project maintainers - I'm arguing that for most of them the new coq-native infrastructure will be a tax imposed from above without any personal benefit. Most will have no idea what coq-native is all about, yet they could see their projects fail in Docker and users claim the project fails compilation. The _CoqProject option will allow them to avoid all these potential problems.

view this post on Zulip Pierre Roux (Dec 10 2020 at 12:34):

But then, they may experience complains from coq-native users that they disabled it, possibly for no reason. There seems to be no perfect solution.

view this post on Zulip Karl Palmskog (Dec 10 2020 at 12:36):

it seems to me that coq-native users are generally savvy and highly sophisticated and can figure out workarounds, so it seems far more likely that user complaints will come about coq-native failures rather than absence of coq-native.

view this post on Zulip Karl Palmskog (Dec 10 2020 at 12:37):

I agree that this is in the end a question of: whose interests matter more?

view this post on Zulip Pierre Roux (Dec 10 2020 at 12:39):

You may also consider that, looking at Coq CI, the projects that seems to fail with native compute are not the random proof project but projects doing complicated things, with for instance very large non opaque terms, those users woul probably also know what they are doing and know at least a bit about native compute. Again, the average user doing regular proofs is very likely not to be impacted (but its proof could be of use for others).

view this post on Zulip Karl Palmskog (Dec 10 2020 at 12:44):

on average, it seems unlikely that the "transitive coq-native compilation benefit" will be realized for a project. The opam coq archive dependency graph is quite shallow - not many projects are used as dependencies. Moreover, I think this is like saying you should take one for the team - I should allow this failure vector so someone else I don't know about will gain.

view this post on Zulip Pierre Roux (Dec 10 2020 at 12:47):

We already had examples so it is definitely possible, how likely is impossible to know today, as already said. "unlikely" is only your personal opinion.

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 12:57):

Hi all, I've just seen your comments − I guess the usefulness of the coq-native packaging was discussed in detail in the CEP, and for both use cases:

But after installing the package by default in the Docker packaging, I wasn't expecting to get as much backlash, following these failures with the Coq.8.13+beta1 or coq.dev builds… Anyway I guess what suggested Théo and Enrico looks a very sensible trade-off, very easy to setup and to document:

So I plan to start rebuild these images ≥ 8.13 from now on, but could you please comment on which strategy you'd prefer for older images?
(A) do the same for older versions (8.7 ≤ Coq ≤ 8.12) as for Coq 8.13+ (also leading to five different images, one with OCaml 4.07.1 + coq-native (albeit the "automatic propagation of the -native-compiler yes flag" especially concerns Coq 8.13+))
(B) keep the current four images as is, i.e., with coq-native, with the stdlib native-precompiled only, but allowing to native-precompile selected libraries later on, when the item 3 of the CEP is gradually taken into account.

view this post on Zulip Karl Palmskog (Dec 10 2020 at 13:00):

the most depended-on package in the opam archive is coq-mathcomp-ssreflect. Yet, it is only depended on by 32 packages. coq-ext-lib, a general library, is only used by 6 other packages. The upper bound on number of dependencies (and thus potential for transitive coq-native benefits) look terrible compared to the Java ecosystem, where packages regularly depend on 40+ packages (see this paper: https://www.cs.cornell.edu/~legunsen/pubs/GyoriETAL18EcosystemRTS.pdf)

view this post on Zulip Enrico Tassi (Dec 10 2020 at 13:00):

I think A is safer.

view this post on Zulip Karl Palmskog (Dec 10 2020 at 13:03):

I would also vote for (A)

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 13:22):

+1 for (A): I had no crystal ball, but I also expected native-compute to be 100% opt-in. native-compute support for package foo should probably be a feature request from users of foo who are actually interested.

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 13:30):

Paolo Giarrusso said:

+1 for (A): I had no crystal ball, but I also expected native-compute to be 100% opt-in. native-compute support for package foo should probably be a feature request from users of foo who are actually interested.

Really? Then once again, the part about changing the default value of the coqc -native-compiler option in Coq would not have been necessary. You are the second participant to this CEP to express this opinion now.

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 13:30):

BTW, option (B) is the most backward compatible for older Coq versions: since the stdlib used to be pre-compiled.

view this post on Zulip Enrico Tassi (Dec 10 2020 at 13:35):

How can it be less backward compatible? If you have job for 8.9, then A says that that job will continue to download the same image as before, while B does not imply that. IIUC

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 13:37):

Théo is right regarding backward compatibility (apart from macOS users). See the opam-repository PR that implements the CEP.
I will repost the URL.

view this post on Zulip Enrico Tassi (Dec 10 2020 at 13:38):

OK, maybe I'm lost, no problem. Then I vote for "the option that has the least chance of breaking things for users not opting in".

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 13:48):

Here is the link: https://github.com/ocaml/opam-repository/pull/17690/files#diff-655d9f2dcb2ffb68e9326406cdc3a01a43ba35d4e496a87f624f2063e61ab7dd

you can see that beforehand for Coq 8.12.0, we had this line:

    "-native-compiler" {os = "macos"} "no" {os = "macos"}

so beforehand, native-compiler was unconditionally disabled for macos (which was maybe a too abrupt choice BTW, but this is another story),
and enabled otherwise (just for the stdlib for Coq < 8.13).

now:

     "-native-compiler" "yes" {coq-native:installed} "no" {!coq-native:installed}

so if coq-native is not installed, the stdlib won't be precompiled for Coq < 8.13.
Thus strategy (A) in my previous comment would indeed lead to a non-backward-compatible change for the Docker images of these old Coq versions.

Personally, I'm very OK with both strategies (A) and (B).
But maybe if we have not a very strong opinion on this last choice, we could also defer this choice for, say, one week (keeping the strategy (B) for the moment and only remove coq-native from Coq 8.13+), then do the announcement that I was planning for the single-switch Docker migration, and maybe applying (A) next week depending on the feedback we might get in the meantime? WDYT?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 13:49):

Non-locality of the depext setup was indeed discussed in the CEP, and it is a problem. Unfortunately if you disable native in a package locally you get a broken switch in the moment some other package depends on it. The only solution I can see with opam is to actually add a conflicts: coq-native constraint.

Note that all the breakage we are seeing is due to either a) lack of memory or b) small default stack size, [which can be fixed with ulimit], so indeed I would expect coq-native users to bump ulimit by quite a bit if they want to use it. I'd conjecture that for a large enough ulimit the whole archive should compile.

Note that native + flambda will give nice speedups but will eat even more stack, so IMHO it is reasonable to keep that in mind. coqc tries to workaround this by unconditionally passing -O classic to OCaml, this effectively removes many benefits from flambda but provides a safer setup. Super-specialized users would like to tweak Coq so -O 3 is passed instead, as they could likely reap on the benefits.

By the way flambda provides speedups on the range of 2%-18% perfect, just to mention that not everyone will greatly benefit from it, but indeed quite a few users see a speedup.

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 13:51):

Hi Emilio, just to be sure, could you elaborate a bit on what you mean by "Non-locality of the depext setup" ?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 13:58):

Erik Martin-Dorel said:

Hi Emilio, just to be sure, could you elaborate a bit on what you mean by "Non-locality of the depext setup" ?

Non-locality means that a package A may pull coq-native , forcing recompilation of unrelated package B with different options that B is not aware.

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:02):

OK thanks! (actually I guess you meant the "non-locality of the depopts setup")

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:02):

Anyways I'd be pragmatic here as I think we'll have a better setup for 8.14; as @Erik Martin-Dorel is discovering touching anything related to build setup is a hairy issue as I've lived on my on flesh :) sending courage as this is hard and exhausting work :)

Thus we could:

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:02):

Erik Martin-Dorel said:

OK thanks! (actually I guess you meant the "non-locality of the depopts setup")

Oh yes, sorry :/

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:04):

This way coq-native seems pretty opt-in as not to cause a lot of harm in non-expert users, and we would have a good understanding for the future setup with split packages: basically, packages that have the conflicts would not be a candidate to provide coq-pkg-native in the split setup.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:04):

The split setup would signal a bit better what packages cannot be compiled yet with native.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:04):

Note that I have submitted https://github.com/ocaml/dune/issues/4020 which will make this process automatic for packages using dune

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:05):

but if the split-native approach is ready for 8.14, note anyway that the coq-opam-archive packaging maintenance overhead that is induced by that approach should be solved beforehand…

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:05):

coq_makefile should not be a lot of problem to handle, but indeed the split setup will still have some hurdle as maintainers will have to enable support explicitly, but I much prefer that as IMHO it is reasonable that package maintainers do test native before enabling it in their packages.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:06):

Erik Martin-Dorel said:

but if the split-native approach is ready for 8.14, note anyway that the coq-opam-archive packaging maintenance overhead that is induced by that approach should be solved beforehand…

Not sure I fully understand, please feel free to correct me. Ideally, the migration to a split setup would be opt-in

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:07):

in the sense that before coq-native , after you implement the split setup, no package will provide a native setup

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:07):

but coq

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:07):

with coq-stdlib-native

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:07):

then you could update packages as you are able to test them

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:07):

in a one-by-one basis

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:08):

the thing is that indeed I think that automatically having all the packages compile with native seems a bit too risky for me

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:08):

in the sense that if you do that, you are not testing them beforehand

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:08):

I think a slower but less risky setup is to enable packages as you test them

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:09):

but indeed how to make the life of packagers easier I dunno, for dune that will be automatic, but other people with opam, I guess they should update from a template? [That's how it works today right?]

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:09):

I also think that the split-native approach is very nice, but from the opam-coq-archive perspective, this adds many packages to maintain!
(Karl already mentioned that tweaking the existing opam specs for non-native-compatible packages might be a concern, so it will be a yet more complex task if all native-compatible packages have to be duplicated for the split-native approach!)
This could be solved IMHO just if the generation of the coq-lib-native packages can be automated.

view this post on Zulip Enrico Tassi (Dec 10 2020 at 14:09):

I'm trying to add a job to the opam archive so that we can test native when people submit packages https://github.com/coq/opam-coq-archive/pull/1504
can someone remind me of a package that fails? (to test the infrastructure)

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:11):

Erik Martin-Dorel said:

I also think that the split-native approach is very nice, but from the opam-coq-archive perspective, this adds many packages to maintain!
This could be solved IMO just if the generation of the coq-lib-native packages can be automated.

Yes indeed, it should be automated. I don't think it would be very hard to update the opam templates to do so, but I am not very familiar. For the dune-based packages they'll know what to do.

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:13):

Thanks Enrico!
Enrico Tassi said:

I'm trying to add a job to the opam archive so that we can test native when people submit packages https://github.com/coq/opam-coq-archive/pull/1504
can someone remind me of a package that fails? (to test the infrastructure)

I guess there is one of the packages mentioned by @Christian Doczkal?
(but I've just seen you already ping'ed him on GitHub :)

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:20):

@Emilio Jesús Gallego Arias Regarding the flambda topic and to follow-up a remark by @Paolo Giarrusso, it is true that it's only partly supported in docker-coq (and there is an opened issue here: https://github.com/coq-community/docker-coq/issues/10): we could indeed pass the dedicated option -flambda-opts '-O3 -unbox-closures' to coq's configure,
but I believe setting these options is not a specific issue to docker-coq but to the opam packaging, and could be promoted as a default… (as on non-flambda swiches, this option would be a no-op!)

Actually note that I had first proposed to do the two changes (coq-native and flambda-opts) in one go in the following PR:
https://github.com/ocaml/opam-repository/pull/16887 (now-closed); but I decided to look after these two enhancements one at a time, all the more as Théo had suggested to open the coq-native CEP to foster the discussion…

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:23):

So regarding -flambda-opts do you confirm @Emilio Jesús Gallego Arias that proposing another PR in opam-repository soon doing this, would be the way to go?
cf. https://github.com/ocaml/opam-repository/pull/16887/files#diff-655d9f2dcb2ffb68e9326406cdc3a01a43ba35d4e496a87f624f2063e61ab7dd
finally I know that @Paolo Giarrusso and you have done many benchmarks on "with or without flambda"; but would you advise me to have a look in a particular place if you gathered some of your benchmark infrastructure on that flambda topic, in a given repo? anyway, getting as Emilio mentioned, a speedup of "up to 18%" in the default setting would be great…

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 14:26):

@Erik Martin-Dorel thanks for reminding me; enabling flambda in opam packages was indeed the plan, which the coq-native change now unblocks.

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 14:27):

FWIW we’ve been saying no to native and yes to flambda for around 6 months throughout our company and CI with 4.07.1-flambda

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 14:28):

(with limited deps, since we mostly use iris and template-coq)

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:30):

These flags are already the default on Coq dune builds, and indeed make sense to pass always, I believed it was the case actually.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:30):

Note that these flags are for compilation of Coq itself, and they have no relation to OCaml flags used in native-compute

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:30):

speedup comes from less allocation in the proof engine I think mostly

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:31):

so flambda is a kind of mechanized @Pierre-Marie Pédrot

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:31):

But yeah, flambda is a big pain to use, and with native we have a bit of the same setup now so we will find the same problems as noted in the CEP

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:32):

yes Emilio, I know that the flambda flags are orthogonal to native-compute (modulo the issue you recall in your last comment); my only question was about the need to update the coq packaging in opam-repository (which does not rely yet on dune)

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:33):

I thought that was already done

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:33):

not as far as I know

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 14:33):

No, because despite "no relation", flambda + native is measurably slower than the default for some reason.

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 14:34):

We've been working around the relevant issues for ~one year, the goal here is to work around them better :-)

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:35):

yes, but what would be a good benchmark suite for that in your opinion?

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 14:35):

Anyway, Emilio is for the change, and my concerns are addressed

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:36):

because if we focus on native_compute, you know there are two times that can be "benched" (the compilation time and the running time)

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 14:36):

ah; I have some benchmark data on github coq issues, I can dig into it again if we adjusted the question.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:36):

Paolo Giarrusso said:

No, because despite "no relation", flambda + native is measurably slower than the default for some reason.

Oh indeed, yeah that's the problem with flambda's all-or-nothing approach

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:36):

Yeah even if we pass -O classic , flambda is slower compiling OCaml files

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:37):

for compiling Coq that is not problem, but it is a problem if you have native

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 14:37):

Good point Erik, my benchmark were just about "I never write native_compute, why do I pay for it"

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 14:38):

(or rather, I just enabled flambda, got slowdowns, and investigated. That's the yak we're still shaving)

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:38):

yeah , I always disable native except when building binaries for distributions

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:38):

or of course if I was doing some native_compute proofs, and Dune will explicitly disable it when in dev mode

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:40):

So given the current state of the discussion, I guess we could just do this in the coq opam-packaging?

pass the -flambda-opts "-O3 -unbox-closures" option only if OCaml >= 4.07 and coq-native is not installed…

(having ideally up-to-date benchmarks for Coq 8.13+ and Coq < 8.13 that confim again this is the way to go :)

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:41):

I'd directly conflict with anything less than 4.07

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:41):

Oh but that can't be done, sorry, never mind

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:42):

no indeed, the conditions don't help I think

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:42):

Coq should conflict with flambda < 4.07

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:42):

this is ocaml-variants.*

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:42):

even if you don't pass the options, you'll have trouble

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:43):

if native is enabled

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:43):

when < 4.07

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 14:43):

so coq-native should conflict with that right?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:43):

in >= 4.07 there is no problem with coq-native

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:43):

other than you pay the price

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:43):

a bit more price in my experience

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:43):

Emilio Jesús Gallego Arias said:

Coq should conflict with flambda < 4.07

I see what you mean. That's also the reason why there is no ocaml < 4.07 with flambda in docker-coq

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:43):

yup, actually the conflict should be of coq-native with flabmda variants < 4.07

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:44):

in opam

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:44):

other than that, all the configs are supported

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:44):

but indeed you pay for what you use

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:44):

the cost of coq-native + flambda is quite small when compared to the cost of coq-native alone

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:45):

but still not insignificant, in particular there are some problems with flambda middle-end that impact Coq

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:45):

and unfortunately as this is a compile-time option, you are stuck with ti

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:45):

with it

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 14:45):

and either one is still a cost you don’t want as default, right?

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 14:45):

@Erik Martin-Dorel in any case, this is the other reason we’d want +flambda-native images as default: best performance

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:45):

well, in my build setup, I always call coqc with coqc -native off

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:45):

except when releasing

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:46):

so I don't care too much, it is like dune will only build OCaml byte objects when you call @check

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:46):

Paolo Giarrusso said:

Erik Martin-Dorel in any case, this is the other reason we’d want +flambda-native images as default: best performance

I know, currently the default is 4.05.0 ...

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 14:46):

as the default outside 4.05

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:46):

Yes.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:46):

regarding opam indeed you have no control over what compiler the user installs, so coq-native should conflict with broken compilers

view this post on Zulip Enrico Tassi (Dec 10 2020 at 14:47):

Enrico Tassi said:

I'm trying to add a job to the opam archive so that we can test native when people submit packages https://github.com/coq/opam-coq-archive/pull/1504
can someone remind me of a package that fails? (to test the infrastructure)

I looked at the logs of coq-graph-theory: it fails with 4.05, but with 4.07 it apparently works, so the failure is not (only) related to coq-native.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:47):

with coq makefile you could hack this by having a make dev call, that would explictly add -native off

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:47):

and having make do the full build

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:48):

@Enrico Tassi can you paste the link to the log please?

view this post on Zulip Enrico Tassi (Dec 10 2020 at 14:48):

This is the link given by @Christian Doczkal yesterday https://github.com/coq-community/graph-theory/pull/12/checks?check_run_id=1530975359

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:49):

Grazie

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:49):

That's a low ulimit

view this post on Zulip Enrico Tassi (Dec 10 2020 at 14:49):

Question: is there a package known to fail with coq-native on 4.07 ?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:49):

@Enrico Tassi have a look at Coq's gitlab file, we already bump ulimit there otherwise flambda wouldn't work

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:50):

@Enrico Tassi failure is relative here as it usually comes from a) not enough memory b) low ulimit

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:50):

@Emilio Jesús Gallego Arias to sum up what you said, we should mark coq-native conflicts with +flambda < 4.07 (but keep it available for 4.05 I guess)

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:50):

so depending on your worker setup you can get different results

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:50):

I do recall seeing a true native-compiler bug somewhere

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:50):

but that's rare

view this post on Zulip Enrico Tassi (Dec 10 2020 at 14:50):

This may be a suggestion for the docker images or the actions, we can't ask users of coq community templates to bump the stack

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:51):

@Erik Martin-Dorel yes, in principle it should be available in 4.05, however if you ask me, and based on my memories of discussions about OCaml bugs, I'd just require 4.07.1 for coq-native as the minimum version

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:51):

there was this flambda bug, but there were other problems IIRC

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:52):

@Enrico Tassi I meant for the CI

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:52):

IIRC the ocaml bug you had mentioned dealt with 4.06+flambda

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:52):

stack is a big problem for users, as even different distros carry different defaults :S

view this post on Zulip Enrico Tassi (Dec 10 2020 at 14:52):

Maybe we could be "bold" and say that Coq 8.13 recommends 4.07? (and drop 4.05 images, and co)

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:52):

@Erik Martin-Dorel there were more IIRC

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:52):

4.05 images without coq native could be very useful to test against lower bounds

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:53):

and even with coq-native

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:53):

in principle it should work but I seem to recall there was some other problems

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:53):

it is basically untested

view this post on Zulip Enrico Tassi (Dec 10 2020 at 14:53):

have the configure script accept 4.05..4.07, but warn, and be happy with >= 4.07 ?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:53):

so you would be the beta-tester for 4.05 + native [except for the stdlib]

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:53):

no idea what you will find

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:54):

but if we disable coq-native for 4.05.0, we won't be able to have native_compute for coq 8.5 and Coq 8.6 anymore:

* Coq dev: 4.05, 4.07-flambda, 4.10-flambda, 4.11-flambda
* Coq 8.13: 4.05, 4.07-flambda, 4.10-flambda, 4.11-flambda
* Coq 8.12: 4.05, 4.07-flambda, 4.10-flambda, 4.11-flambda
* Coq 8.11: 4.05, 4.07-flambda, 4.10-flambda, 4.11-flambda
* Coq 8.10: 4.05, 4.07-flambda, 4.08-flambda, 4.09-flambda
* Coq 8.9: 4.05, 4.07-flambda, 4.08-flambda, 4.09-flambda
* Coq 8.8: 4.05, 4.07-flambda, 4.08-flambda, 4.09-flambda
* Coq 8.7: 4.05, 4.07-flambda, 4.08-flambda, 4.09-flambda
* Coq 8.6: 4.02
* Coq 8.5: 4.02
* Coq 8.4: 4.02

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:54):

(this array was taken from https://github.com/coq-community/docker-coq/wiki)

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:54):

Yeah if you want to support that versions it's fine, just be aware that you will be testing previously untested stuff

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:54):

so you could find no problems, or a lot of them ;)

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:54):

I don't think so: as said previously, Coq 8.5 and Coq 8.6 were already precompiled with native

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:54):

only the stdlib

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:55):

that's a very small test

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 14:55):

I know, but this is precisely the question @Théo Zimmermann raised, regarding backward compatibility

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:55):

Anyways we should put some pressure to OCaml devs so they fix the bugs that are making OCaml work bad in our use case, or fix our compilation scheme :)

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:56):

That would make our life much simpler :)

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:56):

Actually fixing these stack overflows in the OCaml compiler is tedious, but not so complex

view this post on Zulip Paolo Giarrusso (Dec 10 2020 at 14:56):

(a) is 4.07.0 tested? Emilio always says 4.07.1 (b) how much should we be evolving 8.5 and 8.6?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:57):

4.07 was tested by me more extensively

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:57):

indeed by patching coq to always generate native files

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:57):

I saw huge memory footrprints here and there

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:57):

but things seemed to work

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 14:57):

provided I increased stack + many GiB of RAM

view this post on Zulip Enrico Tassi (Dec 10 2020 at 15:03):

I added to the next Coq Call the item: "Coq 8.13 require 4.07". From this discussion, we don't lose much by dropping 4.05.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:03):

What do we gain tho?

view this post on Zulip Enrico Tassi (Dec 10 2020 at 15:04):

this thread would not have existed, for example .

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:04):

Umm, I think this thread has little to do with it.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:04):

Note that buster is still in 4.05.0

view this post on Zulip Enrico Tassi (Dec 10 2020 at 15:04):

The problem by @Christian Doczkal is that the default image uses 4.05 + native. If you are on 4.07 his package works even with native

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 15:05):

So to put all things together, regarding the docker-coq flambda/native packaging, let me recap the plan:

  1. I'll remove now the coq-native package from all four versions coqorg/coq:8.13 and coqorg/coq:dev images
  2. Add a fifth image with coq-native for {Coq 8.13, Coq dev} × {OCaml 4.07-flambda}
  3. After the end of the single-switch migration phase, the "default" version will be 4.07.1+flambda without coq-native
  4. Don't do the same as 1. and 2. for all other Coq versions now, because it would break backward compatibility without any announcement,
    but maybe do it in one week;

  5. Submit a PR to mark coq-native conflicts with flambda < 4.07;

  6. Later on, we may submit a PR to enable -flambda-opts "-O3 -unbox-closures" as soon as the switch is a flambda variant
    (BTW Emilio how would you quantify the impact of 6, w.r.t. a "mere" 4.07.1+flambda switch without these two options?)

  7. (post edited; this item added) Add ulimit -a in the default CI code for informative purpose.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:05):

That's a packaging problem, but it has little to do with 4.05

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:05):

but with enabling native

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:05):

you could use 4.07 and Christian problem would have been the exact same

view this post on Zulip Enrico Tassi (Dec 10 2020 at 15:05):

on the wrong compile

view this post on Zulip Enrico Tassi (Dec 10 2020 at 15:05):

I'm testing it and it works with 4.07 and coq-native.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:05):

if Docker maintainers chose to use a untested setup as the default, what can we do, 4.07.1 is only tested with native in my own workstation, so that's not a so great improvement

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:06):

I'm testing it and it works with 4.07 and coq-native.

It may work, it may not, for example it could work on 4.07 and fail in 4.08 as the compiler did regress on some internal functions AFAICT

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:07):

stack usage is different indeed, but not guaranteed

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:07):

I mean OCaml's stack use

view this post on Zulip Enrico Tassi (Dec 10 2020 at 15:07):

OK, I agree we should test things. But we should decide on what. I'm adding a CI job on the opam archive so that we can detect packages that don't compile with coq-native. Which compiler should I pick?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:07):

Good question @Enrico Tassi , that's a hard problem

view this post on Zulip Enrico Tassi (Dec 10 2020 at 15:07):

Consequently docker image will use what is tested.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:07):

as far as I know, stack use of OCaml among versions is not stable

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:08):

there are a few bugs open upstream

view this post on Zulip Enrico Tassi (Dec 10 2020 at 15:08):

so, to my understanding, we know that 4.05 has many problems. Hence my proposal to ditch it.

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 15:08):

Emilio Jesús Gallego Arias said:

if Docker maintainers chose to use a untested setup as the default, what can we do, 4.07.1 is only tested with native in my own workstation, so that's not a so great improvement

can you be more specific? was it related to my latest comment recapitulating the plan?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:08):

nope @Erik Martin-Dorel , it was an ansewr to enrico

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 15:08):

Enrico Tassi said:

The problem by Christian Doczkal is that the default image uses 4.05 + native. If you are on 4.07 his package works even with native

can you show also the URL of the log that succeeds with 4.07 ?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:09):

@Enrico Tassi for example 4.08.0 may have [and indeed I think it does] the same problem of excessive memory use

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:09):

should we ditch it?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:09):

memory / stack use for compilation among OCaml versions varies a lot

view this post on Zulip Enrico Tassi (Dec 10 2020 at 15:09):

https://gitlab.com/coq/opam-coq-archive/-/jobs/902857143

view this post on Zulip Enrico Tassi (Dec 10 2020 at 15:09):

(but it is not using docker)

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:10):

even the workers you get may vary on this, making CI unpredicatable

view this post on Zulip Enrico Tassi (Dec 10 2020 at 15:10):

Emilio Jesús Gallego Arias said:

should we ditch it?

yes

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 15:10):

OK thanks Enrico

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:10):

and OCaml upstream answer was "we don't care, that's not an use case we support"

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:11):

Enrico Tassi said:

Emilio Jesús Gallego Arias said:

should we ditch it?

yes

Seems a bit weak criteria, in the sense that OCaml version A requires M amount of memory to compile a quite random file, then OCaml version B require M+1 memory to compile that random file, we ditch A ?

What happens if for example version B is more efficient in some other file.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:12):

What if my distro has a more restricted stack size?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:13):

My view is simple, and in agreement with OCaml devs position on this one: if you wanna use native, get ready to increase memory limits and total memory .

or.... patch the OCaml compiler to fix these memory issues.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:13):

I can easily construct a file that will fail in 4.07 too

view this post on Zulip Enrico Tassi (Dec 10 2020 at 15:16):

This is not my point. There are Coq users that don't care which version of OCaml they use, they just want Coq to work. For them, we should chose a good version, and test that version as much as we can. If you wan to compile Coq with a version of OCaml of your choice you can, but in the docker images we provide for testing, and in the binary installers we provide, and in the installation scripts we provide (the platform) we should pick sensible versions.

view this post on Zulip Karl Palmskog (Dec 10 2020 at 15:17):

one gain of requiring >= 4.07: extraction can stop emitting code that is deprecated (https://github.com/coq/coq/issues/11359)

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:17):

Enrico Tassi said:

This is not my point. There are Coq users that don't care which version of OCaml they use, they just want Coq to work. For them, we should chose a good version, and test that version as much as we can. If you wan to compile Coq with a version of OCaml of your choice you can, but in the docker images we provide for testing, and in the installers we provide (eg the platform) we should pick sensible versions.

I agree, but how this is related to us supporting some versions? We support 4.08 but we don't recommend it.

view this post on Zulip Enrico Tassi (Dec 10 2020 at 15:18):

./configure could accept but warn. And starting from that we could stop providing images for versions we know don't work well for us to avoid users discover it themselves.

view this post on Zulip Karl Palmskog (Dec 10 2020 at 15:19):

the obvious drumbeat for OCaml version choice is the Coq Platform. Michael settled on 4.07.1 for good reasons.

view this post on Zulip Enrico Tassi (Dec 10 2020 at 15:19):

I personally was not aware of so many (performance/memory) differences in these ocaml compilers. I can only imagine Coq users which don't even know how to code in OCaml.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:20):

The example you provide is pretty fragile, you cannot generalize on memory behavior just from one particular file. There are files where 4.05.0 will use less memory. This whole thing is very misguided IMHO.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:20):

I was aware, and I warned in the CEP .

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:20):

But we are derailing here really.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:21):

coq-native should be opt-in

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:21):

for now

view this post on Zulip Enrico Tassi (Dec 10 2020 at 15:21):

I guess I did not understand your warning then

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:21):

PMP also warned

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:21):

as of today, even 4.12.0

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:21):

does require huge amounts of memory in some situations

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:21):

see the discussion on the OCaml bug tracker

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:22):

so we cannot fix this by using a particular version

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:22):

of OCaml, all versions will eat lots of ram and stack with native

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:22):

that's my understanding

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:24):

I think PMP and I were pretty clear [and we have been involved in the upstream discussion]

It is great we can try to test a large amount of devs in the CI , I fully support that, and I supported the CEP in the end because even if far from perfect we need to make progress. But unfortunately , unless there is significant work on the OCaml compiler or in our compilation strategy, coq-native will still be quite expensive for the regular user.

Also, we have a way more critical bug which is that the compilation scheme may stop working with newer OCaml versions.

view this post on Zulip Enrico Tassi (Dec 10 2020 at 15:30):

I think it was clear that there were problems. It was not clear to me that 1) docker images would have opt-in by default 2) problems were varying depending on the compiler version and custom settings (like the stack).

Anyway, I think the scenario is clear now.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:44):

Yup, that's clear, would be great if we could understand which version is best for native

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:44):

I think that 4.07 may indeed be better than 4.08-4.10

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 15:46):

There is plenty of reports of that problems upstream, see https://github.com/ocaml/ocaml/issues/10072

view this post on Zulip Christian Doczkal (Dec 10 2020 at 20:08):

For what it's worth, I just reran the graph-theory CI using the mathcom/mathcomp-dev:coq-dev image, which is based on ocaml-4.07.1+flambda and I still get a stack overflow: https://github.com/coq-community/graph-theory/runs/1532521296

view this post on Zulip Théo Zimmermann (Dec 10 2020 at 20:09):

@Enrico Tassi and @Karl Palmskog If you want to "deprecate" 4.05, this page should be updated: https://coq.inria.fr/opam-using.html

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 20:34):

Christian Doczkal said:

For what it's worth, I just reran the graph-theory CI using the mathcom/mathcomp-dev:coq-dev image, which is based on ocaml-4.07.1+flambda and I still get a stack overflow: https://github.com/coq-community/graph-theory/runs/1532521296

Hi @Christian Doczkal that's explained because I had only propagated the removal of coq-native in coqorg/coq:dev and mathcomp/mathcomp:latest-coq-dev (manual rebuild), but not mathcomp/mathcomp-dev:coq-dev yet, sorry… I'll take care of this just now. Thanks for your comment!
(Note: this can be checked thanks to this line:
https://github.com/coq-community/graph-theory/runs/1532521296#step:4:295 )

view this post on Zulip Christian Doczkal (Dec 10 2020 at 20:37):

@Erik Martin-Dorel , the point was that @Enrico Tassi had claimed that the development builds correctly, even with coq-native enabled, on ocaml-4.07.

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 20:38):

Yes but IIUC, the only difference between the build that fails or not seems related to the default stack size, which could differ in opam-coq-archive CI, and GitHub Container Actions…

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 20:39):

As recalled by Emilio with the issue he mentioned from the ocaml bug tracker, this is a typically known bug related to the use of ocamlopt w.r.t. large codes, automatically generated or so.

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 20:46):

@Emilio Jesús Gallego Arias BTW do you think it would be useful (without modifying anything for now, but at least for informative purpose) to add the following command somewhere in the default CI scripts?

ulimit -a

view this post on Zulip Erik Martin-Dorel (Dec 10 2020 at 20:59):

Erik Martin-Dorel said:

Hi Christian Doczkal that's explained because I had only propagated the removal of coq-native in coqorg/coq:dev and mathcomp/mathcomp:latest-coq-dev (manual rebuild), but not mathcomp/mathcomp-dev:coq-dev yet, sorry… I'll take care of this just now.

FYI @Christian Doczkal the build is now on-going: https://gitlab.com/math-comp/math-comp/-/pipelines/228167390
(so feel free to restart your build involving mathcomp/mathcomp-dev:coq-dev as soon as this pipeline is completed).
anyway for the next time, this will be automated since I've merged https://github.com/coq-community/docker-coq/pull/21

view this post on Zulip Enrico Tassi (Dec 10 2020 at 22:38):

I guess the number of variables made it hard to identify the culprit with reasonable certainty. I think we should try to find a way to reduce that, at least the number of variables visible by users of standard stuff.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 10 2020 at 22:53):

@Erik Martin-Dorel sure, we didn't care so much as of today given that SO where rare, but it cannot hurt. We could even do a check in the native compiler for current stack size, but these issues are kinda tricky in general as there is no agreed upon recipe w.r.t. what the OCaml compiler needs.

view this post on Zulip Paolo Giarrusso (Dec 11 2020 at 04:30):

Théo Zimmermann said:

Paolo Giarrusso said:

+1 for (A): I had no crystal ball, but I also expected native-compute to be 100% opt-in. native-compute support for package foo should probably be a feature request from users of foo who are actually interested.

Really? Then once again, the part about changing the default value of the coqc -native-compiler option in Coq would not have been necessary. You are the second participant to this CEP to express this opinion now.

good question, and I've thought about it. Back then, the plan was that native-compute would be opt-in, but we meant _for users_: this was an experimental feature that impacted you even when you didn't use it. We estimated _some_ pain for maintainers, but it only was creating a second package, and the goal of the discussion was limiting that pain.

view this post on Zulip Paolo Giarrusso (Dec 11 2020 at 04:32):

my main goal in the design was to get native-compute completely disabled for everybody, since today that's still the correct default (IMNSHO)

view this post on Zulip Paolo Giarrusso (Dec 11 2020 at 04:36):

also, it's the correct short-term default for me, and the one I've been using for a while (https://github.com/Blaisorblade/opam-overlay/tree/master/packages/coq).

However, I/we anticipated this would make it easier to _enable_ native-compute for interested people, so more people could try it out.

view this post on Zulip Paolo Giarrusso (Dec 11 2020 at 04:42):

at least I did not anticipate anything like this, but it should absolutely not be surprising that a not-fully-tested feature is not production-quality when tested at scale on millions of lines of code.

view this post on Zulip Paolo Giarrusso (Dec 11 2020 at 04:45):

TLDR: I suspect this pushback, while unfortunate, is in fact a _success_ of the CEP: it's much easier to try out native-compute and discover if it helps.

view this post on Zulip Paolo Giarrusso (Dec 11 2020 at 04:48):

I think it's a cool feature, and it's cool "the first 90%" (or the POPL 90%) of the work is in, but the engineering/integration is not trivial, and it has been undersestimated for long. Hopefully now it'll be easier to find and address the bugs that pop up, and exploit it where it already works. It might even count as motivation.

view this post on Zulip Christian Doczkal (Dec 11 2020 at 09:57):

Coming to think of it, not testing (and hence not failing due to) coq-native in the CI images may actually help people who want to use native compilation: It means that library maintainers will not be encouraged to disable native compilation and then forget about it, locking their users out of native compilation. And people who really want to use native compute with some library can install coq-native and then tinker around with specific versions of OCaml, custom ulimit, and whatnot. So thanks @Erik Martin-Dorel for the the quick reaction. CI is green again :smile:

view this post on Zulip Pierre Roux (Dec 11 2020 at 10:31):

So trying to sum up:

Do we need to do something about the last point? I see a few potential improvements here:

view this post on Zulip Pierre Roux (Dec 11 2020 at 10:35):

About the last point, a potential implementation would be to have the CI of the OPam archive complain if any package doesn't contain either a conflict clause with coq-native or some meta-information telling the package is somewhat expected to be native-able. The CI could suggest to put the conflict clause if the maintainer doesn't know/want to bother about native. It seems that something similar will be needed anyway for the split approach to know which *-native packages to generate or not.

view this post on Zulip Enrico Tassi (Dec 11 2020 at 10:44):

My plan to mitigate 3 is https://github.com/coq/opam-coq-archive/pull/1504, that is to know in advance which packages are problematic for coq-native and tell their upstream authors to either make these package conflict with coq-native or put, upstream, a native-compiler off option.

Of course it would make sense to coordinate the tests we run on the opam archive with the docker images and the platform installers, so that we ship/advertise what we actually test. I did put an item about this at the next Coq call.

view this post on Zulip Christian Doczkal (Dec 11 2020 at 10:45):

I am against anything that would require package maintainers to deal with coq-native or make any decisions about it. Package maintainers can choose to do so, in order to forestall complaints, but as long as users who install coq-native are informed of the experimental nature of their setup, I see no point in bothering package maintainers, in particular given that changing the ocaml version and various settings may make a library work with coq-native even if it doesn't work reliably with different settings.

view this post on Zulip Enrico Tassi (Dec 11 2020 at 10:49):

The plan (to be executed) is to have a bot make a review that suggests a little patch + conflicts: [ "coq-native" ] when the package you upload is found to not work with coq-native. I'm aware that "work" is not very well defined, hence the coordination I hope to obtain.

Is this plan "too much" from the perspective of a package maintainer publishing on the coq opam archive?

view this post on Zulip Enrico Tassi (Dec 11 2020 at 10:49):

I'm of course not against warning users that install coq-native... on the contrary

view this post on Zulip Christian Doczkal (Dec 11 2020 at 10:58):

Well, I'm relatively relaxed about the bot suggesting a conflicts clause if the library fails to build with coq-native on all versions of ocaml (or a representative subset suggested by the coq developers) with settings that already try to make things work (i.e., by increasing the stack size). What's less clear to me is whether one should add a conflict clause if the package fails to build on some and succeeds to build on other compilers. In any case, it's good if maintainers won't have to change anything in their repositories, so that compatibility with coq-native is always tested when releasing to the archive. Also, would suggest to only add conflicts for the released repository. In summary, I suggest the meaning of conflicts: [ "coq-native"] to be is known to (relibably) fail rather than is not known to reliably work.

view this post on Zulip Enrico Tassi (Dec 11 2020 at 11:22):

Let me add that one really willing to use native with a package that conflicts with coq-native can tell opam to ignore that constraint.
So I'm not super convinced of the meaning you suggest, but well, without deciding what we consider as "supported platforms" it not really obvious to give a meaning to "reliable" either.

Btw, did someone figure out if docker has a constrained stack? (on my PC it's 8K, and it has been like that since I could remember)

view this post on Zulip Théo Zimmermann (Dec 11 2020 at 11:24):

It might be due to the GitHub Actions default settings. I haven't looked into that.

view this post on Zulip Théo Zimmermann (Dec 11 2020 at 11:25):

I see a few potential improvements here:

:+1:

view this post on Zulip Christian Doczkal (Dec 11 2020 at 11:49):

@Enrico Tassi As I said, as long as maintainers are not forced to disable (once and for all) native compilation by putting some option into their _CoqProject file, I'm okay.

view this post on Zulip Karl Palmskog (Dec 11 2020 at 12:11):

@Théo Zimmermann how will native-compile be handled in Nix and the Nix CI? if native conflicts is handled at the opam level, Nix packages will require something separate but analogous, right?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 11 2020 at 12:54):

Not sure the conflict is a good idea per-se, in the sense that 99% of failures of packages with coq-native are going to be due to low stack size.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 11 2020 at 12:54):

So IMHO we need something that makes people installing coq-native to be aware that they needs lots of RAM and stack

view this post on Zulip Emilio Jesús Gallego Arias (Dec 11 2020 at 12:54):

but in principle I wouldn't call that a conflict

view this post on Zulip Emilio Jesús Gallego Arias (Dec 11 2020 at 12:56):

This is not so different from some packages already requiring huge amounts of RAM even without native

view this post on Zulip Théo Zimmermann (Dec 11 2020 at 12:56):

@Karl Palmskog In Nix, we have never set the -native-compiler flag explicitly. Since ondemand is the new default, it means that we preserve backward-compatibility. If there are users that want something similar to coq-native, it will have to be implemented (it's not the case today).

view this post on Zulip Enrico Tassi (Dec 11 2020 at 13:02):

Emilio Jesús Gallego Arias said:

So IMHO we need something that makes people installing coq-native to be aware that they needs lots of RAM and stack

My problem is that I still don't know what you mean by "a lot".
I've just run unimath with coq-native https://gitlab.com/coq/opam-coq-archive/-/jobs/905112076 on a regular runner.

view this post on Zulip Enrico Tassi (Dec 11 2020 at 13:04):

TTBOMK unimath should be the antichrist for coq-native, but ti works...

view this post on Zulip Pierre-Marie Pédrot (Dec 11 2020 at 13:05):

How many GB of RAM do you have?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 11 2020 at 13:08):

There is no clear definition of "a lot", as it depends on the internals of the OCaml compiler, so it is not even stable by versions. For example some code will stress the register allocation, some other will stress the constant propagation. Large modules tend to be a problem too. I'm not sure why Unimath should be bad for coq-native, likely the opposite.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 11 2020 at 13:08):

But indeed code with lots of modules for example tend to make the OCaml compiler go hungry.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 11 2020 at 13:09):

But that could change next version; anyways I'm not trying to figure out the heuristic, but my point is that packages that don't work, they fail because of the memory setup, so that is not an intrinsic conflict, but more of a circumstantial one.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 11 2020 at 13:09):

As stack size is not even consistent among distros of Linux

view this post on Zulip Pierre-Marie Pédrot (Dec 11 2020 at 13:15):

I'm not sure why Unimath should be bad for coq-native, likely the opposite.

Shitload amount of huge Defined proofs.

view this post on Zulip Pierre-Marie Pédrot (Dec 11 2020 at 13:17):

I can't find it again but there was a bench with coq-native on, and unimath was the worst slowdown IIRC.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 11 2020 at 13:17):

Maybe proofs there are already indeed filtered as to be manageable?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 11 2020 at 13:17):

Could be slow but not take a lot of RAM

view this post on Zulip Emilio Jesús Gallego Arias (Dec 11 2020 at 13:17):

YMMV

view this post on Zulip Emilio Jesús Gallego Arias (Dec 11 2020 at 13:17):

tho, I'm gonna finish the coqnative patch for Dune, this way Dune will track ocamlc invocations including memory and time

view this post on Zulip Emilio Jesús Gallego Arias (Dec 11 2020 at 13:18):

so we can get that data precisely

view this post on Zulip Pierre-Marie Pédrot (Dec 11 2020 at 13:18):

UniMath is a very good outlier on benches because it's literally the one development that consistently goes against all recommended good practices in Coq dev.

view this post on Zulip Enrico Tassi (Dec 11 2020 at 19:59):

In the past 2 days I've been trying to reproduce these native failures users are experienciong and we could foresee... and I'm failing miserably (which is a good news to me, but I'm frustrated).

I tried to run Coq master on 4.07 and 4.09 with and without flambda with coq-native installed on unimath and fiat-parsers... and it works on a worker with 4G ram and 8K stack. What am I doing wrong? (I'm not using docker, which may be a relevant, and I did not test 4.05).
https://github.com/coq/opam-coq-archive/pull/1504

view this post on Zulip Jason Gross (Dec 16 2020 at 16:07):

I've since disabled the native compiler on fiat-parsers, I can point you at a commit before that if you like.


Last updated: Oct 21 2021 at 22:02 UTC