So a bunch of MathComp-based projects like RegLang and fourcolor that use MathComp Docker just got some strange CI failures with the dev
Docker image:
The following actions will be performed:
- recompile coqide-server dev [upstream or system changes]
- recompile coq dev* [uses coqide-server]
- recompile coq-mathcomp-ssreflect dev* [uses coq]
- recompile coq-bignums dev [uses coq]
- recompile coq-mathcomp-fingroup dev* [uses coq-mathcomp-ssreflect]
- recompile coq-mathcomp-algebra dev* [uses coq-mathcomp-fingroup]
- recompile coq-mathcomp-solvable dev* [uses coq-mathcomp-algebra]
- recompile coq-mathcomp-field dev* [uses coq-mathcomp-solvable]
- recompile coq-mathcomp-character dev* [uses coq-mathcomp-field]
===== 9 to recompile =====
<><> Processing actions <><><><><><><><><><><><><><><><><><><><><><><><><><><><>
-> retrieved coqide-server.dev (no changes)
-> retrieved coq-bignums.dev (no changes)
Error: The compilation of coqide-server.dev failed at "dune build -p coqide-server -j 2 @install".
#=== ERROR while compiling coqide-server.dev ==================================#
# context 2.1.3 | linux/x86_64 | ocaml-option-flambda.1 ocaml-variants.4.13.1+options | https://coq.inria.fr/opam/core-dev#2022-10-03 02:50
# path ~/.opam/4.13.1+flambda/.opam-switch/build/coqide-server.dev
# command ~/.opam/4.13.1+flambda/bin/dune build -p coqide-server -j 2 @install
# exit-code 1
# env-file ~/.opam/log/coqide-server-749-d1483e.env
# output-file ~/.opam/log/coqide-server-749-d1483e.out
### output ###
# (cd _build/default && /home/coq/.opam/4.13.1+flambda/bin/ocamlc.opt -w -40 -rectypes -g -bin-annot -I ide/coqide/.idetop.eobjs/byte -I /home/coq/.opam/4.13.1+flambda/lib/coq-core/boot -I /home/coq/.opam/4.13.1+flambda/lib/coq-core/clib -I /home/coq/.opam/4.13.1+flambda/lib/coq-core/config -I /home/coq/.opam/4.13.1+flambda/lib/coq-core/engine -I /home/coq/.opam/4.13.1+flambda/lib/coq-core/gram[...]
# File "ide/coqide/idetop.ml", line 205, characters 33-54:
# 205 | let (hyps_pp_list, concl_pp) = Proof_diffs.diff_goal ~short ?og_s (Proof_diffs.make_goal env nsigma ng) in
# ^^^^^^^^^^^^^^^^^^^^^
# Error: This function has type
# ?og_s:Proof_diffs.goal -> Proof_diffs.goal -> Pp.t list * Pp.t
# It is applied to too many arguments; maybe you forgot a `;'.
Already at the first install command, this is weird, since the Docker image should not re-install Coq and MathComp. But it gets weirder with coqide-server package erroring out. Have I missed some change to this Docker image @Erik Martin-Dorel?
one wild hypothesis: the OCaml version recently became too old?
OK, I just confirmed this is a general problem with coqide-server.dev
, but something is still weird if the package is getting rebuilt during a CI job in the MathComp Docker...
Thanks @Karl Palmskog for the ping!
The line recompile coqide-server dev [upstream or system changes]
tells everything we need to know:
some upstream change occurred, so opam can't help rebuilding the package and all reverse dependencies.
To avoid this, I just need to recompile the base image… can you please give me the offending Docker image name?
(or even better, a link to the logs)
https://github.com/coq-community/reglang/actions/runs/3171715874
build (mathcomp/mathcomp-dev:coq-dev)
- but note that this will likely happen to coqorg/coq.dev
also
see also this topic.
OK thanks a lot!
So, this really looks like a genuine packaging problem, isn't it?
I mean, the conjecture of my last comment does not seem to apply.
I'll try to have a look tonight or tomorrow, but let me know if this is more urgent.
@Erik Martin-Dorel I think this is a transient problem, if you just recompile the base images of mathcomp.dev
and coq.dev
everything should be good going forward
the root cause is https://github.com/coq/opam-coq-archive/pull/2331 and https://github.com/coq/coq/pull/16549 which modify coq.dev
to include coqide-server.dev
to preserve behavior of 8.16 after package split.
Thanks, but why recompiling the images would be enough? because coq.dev and mathcomp.dev are precisely rebuilt at each merge of any PR in coq.master
Sorry anyway, I have some meeting slot from now on
so what does "rebuild" mean here? If you do opam update coq
or similar, that apparently won't work due to opam scheduling rebuilds in the wrong order.
OK, here's a theory: the problem occurs because we don't trigger a Docker rebuild for when the packages in core-dev
opam repo are updated.
So the Coq PR triggers a rebuild, but the change there was to the packages in the Coq GitHub repo, which are not used in the Docker images (I think). But then the coq.dev
package in the Coq opam repo is updated later, and no Docker rebuild is done, leading to the CI errors.
Conclusion: things should work after manual rebuild.
Actually, the Docker image uses the opam
definition from the repository. See: https://github.com/coq-community/docker-coq/blob/master/coq/dev/Dockerfile
I have another explanation to propose: this Docker image explicitly pins coq-core, coq-stdlib and coq, but it doesn't do the same for coqide-server (it didn't use to be needed). Because coqide-server is not pinned, opam may decide that it actually needs the package from core-dev and not the package from the GitHub repo later on when something that depends (indirectly) on coqide-server is built...
If my explanation is correct, then rebuilding should not solve the issue and this PR https://github.com/coq-community/docker-coq/pull/52 is needed.
I'm still getting the coqide-server
-related build failures with MathComp dev
Docker image: https://github.com/coq-community/reglang/actions/runs/3187460415
Has the PR been applied to these images?
Karl Palmskog said:
I'm still getting the
coqide-server
-related build failures with MathCompdev
Docker image: https://github.com/coq-community/reglang/actions/runs/3187460415Has the PR been applied to these images?
Actually we cannot, because after only 3 days, we have 0 CI/CD minutes in math-comp :-/
maybe the core team could lend a runner for this? Otherwise, we'll have to disable using this image everywhere, big loss for many MathComp projects, not least in Coq-community.
(we can of course use the coq.dev
image, but then the Actions CI will take a ton longer)
can the core team and/or MathComp devs make a recommendation here for the short term? Should we assume the mathcomp.dev
Docker image is not coming back for a while?
If the math-comp devs don't disagree (Cc @Cyril Cohen @Pierre Roux FYI), I will remove the stalled dev
images (given that having an obsolete dev image is maybe more problematic than having no image: for third-party pipelines, better fail in an expected way than succeed in an unexpected way).
OK, in that case I will already now disable the dev
image for RegLang and fourcolor
Karl Palmskog said:
Should we assume the
mathcomp.dev
Docker image is not coming back for a while?
I guess they should be back in November 1st, if we manage to strengthen PR https://github.com/math-comp/math-comp/pull/924 enough (at first sight, just by disabling pipelines for reverse dependencies)
Maybe this point could be added to the agenda of the next math-comp meeting? (on October 19, IINM)
Or maybe beforehand, if (as you suggest @Karl Palmskog) one can use a dedicated runner for this (maybe an inria runner?
but I have no clue at all about the feasibility of this)
would be best if someone could check with Maxime, who added the Inria runners for Coq
Yes @Maxime Dénès can comment, but there are chances that such runners won't work for building Docker images just yet.
However, what about a new temporary GitLab organization just for building these images (and not for the MathComp CI)?
Sorry, can I get a TL;DR ? :) what's the question?
@Maxime Dénès the MathComp GitLab organization has run out of CI minutes (like Coq's but even quicker). This means we can't build even MathComp Docker dev
images until November 1. However, lots of projects use the MathComp Docker dev
images, so it would be nice if we could use Inria runners to build those Docker dev
images before Nov 1. Is this technically and/or administratively feasible?
even worse than "the MathComp dev
Docker image is not refreshed every day" is that the image is currently broken due to changes in opam packages. So people can't even use an old stale dev
image.
I should point out that MathComp projects can definitely make due with the regular coq.dev
Docker image, but this of course leads to using more CI minutes (but usually with GitHub Actions). So I guess this stuff is not as urgent as the Coq CI stuff, but good to know for MathComp project maintainers what to do.
@Erik Martin-Dorel we can talk later today if you like, to see if we can find a solution.
One thing I didn't catch is: what consumes so many minutes on mathcomp? It has a very low commit rate, doesn't it?
I think MathComp's reverse dependency CI setup is second to none in the Coq world outside of Coq's CI (lots of projects that depend on MC are tested for every commit)
Maxime Dénès said:
Erik Martin-Dorel we can talk later today if you like, to see if we can find a solution.
yes, thanks! → could I call you at 16:00 or at 17:45 today ? (or tomorrow Friday afternoon)
The allowance of minutes allocated to MathComp is also much lower than the one of Coq because they have not been accepted in the Open Source program (for licensing reasons).
Erik Martin-Dorel said:
If the math-comp devs don't disagree (Cc Cyril Cohen Pierre Roux FYI), I will remove the stalled
dev
images (given that having an obsolete dev image is maybe more problematic than having no image: for third-party pipelines, better fail in an expected way than succeed in an unexpected way).
Done. Now, docker pull mathcomp/mathcomp-dev:coq-8.16
(for instance) immediately fails.
I added a NOTE in the description of both https://hub.docker.com/r/mathcomp/mathcomp-dev and https://github.com/math-comp/docker-mathcomp pages.
FYI @Karl Palmskog @Théo Zimmermann @Pierre Roux @Cyril Cohen
we just got a call with Maxime and he'll be able to help setting up new gitlab runners,
so if everything goes smoothly we should have some auto-deployment of mathcomp-dev/coq-dev working anew by Monday.
I will post here to let you know in this case, stay tuned.
@Erik Martin-Dorel @Cyril Cohen Can you make me an owner of https://gitlab.com/math-comp temporarily? I need it to configure group workers. Thanks!
Maxime Dénès said:
Erik Martin-Dorel Cyril Cohen Can you make me an owner of https://gitlab.com/math-comp temporarily? I need it to configure group workers. Thanks!
@Maxime Dénès done.
Meanwhile (as we discussed this afternoon) I disabled the Developer permissions of coqbot (upon further notice → Guest) in the math-comp and docker-mathcomp projects.
Let me know in a PM when the new runners you are setting up are available; I'll work on the .gitlab-ci.yml updates at that time.
Thanks again!
Erik Martin-Dorel said:
Meanwhile (as we discussed this afternoon) I disabled the Developer permissions of coqbot (upon further notice → Guest) in the math-comp and docker-mathcomp projects.
Why not disable the coqbot-app on GitHub instead? Here your solution means that coqbot is doing the work but getting its push rejected at the last step. Unless you need it for other purposes?
Théo Zimmermann said:
Erik Martin-Dorel said:
Meanwhile (as we discussed this afternoon) I disabled the Developer permissions of coqbot (upon further notice → Guest) in the math-comp and docker-mathcomp projects.
Why not disable the coqbot-app on GitHub instead? Here your solution means that coqbot is doing the work but getting its push rejected at the last step. Unless you need it for other purposes?
Hi @Théo Zimmermann, yes, maybe it's suboptimal? (the idea was to use coqbot just for commit status from gitlab to github… and I don't know if there's an option to configure coqbot to this aim, i.e., disabling the auto-push from PRs to pr-* branches)
Ah OK, I get it. No, there is no option for disabling just the push for now, so what you did sounds OK.
@Erik Martin-Dorel I added new runners for math-comp, with the same orchestration as the Coq ones, and the proper sandboxing that comes with it. Unfortunately, it needs a bit more work to be able to build docker images. But you can already restore the coqbot behavior, it is fine to run any job.
Ok, it should now work also for building docker images.
Sorry @Maxime Dénès @Théo Zimmermann I believed I'd have ~1h of free time on this Monday to tweak/fix the gitlab-ci.yml config using Maxime's new runners, but this was not the case at all; I should have more time soon though.
You don't need to, it already works ;) The only remaining step is to restore coqbot behavior.
Maybe someone else with admin rights on the Math comp GitLab organization can do it?
I'm not sure I follow you. @Maxime Dénès precisely told me that the runners shouldn't run untrusted code.
But by enabling coqbot as is (without modifying the .gitlab-ci.yml), the runners would run arbitrary code from PRs…
This is not the case anymore.
It's completely fine to just re-enable coqbot now.
This is what Maxime wrote above:
I added new runners for math-comp, with the same orchestration as the Coq ones, and the proper sandboxing that comes with it.
[...]
The only remaining step is to restore coqbot behavior.
Indeed @Erik Martin-Dorel this constraint is now removed.
Ah OK, Great! :+1: So now the coqbot config is restored. Thanks again, @Maxime Dénès and @Théo Zimmermann
are all MathComp Docker images considered back in action now?
AFAICT @Erik Martin-Dorel has reactivated the coqbot config but not the Docker dev builds.
Last updated: Oct 13 2024 at 01:02 UTC