Hey all,
I'm currently working on forward-porting the CoqGym benchmark suite for new neural proof search research. The suite was originally built for coq 8.9, but my tool supports coq 8.10.0-8.12.2, so I've been going through and trying to get every project building on either 8.10.2 or 8.12.2. For most projects this has been doable, albiet with some new forks of projects that were last developed for 8.9, but one project is still giving me trouble. Coq-library-undecidability is a currently developed project, and master is for 8.15, but the only version on Opam is for 8.12. I've been trying to build it on 8.12.2, and, using the tag that opam source ("v1.0.0+8.12"), it will build fine on my local machine. But when I try to build it on our cluster, it appears to get stuck on several files, hanging indefinitely with no error messages and zero cpu usage. Does anyone have any idea how I could debug this issue?
Thanks!
Wait, is coqc
stuck with 0% CPU usage? Any chance your cluster's launching too many jobs, running out of RAM and swapping? Otherwise it's extremely odd.
My general advice for discrepancies across machines is "do not trust opam to choose versions for Coq packages", but I haven't yet seen opam cause those symptoms...
And even if the stuck processes send no log messages, (full) logs of both builds, together with opam list
outputs, might be worth uploading in case somebody has a clue.
Finally, @Yannick Forster is also on this Zulip, but I'm not sure some Coq code can hang Coq with 0% CPU use. The code uses a custom plugin but... no idea if a typical plugin (or that one) can cause such a hang?
I'm happy to help, but I've never seen a similar problem in this or other projects
Might be worthwhile to try with the tip of the coq-8.12
branch. If an opam package is necessary I could do a PR with a 1.0.1 for 8.12
When I try to build on the tip of the coq-8.12 branch, I get an warning message that the library Metacoq.Template.Checker is missing. But my opam state has both coq-metacoq-template and coq-metacoq-checker installed, not sure what the issue is. Is there another package I need?
The same warning doesn't happen on the tag of 8.12 used for opam, v1.0.0+8.12
Re swapping, I've checked the memory usage and it doesn't seem to be using very much. The cluster node definitely has a bunch of memory to spare, but I tried bumping up the individual job memory limit in case that was the problem, that didn't help.
Re opam packages, I created an opam switch just for the build on both machines, they have identical package versions.
Re logs, I'm running the build single-threaded now to get a full log to post.
Oh this is embarrasing... I'm not sure what I changed but it seems to be building now.
Did that fix the build problem with the branch tip? If not, make sure you're using the right switch and run eval $(opam env)
correctly
No the problem with the branch tip is the same, I checked the switch and ran eval $(opam env --switch=undecidability --set-switch)
https://github.com/uds-psl/coq-library-undecidability/blob/coq-8.12/opam and https://github.com/MetaCoq/metacoq/blob/v1.0-beta2-8.12/template-coq/theories/Checker.v suggest the library should be there in the required metacoq-template version...
Odd, that's the version of the package I have installed, wonder why it's not finding it
Is the .vo file in the expected location? The metacoq project files seem alright... What's the warning text?
Ooooh I think I accidentally had beta1 installed instead of beta2. Since if you try to install coq-metacoq-checker, it'll downgrade coq-metacoq-template to beta1 from beta2, and beta1 doesn't have that file.
easy to miss without using diff
tools :-)… FWIW, using the right opam install
incancation on the opam package in the source would happen to avoid this
alternatively, opam pin
can help prevent opam
from trying to outsmart the user
in opam’s defense, coq-metacoq-checker
seems to not exist in beta2
(maybe because its been merged in template-coq? no clue)
it looks like coq-metacoq-checker
became coq-metacoq-safechecker
in beta2
.
there should maybe be a conflict
declaration between coq-metacoq-checker
and coq-metacoq-safechecker
to shepherd users in the right direction...
I wondered, but there's also a safechecker in beta1?
I guess checker
just gone then
Last updated: Oct 13 2024 at 01:02 UTC