Hello @Enrico Tassi, PR coq/coq#13039 had a failure of the async job with:
```
Makefile.build:866: recipe for target 'theories/ZArith/Znumtheory.vo' failed
make: *** [theories/ZArith/Znumtheory.vo] Segmentation fault (core dumped)
```
Are you interested in being notified about those failures?
I can't find the general value of NJOBS. I see the bedrock job gets 1, but what is the default?
My problem is that I cannot reproduce it locally, and I'm starting to think that these core dumps are actually out-of-memory failures: shared runners have 4G of RAM, and `-async-proofs on` potentially starts 2 Coq processes per file, each with the same memory footprint.
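A quick way to sanity-check that locally (assuming GNU time is installed as `/usr/bin/time`, and modulo the load-path flags the build normally passes):

```sh
# Measure the peak RSS of compiling the failing file with async proofs on;
# multiplying by the number of simultaneous processes gives a rough total
# to compare against the runner's 4G.
/usr/bin/time -v coqc -async-proofs on theories/ZArith/Znumtheory.v 2>&1 \
  | grep 'Maximum resident set size'
```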
How many processes are we running?
If you go to the top of the log, you get a lot of info.
```
cpu cores : 1
```
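(That line looks like it comes straight from the runner's `/proc/cpuinfo`; the same check locally would be:)

```sh
grep -m1 'cpu cores' /proc/cpuinfo   # or simply: nproc
```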
FWIW, the out-of-memory killer shouldn't produce a SIGSEGV (it sends SIGKILL), but it's very plausible that some code segfaults when an allocation fails; such bugs are really easy to write in systems code.
(I remember a few in the GHC runtime system)
But the end result is probably the same. If this were a Coq bug, Coq should fix it; if it were in OCaml, on the other hand, I wouldn't expect Coq to.
As usual, re-running the job on pyrolise, which has 32G of memory, made it pass; four Coq processes exhausting 4G is plausible.
I'd go for NJOBS=1 (which will also make things more reproducible), but then I guess I should also increase the timeout a little.
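Rough arithmetic behind that choice (assuming every process has a comparable footprint):

```sh
# With -async-proofs on, each of the NJOBS files being compiled in parallel
# can spawn an extra proof worker, so:
#   peak processes ~ NJOBS * 2
# If NJOBS were 2 and each process needs ~1G, that is already the shared
# runner's 4G; NJOBS=1 halves it, at the cost of a longer (but more
# reproducible) build, hence the bigger timeout.
```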
And add a call to dmesg on failure if possible (is there a keyword for that?). Are you OK with this plan @Théo Zimmermann?
Maybe I should just add dmesg to validate the thesis.
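If the OOM killer is the culprit, it leaves a recognizable trace in the kernel log, so something along these lines should confirm or refute the thesis (the exact message format varies across kernel versions):

```sh
# Matches lines like "Out of memory: Killed process 1234 (coqc) ..."
dmesg | grep -i -E 'out of memory|oom'
```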
Yes, that sounds good.
I believe that `after_script` is run in all cases, failure or not.
Let me check.
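If so, a minimal sketch of the diagnostic step (the exact commands are an assumption; only the placement under `after_script` matters):

```sh
# These lines would go under `after_script:` in .gitlab-ci.yml; GitLab runs
# them after the main script, on success and on failure alike.
dmesg | tail -n 100 || true   # `|| true` so the diagnostic itself cannot fail
free -h || true               # memory state at the end of the job
```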
I've opened a PR