I have issues building latest Elpi on Windows. In the latest version of Elpi there was a switch from CamlP5 to Menhir and I get build issues around this: see (https://github.com/coq/platform/runs/7811167660?check_suite_focus=true#step:5:2503).
The strange thing is that it works in local Coq Platform Windows builds, but it fails in Coq Platform CI builds. The only difference should be machine memory size, speed and thread count. In CI (in the above link) I set -j 1
to be sure, but this did not work either. My local builds are on a PC with 128 GB RAM using 8 threads. I am not 100% sure file system case sensitivity is the same between local builds and CI - it should be.
See also the discussion here Zulip Elpi
Any ideas what I could try to nail this down? Should I add some debug options to dune?
This is the main blocker for the Coq Platform 2022.09 (Coq 8.16+rc1) preview and beta release.
Is there a way to print the rules involved in a failing build? It look like a dependency is missing in the rules for the menhir stanza. I hope I did not do anything sylly in there, but I use quite some features to separate the parser into multiple files, maybe not super well tested elsewhere.
dune build --display verbose
will print quite a bit more info... but before that, have you compared the dune
version locally and in CI?
@Michael Soegtrop (1) aren't you also blocked on Coq (2) Coq's build fails with the same unhelpful error — No such file or directory
@Paolo Giarrusso : ah yes I didn't see it - it seems to be new. I had a few other runs were just elpi failed (e.g. https://github.com/coq/platform/runs/7787507196?check_suite_focus=true).
I will enable --display verbose.
Regarding the version: I don't pin dune (maybe I should) but unless there was a new dune opam package on the day I did the runs, it is the same. Everything is setup from scratch (even cygwin, which is in a sense the operating system on which the build runs). Also failing CI runs time wise overlap the successful local builds.
But I can print the dune version and/or pin it.
@Paolo Giarrusso @Enrico Tassi : you can find a build with "--display verbose" here: (https://github.com/coq/platform/runs/7858222720?check_suite_focus=true#step:5:2505).
compiling coq:
# Error: I/O error: plugins/syntax/.number_string_notation_plugin.objs/byte\number_string_notation_plugin__String_notation.cmi55977d.tmp: No such file or directory
the "\n" is very fishy
compiling Elpi:
290 Error:
291 _build/.sandbox/6032c84e8de20f8626a867714fe51e63/default/src/parser/.elpi_lexer_config.objs/byte/elpi_lexer_config__Lexer_config.cmi:
292 No such file or directory
293 -> required by _build/default/src/parser/grammar__mock.mli.inferred
294 -> required by alias src/parser/all
clearly "grammar_mock" is something generated behind the scenes, and the missing dep is, wild guess,
https://github.com/LPCIC/elpi/blob/cc3ddcb19a149c43df2d0910f4cbbffd8ddaa8e4/src/parser/dune#L26
where grammar
is compiled. I think --external-tokens Elpi_lexer_config.Tokens
does not add a dependency on Elpi_lexer_config
.
I've no idea how to add this dep from my side, but here Dune experts can surely help.
Any idea why this is different on Linux Mac and even Windows CI and Windows local?
@Enrico Tassi : where did you see a \n
in the above error - did you download the raw log and the filename contains a \n in there? It is not visible in the HTML processed logs I see.
A note at the dune team: the Coq Windows build uses cygwin as host, but it is a MinGW cross build, so that the result (used by the installer) runs without cygwin. This means that line endings are windows \r\n
which can lead to issues if the output of e.g. coqc -where
is inspected. Dune should be fine since dune itself is a MinGW app (unless it uses binary mode to read text files), but every shell command like sed is a bit dangerous. But then afaik, dune doesn't use much of that ...
Here: objs/byte\number_string
Not the raw one, I did copy paste what I saw on the GH rendering
My only guess is that targets are, even in j1, somehow sorted in a total way (compatible with topological sorting given by the deps). Maybe that order is not super stable, eg based on full paths (I guess you compile in /home/msoegtrop or something, not /runner...), or based on a pseudo randomized hash table... a bit science fiction, but I cannot think at anything else.
The \n could just be a silly setup. I don't have the time myself to check if the platform GH action does stuff like https://github.com/LPCIC/elpi/blob/cc3ddcb19a149c43df2d0910f4cbbffd8ddaa8e4/.github/workflows/main.yml#L46 or https://github.com/LPCIC/elpi/blob/cc3ddcb19a149c43df2d0910f4cbbffd8ddaa8e4/.github/workflows/main.yml#L88
Note that the string "I/O error: %s"
comes from ocamlc
, not dune directly.
@Enrico Tassi : mixing forward and backward slashes is quite normal in MinGW builds and rarely causes problems. Windows can natively handle such paths (the file API accepts both / and \ as path separator - just CMD doesn't). But Linux shell expansion would treat \ n
as just n.
And shell usually does not treat \n as lf - only certain commands like tr do. So the \ issue would be for any path, not just for those having components starting with n.
Well, I trust you. But then you have to come up with another theory on why coq is failing ;-)
It would be good to hear from the dune team if they do anything which does any form of backslash substitution in paths.
I'm not 100% sure, but I think Dune tries to preserve backslashes when they are used. cc @Rudi Grinberg
Last updated: Jun 03 2023 at 18:01 UTC