Stream: Dune devs & users

Topic: Issues with building Elpi for Coq Platform on Windows


view this post on Zulip Michael Soegtrop (Aug 16 2022 at 08:03):

I have issues building latest Elpi on Windows. In the latest version of Elpi there was a switch from CamlP5 to Menhir and I get build issues around this: see (https://github.com/coq/platform/runs/7811167660?check_suite_focus=true#step:5:2503).

The strange thing is that it works in local Coq Platform Windows builds, but it fails in Coq Platform CI builds. The only difference should be machine memory size, speed and thread count. In CI (in the above link) I set -j 1 to be sure, but this did not work either. My local builds are on a PC with 128 GB RAM using 8 threads. I am not 100% sure file system case sensitivity is the same between local builds and CI - it should be.

See also the discussion here Zulip Elpi

Any ideas what I could try to nail this down? Should I add some debug options to dune?

This is the main blocker for the Coq Platform 2022.09 (Coq 8.16+rc1) preview and beta release.

view this post on Zulip Enrico Tassi (Aug 16 2022 at 11:41):

Is there a way to print the rules involved in a failing build? It look like a dependency is missing in the rules for the menhir stanza. I hope I did not do anything sylly in there, but I use quite some features to separate the parser into multiple files, maybe not super well tested elsewhere.

view this post on Zulip Paolo Giarrusso (Aug 16 2022 at 11:45):

dune build --display verbose will print quite a bit more info... but before that, have you compared the dune version locally and in CI?

view this post on Zulip Paolo Giarrusso (Aug 16 2022 at 11:49):

@Michael Soegtrop (1) aren't you also blocked on Coq (2) Coq's build fails with the same unhelpful error — No such file or directory

view this post on Zulip Michael Soegtrop (Aug 16 2022 at 11:59):

@Paolo Giarrusso : ah yes I didn't see it - it seems to be new. I had a few other runs were just elpi failed (e.g. https://github.com/coq/platform/runs/7787507196?check_suite_focus=true).

I will enable --display verbose.

Regarding the version: I don't pin dune (maybe I should) but unless there was a new dune opam package on the day I did the runs, it is the same. Everything is setup from scratch (even cygwin, which is in a sense the operating system on which the build runs). Also failing CI runs time wise overlap the successful local builds.

But I can print the dune version and/or pin it.

view this post on Zulip Michael Soegtrop (Aug 16 2022 at 14:12):

@Paolo Giarrusso @Enrico Tassi : you can find a build with "--display verbose" here: (https://github.com/coq/platform/runs/7858222720?check_suite_focus=true#step:5:2505).

view this post on Zulip Enrico Tassi (Aug 16 2022 at 18:40):

compiling coq:

# Error: I/O error: plugins/syntax/.number_string_notation_plugin.objs/byte\number_string_notation_plugin__String_notation.cmi55977d.tmp: No such file or directory

the "\n" is very fishy

compiling Elpi:

   290  Error:
   291  _build/.sandbox/6032c84e8de20f8626a867714fe51e63/default/src/parser/.elpi_lexer_config.objs/byte/elpi_lexer_config__Lexer_config.cmi:
   292  No such file or directory
   293  -> required by _build/default/src/parser/grammar__mock.mli.inferred
   294  -> required by alias src/parser/all

clearly "grammar_mock" is something generated behind the scenes, and the missing dep is, wild guess,
https://github.com/LPCIC/elpi/blob/cc3ddcb19a149c43df2d0910f4cbbffd8ddaa8e4/src/parser/dune#L26
where grammar is compiled. I think --external-tokens Elpi_lexer_config.Tokens does not add a dependency on Elpi_lexer_config.

I've no idea how to add this dep from my side, but here Dune experts can surely help.

view this post on Zulip Michael Soegtrop (Aug 17 2022 at 06:33):

Any idea why this is different on Linux Mac and even Windows CI and Windows local?

view this post on Zulip Michael Soegtrop (Aug 17 2022 at 06:49):

@Enrico Tassi : where did you see a \n in the above error - did you download the raw log and the filename contains a \n in there? It is not visible in the HTML processed logs I see.

A note at the dune team: the Coq Windows build uses cygwin as host, but it is a MinGW cross build, so that the result (used by the installer) runs without cygwin. This means that line endings are windows \r\n which can lead to issues if the output of e.g. coqc -where is inspected. Dune should be fine since dune itself is a MinGW app (unless it uses binary mode to read text files), but every shell command like sed is a bit dangerous. But then afaik, dune doesn't use much of that ...

view this post on Zulip Enrico Tassi (Aug 17 2022 at 07:59):

Here: objs/byte\number_string

view this post on Zulip Enrico Tassi (Aug 17 2022 at 07:59):

Not the raw one, I did copy paste what I saw on the GH rendering

view this post on Zulip Enrico Tassi (Aug 17 2022 at 08:02):

My only guess is that targets are, even in j1, somehow sorted in a total way (compatible with topological sorting given by the deps). Maybe that order is not super stable, eg based on full paths (I guess you compile in /home/msoegtrop or something, not /runner...), or based on a pseudo randomized hash table... a bit science fiction, but I cannot think at anything else.

view this post on Zulip Enrico Tassi (Aug 17 2022 at 08:04):

The \n could just be a silly setup. I don't have the time myself to check if the platform GH action does stuff like https://github.com/LPCIC/elpi/blob/cc3ddcb19a149c43df2d0910f4cbbffd8ddaa8e4/.github/workflows/main.yml#L46 or https://github.com/LPCIC/elpi/blob/cc3ddcb19a149c43df2d0910f4cbbffd8ddaa8e4/.github/workflows/main.yml#L88

view this post on Zulip Enrico Tassi (Aug 17 2022 at 08:12):

Note that the string "I/O error: %s" comes from ocamlc, not dune directly.

view this post on Zulip Michael Soegtrop (Aug 17 2022 at 16:40):

@Enrico Tassi : mixing forward and backward slashes is quite normal in MinGW builds and rarely causes problems. Windows can natively handle such paths (the file API accepts both / and \ as path separator - just CMD doesn't). But Linux shell expansion would treat \ n as just n.

view this post on Zulip Michael Soegtrop (Aug 17 2022 at 16:41):

And shell usually does not treat \n as lf - only certain commands like tr do. So the \ issue would be for any path, not just for those having components starting with n.

view this post on Zulip Enrico Tassi (Aug 18 2022 at 06:20):

Well, I trust you. But then you have to come up with another theory on why coq is failing ;-)

view this post on Zulip Michael Soegtrop (Aug 19 2022 at 07:32):

It would be good to hear from the dune team if they do anything which does any form of backslash substitution in paths.

view this post on Zulip Ali Caglayan (Aug 23 2022 at 19:21):

I'm not 100% sure, but I think Dune tries to preserve backslashes when they are used. cc @Rudi Grinberg


Last updated: Mar 28 2024 at 08:01 UTC