Stream: Coq devs & plugin devs

Topic: Directory handling code in coqdep


view this post on Zulip Ali Caglayan (Oct 06 2021 at 10:46):

Since #14059 was merged, I wonder how close we are to fixing some coqdep bugs like the following:

cc @Emilio Jesús Gallego Arias

view this post on Zulip Hugo Herbelin (Oct 06 2021 at 15:39):

Ali Caglayan said:

Indeed #14718 fixes both. The status of #14718 is that coqdep is intrinsically dependent on the order in which Unix's stat returns the contents of a directory. So:

Standardizing an order on file names is not so easy. For instance, on the MacOS (old HFS+, but maybe also the new APFS, it is not fully clear to me), file names are subject to a UTF8 normalization, so to have an order which is the same on MacOS and on system without normalization such as Unix (if I'm correct), we need to normalize ourselves. This also probably means adding a dependency on a unicode normalization library (such as uunf.

In the short term, we could remove the tests that depend on the filesystem order and fails and merge, leaving the normalization question for future work.

This is one is probably just about adding

  | "(*"
      { comment lexbuf; comment lexbuf }

in skip_to_dot in tools/coqdep_lexer.mll.

view this post on Zulip Hugo Herbelin (Oct 06 2021 at 15:43):

This is one is probably just about adding
| "(*" { comment lexbuf; comment lexbuf }
in skip_to_dot in tools/coqdep_lexer.mll.

I can confirm that it is enough. Going to submit a PR.

view this post on Zulip Hugo Herbelin (Oct 06 2021 at 15:58):

This is https://github.com/coq/coq/pull/14996.

view this post on Zulip Ali Caglayan (Oct 06 2021 at 16:11):

You mention that coqdep doesn't currently impose an order of compilation to coqc. If we were to do this, would this solve the normalization issue since they would always be in agreement?

view this post on Zulip Paolo Giarrusso (Oct 07 2021 at 01:35):

(deleted)

view this post on Zulip Paolo Giarrusso (Oct 07 2021 at 01:36):

(deleted)

view this post on Zulip Paolo Giarrusso (Oct 07 2021 at 01:45):

Correcting previous comments, I take it that the filename grammar is basically https://coq.inria.fr/refman/language/core/basic.html#grammar-token-ident ? How often do people use non-ASCII letters in the wild?

view this post on Zulip Paolo Giarrusso (Oct 07 2021 at 01:55):

This is probably worth supporting, and it impacts most non-English speaking communities, but I'm not sure if it's a priority. OTOH, is the goal to sort names "alphabetically"? Is normalization enough or would you need https://www.unicode.org/reports/tr10/ to handle this correctly? I hope not, but if the order is exposed to users it'd matter. It might be better to give errors when the order would matter, and ask the user to disambiguate their imports, instead of dealing with ordering.

view this post on Zulip Hugo Herbelin (Oct 07 2021 at 07:52):

Chinese users gladly use ideograms in identifiers, and probably so in filenames.

It might be better to give errors when the order would matter, and ask the user to disambiguate their imports, instead of dealing with ordering.

Yes, giving a warning is basically the current situation and this seems a reasonable choice. We could even produce errors indeed.

Is normalization enough

In case we want a deterministic (not necessarily alphabetic) order rather than a warning, normalization would be enough afaiu.

view this post on Zulip Hugo Herbelin (Oct 07 2021 at 07:53):

Ali Caglayan said:

You mention that coqdep doesn't currently impose an order of compilation to coqc. If we were to do this, would this solve the normalization issue since they would always be in agreement?

I don't think so: without an order the output of coqdep would still be file-system dependent.

view this post on Zulip Hugo Herbelin (Oct 07 2021 at 07:56):

Another alternative would be force a copy of the filename in the file itself. Then we could use this copy of the filename for sorting. But this requires a change of the format of v files. (Or, at worst, we could make it optional, say an optional Library Foo. header, or so.)

view this post on Zulip Ali Caglayan (Oct 07 2021 at 10:10):

Can I get some higher level understanding of what is going on? If we have Foo.v in a project, are you saying that coqdep reads the name differently in MacOS and Linux due to quirks with the file system?

view this post on Zulip Ali Caglayan (Oct 07 2021 at 10:11):

Or is this due to unicode in file names? Can we not just disallow that? We already have problems with files like _foo.v in CoqIDE for exaple AFAIK.

view this post on Zulip Hugo Herbelin (Oct 07 2021 at 10:28):

Yes, it is about Unicode. In a file like Voilà.v, MacOS X encodes à with a combining diacritic even if you originally used a proper à.

view this post on Zulip Hugo Herbelin (Oct 07 2021 at 10:31):

A way to discriminate further without ordering the file names is to order on the length of the missing segment. E.g. if you ask for file from foo and have files bound to foo.bar.file and foo.file in the file system, we give priority to the latter which has a shorter implicit path (here an empty one). We would still have warnings when in the presence of foo.bar.file and foo.bar'.file.

view this post on Zulip Gaëtan Gilbert (Oct 07 2021 at 11:20):

A way to discriminate further without ordering the file names is to order on the length of the missing segment

sounds worth trying

view this post on Zulip Paolo Giarrusso (Oct 07 2021 at 13:17):

is there any “correct” code that relies on ordering? I don’t think that can happen if your requires are unambiguous, right?

view this post on Zulip Paolo Giarrusso (Oct 07 2021 at 13:19):

could one have a make check-path target (or similar) asking coqdep to perform _exhaustive_ search and warn if any require has multiple valid destinations? Ideally that’d happen every time we run coqdep, unless that’s too slow.

view this post on Zulip Paolo Giarrusso (Oct 07 2021 at 13:23):

and I guess that would be too slow once you consider compositional dune builds across whole ecosystem chunks — running coqdep at once on the whole math-comp ecosystem up to the odd order theorem, for instance, seems potentially useful (I’m sure it is for the stdpp-iris ecosystem)

view this post on Zulip Hugo Herbelin (Oct 07 2021 at 17:15):

Paolo Giarrusso said:

is there any “correct” code that relies on ordering?

While testing #14718, I've seen files of same name under the same -Q (in hierarchy-builder, but I vaguely remember a similar situation in another CI-tests development).

I don’t think that can happen if your requires are unambiguous, right?

There was a Coq-club discussion at some time reporting about the ambiguity of Require. I realize that adopting the convention to give preference to the shorterst hidden segment would solve it (since there is at most one implicit segment of length 0).

view this post on Zulip Hugo Herbelin (Oct 07 2021 at 17:17):

Paolo Giarrusso said:

could one have a make check-path target (or similar) asking coqdep to perform _exhaustive_ search and warn if any require has multiple valid destinations? Ideally that’d happen every time we run coqdep, unless that’s too slow.

I think the coqdep warnings are already triggered exhaustively for all ambiguous Require.

view this post on Zulip Emilio Jesús Gallego Arias (Oct 07 2021 at 18:29):

IMHO we should be independent from the order, meaning that if the user specifies an ambigous import we should error [after a transition phase]

view this post on Zulip Emilio Jesús Gallego Arias (Oct 07 2021 at 18:30):

@Ali Caglayan regarding your question, IMHO the first step is define a library coq.loadpath that would unify the loadpath API; a problem is that DirPath.t is in the kernel, and you don't it here, however the surgery is not trivial as DirPath itself depends on Id.t which has some heavy stuff.

view this post on Zulip Emilio Jesús Gallego Arias (Oct 07 2021 at 18:32):

I wonder what would be the best way to fix that, but IMHO even a PR that unfies both even at the cost of having coqdep depend on coq.kernel would be very awesome; we could surely remove that dependency as an additional commit later on; there is also some ideas by @Gaëtan Gilbert in a Zulip thread some time ago , see https://coq.zulipchat.com/#narrow/stream/237656-Coq-devs.20.26.20plugin.20devs/topic/LoadPath.20.20library.20organization , so IMHO we can do it.

I'll cook a quick PR

view this post on Zulip Emilio Jesús Gallego Arias (Oct 07 2021 at 18:56):

https://github.com/coq/coq/pull/15002 is a POC, but needs quite a bit of work

view this post on Zulip Paolo Giarrusso (Oct 07 2021 at 22:09):

While testing #14718, I've seen files of same name under the same -Q (in hierarchy-builder, but I vaguely remember a similar situation in another CI-tests development).

having a.b.foo and a.c.foo does not mean “rely on ordering” if the user never uses From a Require foo. If they have a.b.foo twice, that’s worth reporting and fixing.

view this post on Zulip Paolo Giarrusso (Oct 07 2021 at 22:16):

and looking at the coq-club thread, I wonder if, instead of From’s semantics, one could add adapter modules with shorter names to the stdlib. Say, have Coq.List re-export Coq.Lists.List (or viceversa). @Hugo Herbelin , this might not even introduce ambiguity under your shortest-path idea. OTOH, it might be too late now.

view this post on Zulip Hugo Herbelin (Oct 08 2021 at 10:35):

Paolo Giarrusso said:

One could add adapter modules with shorter names to the stdlib. Say, have Coq.List re-export Coq.Lists.List (or viceversa). Hugo Herbelin , this might not even introduce ambiguity under your shortest-path idea. OTOH, it might be too late now.

I like the idea. No need to remove From actually.


Last updated: Feb 05 2023 at 19:29 UTC