Since #14059 was merged, I wonder how close we are to fixing some coqdep bugs like the following:
From
From
cc @Emilio Jesús Gallego Arias
Ali Caglayan said:
- coqdep ignoring
From
- https://github.com/coq/coq/issues/14539
- https://github.com/coq/coq/issues/11631
- It seems that https://github.com/coq/coq/pull/14718 is aiming to fix this, but progress looks to have stalled.
Indeed #14718 fixes both. The status of #14718 is that coqdep
is intrinsically dependent on the order in which Unix's stat
returns the contents of a directory. So:
Standardizing an order on file names is not so easy. For instance, on the MacOS (old HFS+, but maybe also the new APFS, it is not fully clear to me), file names are subject to a UTF8 normalization, so to have an order which is the same on MacOS and on system without normalization such as Unix (if I'm correct), we need to normalize ourselves. This also probably means adding a dependency on a unicode normalization library (such as uunf.
In the short term, we could remove the tests that depend on the filesystem order and fails and merge, leaving the normalization question for future work.
- coqdep getting confused by comments with
From
This is one is probably just about adding
| "(*"
{ comment lexbuf; comment lexbuf }
in skip_to_dot
in tools/coqdep_lexer.mll
.
This is one is probably just about adding
| "(*" { comment lexbuf; comment lexbuf }
inskip_to_dot
intools/coqdep_lexer.mll
.
I can confirm that it is enough. Going to submit a PR.
This is https://github.com/coq/coq/pull/14996.
You mention that coqdep doesn't currently impose an order of compilation to coqc. If we were to do this, would this solve the normalization issue since they would always be in agreement?
(deleted)
(deleted)
Correcting previous comments, I take it that the filename grammar is basically https://coq.inria.fr/refman/language/core/basic.html#grammar-token-ident ? How often do people use non-ASCII letters in the wild?
This is probably worth supporting, and it impacts most non-English speaking communities, but I'm not sure if it's a priority. OTOH, is the goal to sort names "alphabetically"? Is normalization enough or would you need https://www.unicode.org/reports/tr10/ to handle this correctly? I hope not, but if the order is exposed to users it'd matter. It might be better to give errors when the order would matter, and ask the user to disambiguate their imports, instead of dealing with ordering.
Chinese users gladly use ideograms in identifiers, and probably so in filenames.
It might be better to give errors when the order would matter, and ask the user to disambiguate their imports, instead of dealing with ordering.
Yes, giving a warning is basically the current situation and this seems a reasonable choice. We could even produce errors indeed.
Is normalization enough
In case we want a deterministic (not necessarily alphabetic) order rather than a warning, normalization would be enough afaiu.
Ali Caglayan said:
You mention that coqdep doesn't currently impose an order of compilation to coqc. If we were to do this, would this solve the normalization issue since they would always be in agreement?
I don't think so: without an order the output of coqdep
would still be file-system dependent.
Another alternative would be force a copy of the filename in the file itself. Then we could use this copy of the filename for sorting. But this requires a change of the format of v files. (Or, at worst, we could make it optional, say an optional Library Foo.
header, or so.)
Can I get some higher level understanding of what is going on? If we have Foo.v
in a project, are you saying that coqdep reads the name differently in MacOS and Linux due to quirks with the file system?
Or is this due to unicode in file names? Can we not just disallow that? We already have problems with files like _foo.v
in CoqIDE for exaple AFAIK.
Yes, it is about Unicode. In a file like Voilà.v
, MacOS X encodes à
with a combining diacritic even if you originally used a proper à
.
A way to discriminate further without ordering the file names is to order on the length of the missing segment. E.g. if you ask for file
from foo
and have files bound to foo.bar.file
and foo.file
in the file system, we give priority to the latter which has a shorter implicit path (here an empty one). We would still have warnings when in the presence of foo.bar.file
and foo.bar'.file
.
A way to discriminate further without ordering the file names is to order on the length of the missing segment
sounds worth trying
is there any “correct” code that relies on ordering? I don’t think that can happen if your requires are unambiguous, right?
could one have a make check-path
target (or similar) asking coqdep to perform _exhaustive_ search and warn if any require has multiple valid destinations? Ideally that’d happen every time we run coqdep, unless that’s too slow.
and I guess that would be too slow once you consider compositional dune builds across whole ecosystem chunks — running coqdep at once on the whole math-comp ecosystem up to the odd order theorem, for instance, seems potentially useful (I’m sure it is for the stdpp-iris ecosystem)
Paolo Giarrusso said:
is there any “correct” code that relies on ordering?
While testing #14718, I've seen files of same name under the same -Q
(in hierarchy-builder, but I vaguely remember a similar situation in another CI-tests development).
I don’t think that can happen if your requires are unambiguous, right?
There was a Coq-club discussion at some time reporting about the ambiguity of Require
. I realize that adopting the convention to give preference to the shorterst hidden segment would solve it (since there is at most one implicit segment of length 0).
Paolo Giarrusso said:
could one have a
make check-path
target (or similar) asking coqdep to perform _exhaustive_ search and warn if any require has multiple valid destinations? Ideally that’d happen every time we run coqdep, unless that’s too slow.
I think the coqdep warnings are already triggered exhaustively for all ambiguous Require
.
IMHO we should be independent from the order, meaning that if the user specifies an ambigous import we should error [after a transition phase]
@Ali Caglayan regarding your question, IMHO the first step is define a library coq.loadpath
that would unify the loadpath API; a problem is that DirPath.t
is in the kernel, and you don't it here, however the surgery is not trivial as DirPath
itself depends on Id.t
which has some heavy stuff.
I wonder what would be the best way to fix that, but IMHO even a PR that unfies both even at the cost of having coqdep
depend on coq.kernel
would be very awesome; we could surely remove that dependency as an additional commit later on; there is also some ideas by @Gaëtan Gilbert in a Zulip thread some time ago , see https://coq.zulipchat.com/#narrow/stream/237656-Coq-devs.20.26.20plugin.20devs/topic/LoadPath.20.20library.20organization , so IMHO we can do it.
I'll cook a quick PR
https://github.com/coq/coq/pull/15002 is a POC, but needs quite a bit of work
While testing #14718, I've seen files of same name under the same
-Q
(in hierarchy-builder, but I vaguely remember a similar situation in another CI-tests development).
having a.b.foo
and a.c.foo
does not mean “rely on ordering” if the user never uses From a Require foo
. If they have a.b.foo
twice, that’s worth reporting and fixing.
and looking at the coq-club thread, I wonder if, instead of From
’s semantics, one could add adapter modules with shorter names to the stdlib. Say, have Coq.List
re-export Coq.Lists.List
(or viceversa). @Hugo Herbelin , this might not even introduce ambiguity under your shortest-path idea. OTOH, it might be too late now.
Paolo Giarrusso said:
One could add adapter modules with shorter names to the stdlib. Say, have
Coq.List
re-exportCoq.Lists.List
(or viceversa). Hugo Herbelin , this might not even introduce ambiguity under your shortest-path idea. OTOH, it might be too late now.
I like the idea. No need to remove From
actually.
Last updated: Jun 08 2023 at 04:01 UTC