So many of the Coq projects I use/maintain spread out .v
over several directories, but still use the same logical path for all those directories. For example, the Cheerios
logical path has .v
files in both the core
and extraction
directories (which are siblings). With _CoqProject
, this is easy to set up. However, it seems this design is currently not possible with dune? I get stuff like:
file "extraction/dune", line 2, characters 7-15:
2 | (name Cheerios)
^^^^^^^^
Error: Duplicate theory name: Cheerios
is the only solution to assign different paths like Cheerios.Core
and Cheerios.Extraction
?
I think so, yes.
That would be possible, however you get into tricky semantics w.r.t. to merging
so I'm not sure it would be worth it
actually Cheerios is the first project I see that is setup like that
others that are setup like that: Verdi, Verdi Raft
Another argument against merging is what happens when the name collision is not intentional
for example when composing
we could add a (merge)
field maybe for these cases
so I dunno
I'm OK trying to support it if you think that's useful
but I need more coffee to think about it, it seems scary on first thought as esentially you would move from a CWA for (coq.theory)
to an open world
and that could have very interesting consequences
Also that's the issue of flags conflicts etc...
another approach for this problem is to allow aliases , remapping
another choice is to implement (include_subdirs unqualified)
for Coq [not sure Coq will like it tho
In fact, we were thinking of making a double binding in-Q/-R
an error in Coq
I believe it is already a warning in coq_makefile
BTW, isn't it?
As indeed there is the problem of module mapping [but that can be solved in Dune] tho
So, I think there are many ways to address this setup.
Good point @Théo Zimmermann , I am not sure what the status w.r.t. is
two other projects not by me that use the same namespace for several dirs: elliptic-curves-ssr and Toychain
@Emilio Jesús Gallego Arias I've come to the conclusion that having the same namespace/logpath across several physical directories would be really useful, partly because it doesn't force additional naming issues just because directory partitioning makes sense. Should I open dune issue?
I think we still need a convincing story on how to handle (coq.theory ...)
working on an open world assumption safely. This seems far from trivial I think. Personally, I'm not sure the complexity or merging physical namespaces into a single logical one is worth it. What do other languages do? [for example Haskell, Rust]
if there are no module /.v
-file naming collisions, I'm not sure I see issues for Coq? Just take union?
What about conflicts?
How does dune know then two things are supposed to be merged, vs when two things collide by accident?
Is Coq the only language doing this? Some parts of Coq are unnecessarily complex due to this merging feature, and I remain unconvinced the tradeoff is good enough
The model where one logical path corresponds to a single directory has many advantages, and so far only a quite minor annoyance
Very large codebases have been built on this model
So indeed I'm not opposed to supporting directory merging, but it is more complex than it looks.
you could just say: conflicts not allowed, exception
I dunno, I guess we can discuss more on the dune issue, but I would still need to see a more concrete proposal [you don't only have to merge dirs, but also stanzas] , and if Coq is the only lang that wants this, it makes me wonder a bit about that. The OCaml mode has something similar with (include_subdirs unqualified)
but it didn't go very well AFAICT and indeed it is scheduled for deprecation.
In a general sense, it seems to me that this feature would allow to have two conflicting views of the codebase: the filesystem, and the virtual Coq FS.
Why to duplicate this information?
There may be good reasons sometimes, to alias or merge stuff, but if it turns out that your physical layout is not good enough
shouldn't you correct that in the first place instead of adding yet another fully addressable layer?
For me directories are not different from files
we require a module name to match its filename, this helps to avoid duplication of information, and helps obviously with consistenc
so it makes sense to me that Foo.Bar.baz
should map to Foo/Bar/baz.v
modulo the problem with modules in modules, but that's a different issue
The issue is even more complicated as binding needs to happen in sub-trees
for example, mathcomp.ssreflect
and mathcomp.fingroup
So I dunno, it seems to me that instead of fostering the current way Coq handles its namespace, we should instead strongly move away from it, and design a better namespacing system
where use cases such as extending a namespace are better specified
SML doesn't enforce module/file being tied together
I tend to view Coq namespaces as something very loosely connected to physical file layout
Karl Palmskog said:
SML doesn't enforce module/file being tied together
How are module refs mapped to object files in SML?
it's compiler-defined
Karl Palmskog said:
I tend to view Coq namespaces as something very loosely connected to physical file layout
So that's my point, we already have a quite nice namespace called the file system, why does Coq need its own?
Poly/ML for example, allows one to declare modules almost anywhere (multiple per file), as long as you point the compiler to all files that declare something it will work out
Karl Palmskog said:
it's compiler-defined
Uhh that seems... not nice. So well, how do SML compilers perform that step?
Karl Palmskog said:
Poly/ML for example, allows one to declare modules almost anywhere (multiple per file), as long as you point the compiler to all files that declare something it will work out
which means you have to rescan the whole fs on an incremental build?
see http://adam.chlipala.net/mlcomp/ "Separate compilation conventions"
I think people use mostly make to build SML?
What says is "concatenate all files together"
that doesn't seem something we should take inspiration from I think
but it shows that the filesystem vs. abstract organization do not have to tied, and are not in big serious systems like SML. Another annoying thing about enforcing the directory tie is that namespaces become frozen, so no outside extensions
Another annoying thing about enforcing the directory tie is that namespaces become frozen, so no outside extensions
Right! Extensible namespaces are an interesting use case, although precisely the one that Emilio was most afraid about (I think).
Not sure I see what's interesting about SML , in fact the more I read the more I don't want to do many things like them
Indeed I mentioned before extensible namespaces
but Coq doesn't have namespaces now
I mean, arguably most of what we do is based on (S)ML is some way
How so?
I see cakeml module support : very basic
I see SML, it seems it build by concantenating all files
On the other hand OCaml is used at way larger scale
and indeed they are quite a few problems to solve
you mean, Coq's namespaces are not called "namespaces"? But that's how we use them, at least at the top level
regarding module organization, etc..., abstract vs implemented libraries
Coq doesn't have namespaces
it has the ability to bind a physical path to a logical pat
yes, and the top level, this is what anyone would call a namespace
they lack very basic things a namespace can do
for example, open mathcomp.*
etc...
Coq doesn't even have the notion of library, much less namespaces (as first class objects)
library in the sense of a group of modules
it is a very simple system, with many cases underspecified
you could implement namespaces using modules tho
but for some other reasons that fails [modules are inherently CWA]
I don't think there's an agreed definition of what a CS namespace "should" provide
well indeed, as for many other things, modules, filesystems, etc...
in fact a filesystem is a namespace
all I'm saying is, the current use of -R
and -Q
have allowed using multiple mappings to the same logpath, and there are several big projects that use this, and I personally I find it useful
I find that feature problematic, as it works today
moreover I have trouble understanding that you want (in Coq) Require A.B A.C
however in the FS you want dir1/B.v
and dir2/C.v
if you think so, and if dune is to replace coq_makefile, and this is not going to be supported in dune, I think the right thing to do is deprecate it in Coq right now
so I typically want depth of 1 in the file directory structure in my Coq projects
this means there are a bunch of directories without subdirectories at the root
I'd rather try to understand what we need, we are at an early stage
I think the right thing to do is deprecate it in Coq right now
But again, isn't it already the case? I thought coq_makefile
raised a warning when you pass it multiple -R
or -Q
.
and the current system just never had what I'd call a principled design
I thought coq_makefile raised a warning when you pass it multiple -R or -Q.
hmm, I don't think so, but would have to check
for example extension is needed
almost for sure
(coq.theory
(name Monad2)
(extends Extlib)
but let's say I absolutely want depth 1 in project. Now you're forcing me to come up with a Project.X
, Project.Y
, etc., for all my directories.
Karl Palmskog said:
but let's say I absolutely want depth 1 in project. Now you're forcing me to come up with a
Project.X
,Project.Y
, etc., for all my directories.
why you want a different view in Coq vs in the FS?
generally, I want my top-level directories to be named after the "kind" of thing they contain, like theories
, src
, plugin
, tactic
, etc., which doesn't work well as Coq logpath fragments
In fact, in the future it would be no so hard to imagine a system where Coq doesn't reveal you what the FS is
what you do in this case?
the IDE only offers you to create particular modules
and you don't know how Coq is storing them?
when I organize files, I think of my own needs, when I design logpaths, I think more about clients (package system, other users)
if I want a certain logpath design at install time, why does my current file layout have to dictate everything?
Karl Palmskog said:
generally, I want my top-level directories to be named after the "kind" of thing they contain, like
theories
,src
,plugin
,tactic
, etc., which doesn't work well as Coq logpath fragments
Yup, I understand that part, but saving one extra subdir seems still a bad tradeoff in terms of complexity vs utility.
I'm open to discuss more on the Dune issue, but so far a few of the questions I have on how the thing is supposed to work IMHO need to be addressed
Karl Palmskog said:
if I want a certain logpath design at install time, why does my current file layout have to dictate everything?
Because as a matter of code complexity, we already have a namespace system which is the filesystem, so to add a few hundreths lines of code to the codebase, which complex specification, I'd personally need a better use case.
I like simplicity, for me, if I can look at the FS and know how the module structure is going to be, that's a plus
A big plus, one abstraction less. Convention over configuration
So it is a matter of tradeoffs IMO
I think the issue here is not about whether convention is preferable over configuration, but whether configuration should be at all possible. Dropping the possibility of configuration is a big deal IMO
If you allow configuration, I have to perform the abstraction step itself
I don't see the big deal here
many other systems do require a particular fs layout if you want your modules that way
and that's a good idea
imagine the hell otherwise
with every person imagining their own creative way to arrange stuff
now I have to parse all build files
to see where A.B.C
is
and IDEs have a harder time, etc...
what does cargo do in this sense
?
Karl Palmskog said:
I think the issue here is not about whether convention is preferable over configuration, but whether configuration should be at all possible. Dropping the possibility of configuration is a big deal IMO
I think that configuration should still be possible. The way people get around these limitations in dune is using subdirs
and copy_files
. It's not really a great way to do configuration, but you should be able to maintain w/e file system structure you want, provided that you tell dune to translate it into a structure it can understand.
Rudi Grinberg said:
The way people get around these limitations in dune is using
subdirs
andcopy_files
. It's not really a great way to do configuration, but you should be able to maintain w/e file system structure you want, provided that you tell dune to translate it into a structure it can understand.
I'm not fully sure I understand the manual here, so let's take Verdi Raft as an example. The .v
files are spread out across the raft
, raft-proofs
and systems
directories. If I want to these files be built as if they all lived in the same directory (either the root directory or a theories
subdirectory not in version control) and all become mapped to the Coq logpath VerdiRaft
, would I most directly/conveniently use subdirs
? Or copy_files
?
Try doing it with just copy_files
. You will find that this works fine, but the amount of dune files in otherwise empty directories with only copy stanzas will annoy you. At that point, you can just move re-org everything into one dune file using subdirs
.
Indeed there is copy_files, yup, that's a workaround/hack [you copy the files to a single dir] A bit ugly tho.
I had a very brief look at Cargo / NPM and they seem to strongly assume that module paths are filesystem paths. I dunno, but indeed the discussion is interesting and I suggest we move it upstream; even if we decide to stick to view closer to the actual FS, there are still a few cases where this is not enough [for example namespace extension] Another thing to look at is OCaml's own plans to support (include_subdirs qualified)
Emilio Jesús Gallego Arias said:
indeed the discussion is interesting and I suggest we move it upstream
So you think I should open a GitHub issue (feature request)?
Actually I've been thinking of that, and indeed I think you are right in that we should maybe first think the model in Coq, and if we want to support path rebiding in Coq we should also support it in Dune ; my current option is to use (extends LogicalPath)
to be used in place of theory, so indeed two (theories ...)
will still clash.
I think we had an "implicit" idea in Coq to move toward deprecating the double-binding, but indeed it is not clear that would be too restrictive.
So maybe indeed we should open an issue on Coq first, as to see what the roadmap for load paths is? I was planning to do this once we had common code for LoadPath handling, as of today such code is duplicated. What do you think?
OCaml is also working on revamping their loadpath setup by the way.
right now, there is no warning when passing, e.g., -Q core,Cheerios -Q extraction,Cheerios
in _CoqProject
@Emilio Jesús Gallego Arias sure, we can open a Coq issue, but maybe easiest if you create it in that case, I can definitely fill in examples and so on if you ping me in
as mentioned above, I know 5+ large-ish projects that use this [logpath merging]
Karl Palmskog said:
right now, there is no warning when passing, e.g.,
-Q core,Cheerios -Q extraction,Cheerios
in_CoqProject
Indeed, there is no warning for that case, tho it is handled a bit inconsistently over the different copies of the loadpath code
Ok, I will open the upstream issue; actually it was already on my todo list; I've added your comments about Dune there.
@Karl Palmskog so to put my notes in order, do you think you can live in the short term with (copy_files ...)
for that issue, or should we get a workaround in the next Dune release?
I managed to get the Coq part of Cheerios to work locally with copy_files
, although it's a bit ugly. So it's not an urgent thing to address from my side.
this also means Verdi and Verdi Raft should be portable to dune right now, which was my short-term goal.
so in conclusion, I think the theory composition issue is way more important to address in an upcoming release than this one
I agree, main blocker is that I didn't fully figure out the interaction of profiles, modes (native), and composition; need a feew cycles
On the other hand allowing the multiple binding is easy so it could be done; that worries me more in the sense that we already considered that feature a bit hacky so I'm way of supporting it and a lot of people relying on it just for us to realize we'd like to implement a different way of organizing stuff; in a sense I try to be very conservative introducing features as they should be supported for a reasonable time once in
for the long term, I think there are only two choices: (1) deprecate multiple binding via coqc (and don't add feature to dune) or (2) support multiple binding in dune (and don't change coqc). The remaining options don't make sense to me.
There is a 3rd option which would include a full redesign of how Coq handles libs
For example with interfaces, etc...
Just run into "Duplicate theory name". The error is pretty obscure — it should point to _all_ locations involved, and there should be some explanation of how the conflict is detected — or some way to debug it.
In my case, it turned out the dune
file appeared in multiple locations, due to a stray symlink.
For extra confusion, this project _does_ rely on merging, but across different opam packages. That might or might not be confusing dune (I get some strange errors, but some seem my fault and not dune)
@Paolo Giarrusso I would need more deatisl to see what you mean; I'll improve the overload conflict message
note that as noted in the manual composition doesn't yet work across package scopes
that's a PITA
it took a while to fix for the reasons outlined in the thread about substitution of installed libraries
for now, my testcase on the error is that ln -s theories foo; dune build
gives
File "theories/dune", line 3, characters 7-14:
3 | (name bedrock) ; This will determine the toplevel
^^^^^^^
Error: Duplicate theory name: bedrock
for the rest, I don't get the details too well myself — I know I have two projects a and b, with b depending on a, and which both use -Q theories bedrock
in their _CoqProject, but have no collisions
I was adding dune support to an existing build system so I didn't even realize this symlink existed
for the rest, I don't yet know enough dune to follow, but some day I'll learn :sweat_smile:
@Paolo Giarrusso have you tried using copy_files
, e.g., in theories/dune
:
(copy_files ../project1/*.v)
(copy_files ../project2/*.v)
(coq.theory
(name bedrock)
(package coq-bedrock))
Thanks @Karl Palmskog; I'll keep that in mind, but it might not be needed (at least for building those separately). I might try symlinks instead...
Paolo Giarrusso said:
Thanks Karl Palmskog; I'll keep that in mind, but it might not be needed (at least for building those separately). I might try symlinks instead...
FYI, copy_files
will use symlinks when possible. Its preferable than doing it manually as it will make sure your build will work on lesser OS's like Windows.
@Karl Palmskog have you managed to build https://github.com/uwplse/verdi-raft or https://github.com/uwplse/cheerios with dune? I was looking at those for inspiration, but didn't find dune rules
meanwhile, I've tried copy_files, but I haven't found a way to make it recursive... from the suggestions using subdir
above, I came up with the following
(copy_files ../cpp2v-core/theories/*)
(subdir lang
(copy_files ../../cpp2v-core/theories/lang/*)
(subdir cpp
(copy_files ../../../cpp2v-core/theories/lang/cpp/*)
(subdir syntax
(copy_files ../../../../cpp2v-core/theories/lang/cpp/syntax/*))
(subdir logic
(copy_files ../../../../cpp2v-core/theories/lang/cpp/logic/*))
(subdir semantics
(copy_files ../../../../cpp2v-core/theories/lang/cpp/semantics/*))))
Sadly, this works, but this can't be the currently correct way, can it?
(not "sadly", since it seems to work really well)
@Paolo Giarrusso I'm still working on the OCaml build for Cheerios and Verdi Raft unfortunately, it turned out to be really complicated. The Coq parts build just fine (but no dune
files public yet)
I think I will commit what I have in some Cheerios branch in the near future
if you want to see some Coq+OCaml+extraction project using Dune, Stalmarck is my current best showcase: https://github.com/coq-community/stalmarck -- but it avoids completely the "same logical path in several directories" problem
Paolo Giarrusso said:
meanwhile, I've tried copy_files, but I haven't found a way to make it recursive... from the suggestions using
subdir
above, I came up with the following(copy_files ../cpp2v-core/theories/*) (subdir lang (copy_files ../../cpp2v-core/theories/lang/*) (subdir cpp (copy_files ../../../cpp2v-core/theories/lang/cpp/*) (subdir syntax (copy_files ../../../../cpp2v-core/theories/lang/cpp/syntax/*)) (subdir logic (copy_files ../../../../cpp2v-core/theories/lang/cpp/logic/*)) (subdir semantics (copy_files ../../../../cpp2v-core/theories/lang/cpp/semantics/*))))
Sadly, this works, but this can't be the currently correct way, can it?
Yeah, that's quite ugly. Could you make a ticket for this? We should think of a way to make copy_files smarter than this.
"implement copy_files_recursive
"? "add support for **
-style wildcards?"
ah, https://github.com/ocaml/dune/issues/3387 exists
meanwhile, luckily other people like disjoint logical paths as well, which lets me avoid the above.
Last updated: Jun 04 2023 at 23:30 UTC