ah, should we do another type classes vs. canonical structures thread? In my opinion, FMap/MSet is the current extraction and computing-in-Coq winner, and finmap is the ease-of-reasoning winner...
Rather than rehash a debate — does the use of canonical structures/type classes/modules affect the extraction result significantly?
My prior is that the two things are independent — except that functor applications might be monomorphized during extraction. Are they?
my experience is that type classes make extracted code into an unreadable mess, with lots of indirection. FMap/MSet was designed to extract well from the start. Canonical structures extract terribly, with lots of nesting, but this could be addressed as done here: https://doi.org/10.1016/j.scico.2019.102372
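To make the FMap/MSet point concrete, here is a minimal sketch (my own example, not from the thread) of the module-functor style: a single functor application, which extracts to a plain OCaml module with no dictionary indirection.

```coq
(* Minimal use of the stdlib's FMap interface via a module functor. *)
Require Import FMapAVL OrderedTypeEx.

Module M := FMapAVL.Make(Nat_as_OT).

Definition m : M.t bool := M.add 1 true (M.empty bool).

(* find returns an option; here it computes inside Coq as well. *)
Compute M.find 1 m.

(* Extraction yields an OCaml functor application:
   Require Extraction. Extraction M. *)
```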
FWIW, here's a pointer on stdpp's potential performance issues: https://gitlab.mpi-sws.org/iris/stdpp/-/issues/64. To be clear, I find them extremely well-designed for proofs (which is all I've actually needed).
I'm still curious to know whether the Lean 4 type class algorithm would work well in Coq. The authors claim it would.
I would have liked to do some kind of empirical study on modules vs. type classes vs. canonical structures, covering productivity, performance of extracted code, proof search time, extensibility cost, etc. I feel the topic is full of high-level contradicting claims and anecdotes without much data.
Talia and I wrote up some questions for Derek Dreyer on what he thought people should use, but his answers were quite vague -- from my recollection he didn't think there was a clear winner (even though he has published so much on all of modules/CS/TC)
I would probably start with the simplest problem: data mining on a Coq dataset to see what people actually use and how. Could start with recording "Module Type" vs. "Canonical" vs. "Instance"/"Class".
Milos Gligoric and I have a related NSF proposal under submission, but regrettably it doesn't seem like it will be funded (should have heard by now). But if anyone is in a position to send in proposals along these lines to relevant funding agencies I'd be happy to contribute -- at least a student or two doing engineering work would be necessary for scaling.
@Karl Palmskog Someone (@Maxime Dénès ?) has done an experiment with reimplementing math-comp using type classes.
I believe it's in the wiki somewhere. The experiment should be reevaluated with the new Lean 4 type class algorithm.
Here's the experiment:
https://github.com/coq/stdlib2/wiki/EqClass
https://github.com/coq/coq/wiki/Typeclasses-and-Canonical-Structures
Also (resonates with modules in agda)
https://github.com/coq/coq/wiki/RecordsNotModules
if I remember correctly Vincent Laporte worked on the experiment as well, and it was presented in a Coq working group
Yes, the work was done by Vincent.
Unfortunately the data is going to be hard to analyze since Modules in Coq play many roles.
You can't seal a type without a module. You can't have two constants with the same name unless you put them in different modules. You can't limit the scope of a thing (notation, TC instance...) without putting it in a module. In MathComp, modules are used for almost everything *but* abstracting/reusing code, which is my understanding of the purpose of modules in PL.
Also, Coq modules were designed around (read: copied from) OCaml modules, exactly to have a simple (one-to-one) extraction procedure. That is surely a plus if it is what you want, but it is not really a feature of "modules as a concept in PL". If you take a module system that does not trivially map to OCaml, you quickly lose this (or you extract to Haskell or ...).

I mean, if we had a magic wand, the module system in Coq would be quite different. E.g. the dichotomy between ADTs and opaque types, annoying in PL (hence the translucent thing), is even more problematic in type theory, where being able to see the AST means you can compute and hence type check more terms. When modules were added, the grand plan was to later add computation rules to module types, so that one could seal a type but keep some computations. It never happened; there is some theory about CC + rewrite rules and some prototypes like CoqMT, but nothing landed in the module system. I'm not aware of any work on CC + rewrite rules and extraction. So, if we had the module system we want, I've no idea if extraction would just work for it.
My2C
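The sealing point above can be sketched in a few lines (my own toy example): an opaque module ascription hides the body, so sealed definitions no longer compute, which is exactly what the never-realized "computation rules on module types" plan was meant to fix.

```coq
Module Type COUNTER.
  Parameter t : Type.
  Parameter zero : t.
  Parameter incr : t -> t.
End COUNTER.

(* Opaque ascription: only the signature is visible outside. *)
Module Counter : COUNTER.
  Definition t := nat.
  Definition zero := 0.
  Definition incr := S.
End Counter.

(* Counter.t is abstract, so this no longer reduces to a numeral:
     Compute Counter.incr Counter.zero.
   Computation rules attached to COUNTER would let one seal t
   while keeping the incr/zero reductions. *)
```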
Do you know if anything was written (even informally) on that idea of adding reduction rules to modules (to interfaces in particular, I guess)? It sounds like a rather natural idea for modularity in CIC, and I have been playing around with rewrite rules in Agda doing something similar.
I'm not aware. Let's summon @Hugo Herbelin , I'm sure he knows more.
The theory of Coq modules was developed in Judicaël Courant's PhD-thesis. I don't recall rewrite rules in there, but it's been a very long time since I looked.
Adding reduction rules with modules was considered/dreamt at some time. In addition to Judicaël, typical people interested at that time (late 90's, early 00's) were Jean-Christophe, Jacek Chrąszcz, Frédéric Blanqui, Daria Walukiewicz (see e.g. https://arxiv.org/abs/0806.1749), later also Florent Kirchner and Olivier Hermant, iirc. It comes with interesting but difficult problems though (how to ensure confluence/termination? do we restrict to declaring only beta-like rules? how to connect the implementation and the interface when what is definitional in the implementation is not in the interface and vice-versa?).
There is indeed also the work on CC+ rewrite rules by Blanqui-Fernandez-Jouannaud, later Strub, on the calculus of algebraic constructions (and probably other).
I looked into all publications related to Coq modules since we were interested in it for trustworthiness reasons, here is what I found in chronological publication order (did not include Courant's thesis since it seems superseded by the JFP article in 2007):
@inproceedings{Chrzaszcz2003,
  author    = {Chrz{\k{a}}szcz, Jacek},
  editor    = {Basin, David and Wolff, Burkhart},
  title     = {Implementing Modules in the {Coq} System},
  booktitle = {Theorem Proving in Higher Order Logics},
  year      = {2003},
  publisher = {Springer Berlin Heidelberg},
  address   = {Berlin, Heidelberg},
  pages     = {270--286},
  isbn      = {978-3-540-45130-3},
  doi       = {10.1007/10930755_18}
}
@inproceedings{Chrzaszcz2004,
  author    = {Chrz{\k{a}}szcz, Jacek},
  editor    = {Berardi, Stefano and Coppo, Mario and Damiani, Ferruccio},
  title     = {Modules in {Coq} Are and Will Be Correct},
  booktitle = {Types for Proofs and Programs (TYPES)},
  year      = {2004},
  publisher = {Springer Berlin Heidelberg},
  address   = {Berlin, Heidelberg},
  pages     = {130--146},
  isbn      = {978-3-540-24849-1},
  doi       = {10.1007/978-3-540-24849-1_9}
}
@article{Courant2007,
  author    = {Courant, Judica{\"e}l},
  title     = {A module calculus for Pure Type Systems},
  journal   = {Journal of Functional Programming},
  volume    = {17},
  number    = {3},
  year      = {2007},
  publisher = {Cambridge University Press},
  pages     = {287--352},
  doi       = {10.1017/S0956796806005867}
}
@inproceedings{Soubiran2009,
  author    = {Soubiran, Elie},
  title     = {A Unified Framework and a Transparent Name-space for the {Coq} Module System},
  booktitle = {Workshop on Modules and Libraries for Proof Assistants},
  series    = {MLPA '09},
  year      = {2009},
  location  = {Montreal, Canada},
  pages     = {38--45},
  publisher = {ACM},
  address   = {New York, NY, USA},
  isbn      = {978-1-60558-954-1},
  doi       = {10.1145/1735813.1735820}
}
@phdthesis{Soubiran2010,
  author = {Soubiran, Elie},
  title  = {Modular development of theories and name-space management for the {Coq} proof assistant},
  school = {Ecole Polytechnique X},
  year   = {2010},
  month  = sep,
  url    = {https://tel.archives-ouvertes.fr/tel-00679201}
}
on the topic of type classes vs canonical structures, we are seeing some *enormous* perf wins (40% for entire developments, 70% for individual files) by merging two layers of our canonical-structure-based hierarchy. So there are definitely some subtle perf cliffs not just with type classes.
This topic was moved by Théo Zimmermann to #Hierarchy Builder devs & users > Hierarchy builders and type classes
IIRC, the problem with deep hierarchies is well-known to the developers of the mathcomp libraries (e.g., it's one of the reasons why one cannot simply insert a semigroup structure into the current hierarchy, even though semigroups occur pervasively in CS). My understanding is that fixing this would significantly increase the (already significant) boilerplate required — hence the development of the Hierarchy Builder tool to automate it. @Enrico Tassi , @Cyril Cohen , or @Kazuhiko Sakaguchi can probably elaborate on how the HB-generated hierarchies differ from the current mathcomp hierarchy.
The "depth" of the hierarchy matters for perf. HB should solve some of the problems that come from that. In particular, the "class" projection chain has constant length with HB, while it is proportional to the hierarchy depth in MC. Indeed, when doing things by hand, to avoid copy-pasting an existing class you mention it in the subclass's class record (generating a sort of telescope), while HB can copy its contents (being a program, it does not get bored).
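A hand-rolled sketch of the telescope shape described above (hypothetical names, not MC's actual hierarchy): each class record references its base class, so reaching a field at the bottom takes one projection per hierarchy level.

```coq
(* Hand-written packed classes: base-class chains grow with depth. *)
Record semigroup_class (T : Type) := SemigroupClass {
  op : T -> T -> T
}.

Record monoid_class (T : Type) := MonoidClass {
  monoid_base : semigroup_class T;  (* one hop to reach op *)
  unit : T
}.

Record group_class (T : Type) := GroupClass {
  group_base : monoid_class T;      (* two hops to reach op *)
  inv : T -> T
}.

(* The projection chain to op has length = hierarchy depth.
   HB instead flattens the fields, keeping the chain constant. *)
Definition group_op (T : Type) (c : group_class T) : T -> T -> T :=
  @op T (@monoid_base T (@group_base T c)).
```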
We are not too far from being able to test HB on the full MC. Only then will we be able to confirm or refute this claim.
Karl Palmskog said:
my experience is that type classes make extracted code into an unreadable mess, with lots of indirection. FMap/MSet was designed to extract well from the start. Canonical structures extract terribly, with lots of nesting, but this could be addressed as done here: https://doi.org/10.1016/j.scico.2019.102372
Thanks for mentioning my paper. In my opinion, nesting itself comes from (semi- or fully) bundled approaches to encoding mathematical structures, such as packed classes. But we can do that with type classes, as in Lean's mathlib, and will face the same nesting issue. Another issue, specific to combining canonical structures and extraction, is type coercion such as `Obj.magic` to convert between a concrete type `T` and `sort T_instance`, which are convertible in dependent type theory but not the same after extraction (see Sect. 8.3 of my paper).
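A minimal sketch of the coercion issue (my own toy structure, not the paper's):

```coq
(* A toy bundled structure in the packed-classes style. *)
Record eq_struct := EqStruct {
  sort :> Type;
  eq_op : sort -> sort -> bool
}.

Canonical nat_eq_struct := EqStruct nat Nat.eqb.

(* In Coq, sort nat_eq_struct and nat are convertible, so this checks: *)
Check (eq_op nat_eq_struct 3 4 : bool).

(* After extraction, sort is erased to an abstract type, so the
   extracted code needs Obj.magic to mediate between the concrete
   type nat and the sort projection. *)
```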
There is also this work in Agda that seems to have some overlapping goals with Hierarchy Builder. It uses meta-programming (Emacs Lisp) to convert between bundled and unbundled structures.
https://alhassy.github.io/next-700-module-systems/
speaking about bundled vs unbundled, the system designed in Arend looks quite impressive at first sight (records/typeclasses are "bundled by default", but you can unbundle in a really lightweight way just by specifying some of the fields: for instance `ring` stands for an arbitrary ring instance, `ring int` for an arbitrary ring structure on the integers, and `ring int +` for an arbitrary ring structure on int such that the first operation is +). I wonder how unification & type class resolution are implemented and how tractable it gets with a larger development...
I agree. I've talked to Isaev about this. Unfortunately, there's no paper explaining the implementation.
Kenji Maillard said:
speaking about bundled vs unbundled, the system designed in Arend looks quite impressive at first sight (records/typeclasses are "bundled by default", but you can unbundle in a really lightweight way just by specifying some of the fields: for instance `ring` stands for an arbitrary ring instance, `ring int` for an arbitrary ring structure on the integers, and `ring int +` for an arbitrary ring structure on int such that the first operation is +). I wonder how unification & type class resolution are implemented and how tractable it gets with a larger development...
Hi, can you provide a reference for that?
I think Kenji is talking about this? https://arend-lang.github.io/about/arend-features#class-system -- but as Bas says there is no formal writeup, it's implementation-defined (in Java).
Hey everyone, IIRC from reading the canonical structures paper, one of the differences highlighted there was that type class resolution is type-driven, while canonical structure resolution is essentially term-driven (if not by intent, then in practice), and the paper used this to guide the resolution process by creating opaque type aliases that would either dictate an instance to be used, or be peeled open to reveal the next one.
Is this still current? The paper made it sound like canonical structures were more powerful in this sense, though the details of getting this to work seemed excruciating.
This sounds correct to me. Canonical structures, as far as I know, have not changed at all.
Both are term-driven, in the sense that types can contain terms in Coq. One may say that CS resolution is type-triggered, in the sense that it only happens when two types are unified; then, recursively, unification can compare terms and trigger CS there.
What may make one think that TC are type driven is that `Check _ : SomeClass key` is enough to trigger a search for a TC instance for `SomeClass` driven by `key`, but `key` can really be anything, terms included. With CS it is a bit harder. In MathComp you have a bunch of notations like `[eqType of key]` for that. Another way is to write `Check refl_equal _ : proj _ = key`, which is not much more verbose but may be a bit more obscure than the TC counterpart; in particular one cannot just write a type, but also a term (`refl_equal _` there). Anyway, in this case it is clear that `key` can be anything (a term or a type, since `=` accepts both).
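The two triggering idioms side by side, as a self-contained toy sketch (`SomeClass`, `pack`, and `proj` are hypothetical names, not MathComp's):

```coq
(* Type class: search is triggered by asking for an inhabitant. *)
Class SomeClass (T : Type) : Prop := {}.
Instance nat_some : SomeClass nat := {}.
Check (_ : SomeClass nat).        (* TC resolution fills the hole *)

(* Canonical structure: search is triggered by unification. *)
Structure pack := Pack { proj : Type }.
Canonical nat_pack := Pack nat.
Check (eq_refl : proj _ = nat).   (* unifying proj ?s with nat finds nat_pack *)
```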
@Valentin Robert I'm not sure I get which paper you are talking about. If it is the one by Beta et al., then the chain of aliases is transparent and unification will unfold it step by step, trying to find a CS solution for `proj _ = key1`, then `proj _ = key2` (assuming `Definition key1 := key2`). This trick (by Georges) lets one retrofit some backtracking into CS search, which is crucial in some applications, like reification in the presence of variables, or exploring a tree as in their paper (the lemma overloading thing).
Hope it helps
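A minimal sketch of the transparent-alias trick just described (toy names, not the paper's actual development):

```coq
(* Transparent aliases retrofit a form of backtracking: lookup on
   key1 finds no canonical solution, key1 unfolds to key2, and the
   canonical instance keyed on key2 fires. *)
Structure tagged := Tag { untag : Type }.

Definition key2 := nat.
Definition key1 := key2.

Canonical nat_tag := Tag key2.

(* Unifying untag ?s with key1: no instance is keyed on key1, so
   unification unfolds key1 to key2, where nat_tag applies. *)
Check (eq_refl : untag _ = key1).
```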
I guess one crucial difference to mention here for the sake of completeness is that TC are open, in the sense that new instances can be added at any time. CS are closed: there usually is one canonical instance that points Coq to `key1`, and that instance cannot be changed after the fact. (I think @Gaëtan Gilbert has been experimenting with a design where new instances override old ones, but I don't know how that is going.)
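The open/closed contrast in a toy sketch (`Default` is a hypothetical class, not from any library):

```coq
(* Type classes are open: instances can be added at any time,
   even for pre-existing classes in other files. *)
Class Default (T : Type) := default : T.
Instance default_nat : Default nat := 0.
(* ...later, elsewhere: *)
Instance default_bool : Default bool := true.

(* Canonical structures are closed: for a given key there is one
   canonical instance; a second declaration for the same key is
   ignored, with a "redundant canonical projection" warning. *)
```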
Today I read the CICM '20 paper about the linter of mathlib, Maintaining a Library of Formal Mathematics, and discovered there many interesting issues with type classes in mathlib (see §3.3). In particular, they explain that they had to unbundle a structure parameterized by another structure, such as modules over a ring, in order to avoid certain inference issues. It looks a bit ad hoc to me, and I do not understand what the consequences of extensive use of this methodology would be.
AFAIK, we never experienced these issues with canonical structures in MathComp. (Indeed there are other kinds of issues with CS.) I think these examples would be helpful to compare type classes and canonical structures.
Last updated: Oct 03 2023 at 20:01 UTC