Our proposal(s) for how to handle formatting and other conventions, submitted to the Coq Workshop: https://cozy.ece.utexas.edu/~palmskog/coqws20draft.pdf
perhaps of interest to you @Bas Spitters --- essentially the long-term plan is to replace the beautifier and other ad-hoc tools like Eelis's old Coq indenter: http://www.eelis.net/coqindent/
this is also related to our IJCAR paper on suggesting lemma names: https://arxiv.org/abs/2004.07761 (currently working on public release of this toolchain)
@Karl Palmskog This is very interesting, Thanks!
The paper discuss having applied it to MathComp. Out of curiosity, have you given a shot at applying it simply to the standard library as well?
@Yannick Zakowski I have asked around and there is no consensus on what parts of the standard library are "good" (in terms of conventions). And it's widely recognized that the quality of naming and formatting is uneven (many people worked on it independently). This is why we didn't try it. So there is no technical barrier to running our tool on stdlib, but I wouldn't know how to interpret results as well as for MathComp.
I see that makes sense indeed. And very naively, would it be sensible to imagine applying the tool precisely to detect inconsistencies, having it pack together clusters of files that follow similar conventions?
If so I could see it being a great kickstarter to argue about which conventions we want to converge toward
I don't think it's possible to "partition" files in a large project. Basically, one would have to decide beforehand what files to train on, and then you could measure how well other files conform to the training set. This is theoretically possible to do for many training sets, but training time is quite long, e.g., training on MathComp takes many hours
Right. Thanks for the insights!
this is what predicted formatting with our best model currently looks like for MathComp's
polydiv.v: https://gist.github.com/palmskog/1384a0bcceb6c55d3f1126422d6fbb7f --- I would argue it's way better than what any static tool can do currently, but still quite of lot work to do, particularly inside proofs
I'm curious about your ideas on how to include this in the programming workflow.
I'm teaching a Coq course using Software Foundations. Is this something that could be useful to the students?
@Bas Spitters we have a number of ideas for how it could be integrated into workflows, going from integration with VSCoq and/or ProofGeneral, to integration on GitHub or as commit hook (some are listed in the CoqWS draft). What we prioritize depends on community interest. Realistically, we could do one integration over the summer. Concrete use cases, such as use in courses, would be helpful.
VScode and github may be possible in one go :
interesting, I'll pass that along.
Last updated: Oct 21 2021 at 20:02 UTC