Stream: Coq users

Topic: styleguide


view this post on Zulip Quinn (Nov 01 2021 at 20:50):

Suppose more than one proof engineers are working on proving stuff about a product. Anyone with experience in this kind of scenario have wisdom they'd like to share about code quality? I searched zulip history and found a comment that the anti-pattern guide is the closest thing there exists to a style guide

view this post on Zulip Gaëtan Gilbert (Nov 01 2021 at 20:53):

I wouldn't call that a guide, it's just a random wiki page

view this post on Zulip Quinn (Nov 01 2021 at 21:18):

i think it's a start-- it would form a citation in a business' internal style guide

view this post on Zulip Paolo Giarrusso (Nov 01 2021 at 21:39):

Preach. I've seen https://gitlab.mpi-sws.org/iris/iris/-/wikis/style-guide, and math-comp also has a document somewhere, but this isn't a trivial problem. Neither guide is very complete.

view this post on Zulip Paolo Giarrusso (Nov 01 2021 at 21:42):

while you’re at it, you might also wonder about tool support. I suspect VsCoq and ProofGeneral have different indentation rules; they certainly have different commenting styles.

view this post on Zulip Paolo Giarrusso (Nov 01 2021 at 21:44):

the Coq tracker has an issue about automatic formatters.

view this post on Zulip Paolo Giarrusso (Nov 01 2021 at 21:46):

Even comment layout is tricky. I used to use Javadoc-style comments:

(** Coqdoc comment: short summary.
  * More details *)

but if you actually run Coqdoc, that’ll turn More details into a title :-(

view this post on Zulip Paolo Giarrusso (Nov 01 2021 at 21:47):

math-comp recommends instead

(* Line1 *)
(* Line2 *)

view this post on Zulip Paolo Giarrusso (Nov 01 2021 at 21:48):

I guess it could work, but VsCoq doesn’t help set that up, and I haven’t seen much code outside math-comp with a consistent comment layout.

view this post on Zulip Quinn (Nov 01 2021 at 21:49):

i will check coqdoc interaction before I commit my team to a commenting style, thanks!

view this post on Zulip Paolo Giarrusso (Nov 01 2021 at 21:50):

@Quinn Thanks! And should you come up with something, feel encouraged to share. Hopefully I can do the same, if we find a solution.

view this post on Zulip Quinn (Nov 01 2021 at 21:56):

give my team a good half year and we'll have some experience to back up whatever style guide I end up writing, or some other team publishes a style guide and we rally behind them. A cool vision would be if several teams consolidated around one style guide, then emacs/vscode plugins could be written based on them.

view this post on Zulip Karl Palmskog (Nov 02 2021 at 07:32):

the most detailed "style guide" around is for MathComp: https://github.com/math-comp/math-comp/blob/master/CONTRIBUTING.md

there is also a guide which has some style recommendations but focuses on high assurance requirements: https://www.ssi.gouv.fr/uploads/2014/11/anssi-requirements-on-the-use-of-coq-in-the-context-of-common-criteria-evaluations-v1.0-en.pdf

there will likely be a proliferation of different incompatible styles in the Coq community as long as Coq itself is around...

view this post on Zulip Paolo Giarrusso (Nov 02 2021 at 08:40):

Sure, and most code formatters are configurable. But thankfully, configurations don’t make them surjective!

view this post on Zulip Théo Zimmermann (Nov 02 2021 at 08:54):

It would be a great project to, at least, consolidate all the available style guide / code formatting knowledge into a single document, which would show the different options for the different elements and which team adhere to which.

view this post on Zulip Théo Zimmermann (Nov 02 2021 at 08:56):

That's the kind of thing that could be hosted in coq-community for instance (though the Coq wiki is also fine if a wiki page is suited for that).

view this post on Zulip Karl Palmskog (Nov 02 2021 at 09:00):

we linked to a bunch of different style guides in the intro section of our IJCAR 20 paper, and then there are some refs in our Coq Workshop '20 abstract

view this post on Zulip Karl Palmskog (Nov 02 2021 at 09:06):

the issue of styling is also highly related to documentation generation, which leads to Alectryon, SerAPI, coqdoc, dependency graphs using coq-dpdgraph and coqdep, and the like

view this post on Zulip Karl Palmskog (Nov 02 2021 at 09:09):

one early style guide I co-authored: https://github.com/uwplse/verdi/blob/master/STYLE.md

view this post on Zulip Quinn (Nov 02 2021 at 10:20):

ok this is great, bookmarking this thread, thanks!

view this post on Zulip Bas Spitters (Nov 03 2021 at 15:03):

Here's another one that's probably already on the list:
https://github.com/HoTT/HoTT/blob/master/STYLE.md

view this post on Zulip Quinn (Nov 04 2021 at 00:30):

@Karl Palmskog I think I'm gonna use verdi/PROOF_ENGINEERING.mdin my team. Thanks!

view this post on Zulip Quinn (Nov 04 2021 at 00:33):

@Bas Spitters that's an informative and educational document! Using typeclasses to encode axioms is a nice pattern.

view this post on Zulip Quinn (Nov 04 2021 at 00:39):

no unnecessary type declarations for quantified variables @Karl Palmskog i'm not 100% convinced I find this desirable. I'll be tempted to enforce the opposite.

view this post on Zulip Quinn (Nov 04 2021 at 00:39):

what are some of the advantages you see?

view this post on Zulip Paolo Giarrusso (Nov 04 2021 at 06:54):

@Quinn Many people like explicit annotation for function arguments. Do you want to repeat those declarations in 10 lemmas about a function living next to it? 20? How much is too much?

view this post on Zulip Paolo Giarrusso (Nov 04 2021 at 06:56):

Once I have gotten pushback, but the compromise was to predeclare argument types Implicit Types ...

view this post on Zulip Karl Palmskog (Nov 04 2021 at 09:10):

Main problem: types take a lot of space which may hinder readability. I think explicit types in lemmas can appear in documentation instead, for example using Alectryon.

view this post on Zulip Enrico Tassi (Nov 04 2021 at 09:20):

Maybe "no unnecessary" cane be understood in a too strong way, eg. Coq can infer them reliably, which implies you/your-readers know well enough how type inference works. But for sure putting types everywhere is too verbose to be practical and most of the times it really does not help. Eg take forall x xs, mem x xs -> ...., why would typing both x and xs help the reader? It would only help if the names were chosen badly, as in forall x y, mem x y -> ... (which is which?).

view this post on Zulip Ana de Almeida Borges (Nov 04 2021 at 11:58):

Some times I like having explicit type annotations even when Coq can infer them, mostly when dealing with many types that have coercions between them and shared notations that depend on which scope is open. It can get confusing fast, specially for someone not familiar with the code.

view this post on Zulip Quinn (Nov 04 2021 at 13:02):

This is interesting to consider!

view this post on Zulip Michael Soegtrop (Nov 04 2021 at 13:29):

I would say the key goal when deciding if a type annotation should be there or not is to make the code more readable to a medium advanced user. IMHO some authors either optimize it for writability (= leave them away wherever you can) or for readability of readers at their own level of expertise (with the specific piece of code, which likely few people have). I find it helps to ask others from time to time what their opinion is on ones code style.

view this post on Zulip Bas Spitters (Nov 04 2021 at 16:13):

There is also the linter for lean:
https://arxiv.org/pdf/2004.03673.pdf

view this post on Zulip Quinn (Nov 04 2021 at 16:45):

if hypothetically i wanted to write lint for coq, could i use serapi in python?

view this post on Zulip Karl Palmskog (Nov 04 2021 at 17:26):

@Quinn if you want to write a mechanical (rule-based) linter using Python, either SerAPI or PyCoq is probably the way to go. Our experimental machine learning based formatter specifically used output from the sertok and sercomp tools provided by the coq-serapi opam package.

view this post on Zulip Karl Palmskog (Nov 04 2021 at 17:27):

I believe there is some Python based tooling for dealing with output from sertok and sercomp here, if that's the route you want to go:

view this post on Zulip Karl Palmskog (Nov 04 2021 at 17:42):

in any case, we strongly advise against trying to parse Coq code without using Coq's parser at the OCaml level or using something like SerAPI or PyCoq. Because both the Coq parser and lexer are extensible, handling all code correctly in an external parser is near-impossible (one can't even assume a sentence ends by . due to notations)

view this post on Zulip Quinn (Nov 04 2021 at 17:43):

Oh yeah I'd never in a million years attempt it on my own, haha. I had a glance at the pycoq repo and that might have what I'd want to make a linter

view this post on Zulip Emilio Jesús Gallego Arias (Nov 04 2021 at 18:00):

If you wanna use python, pycoq is the way to go; we did add a lot of updates so ping me @Quinn if you start using it, as of now, what's on github is very preliminary, but we can push an OPAM release soon

view this post on Zulip Quinn (Nov 04 2021 at 18:14):

outstanding, thanks!

view this post on Zulip Quinn (Nov 04 2021 at 18:16):

(realistically it might not end up being a priority because my boss isn't even 100% sure we're hiring other proof engineers besides me, and a linter on a team of one doesn't have a high impact) (I'd love to dive into it, but it might be something I can't bill)

view this post on Zulip Quinn (Nov 04 2021 at 18:17):

(i mean most probabiltiy mass is it being something i can't bill, which i'm fine with!)

view this post on Zulip Paolo Giarrusso (Nov 04 2021 at 20:48):

@Quinn FWIW we'd also be interested, and I'm sure more people would, even if I can't make any promises...

view this post on Zulip Paolo Giarrusso (Nov 04 2021 at 20:49):

And while we're talking of tooling, stuff in https://github.com/JasonGross/coq-tools might be helpful!

view this post on Zulip Paolo Giarrusso (Nov 04 2021 at 20:50):

(for instance minimize-requires!)

view this post on Zulip Quinn (Nov 04 2021 at 20:50):

@Paolo Giarrusso in a linter? yeah i'd make sure it's in a public-facing repo (even if I have to host it from my employer's github org account) and do what i can to make it generally applicable to teams outside of my own.

view this post on Zulip Paolo Giarrusso (Nov 04 2021 at 20:55):

Any starting point is probably great... Different users probably have different ideas on the actual policies, but honestly for an MVP, I guess plain Python is a perfectly reasonable policy language... I guess as long as different rules are enforced "separately" or "modularly"?

view this post on Zulip Quinn (Nov 04 2021 at 20:57):

mvp: my styleguide hardcoded into an executable that exit code 1s if there's at least one warning and exit code 0s otherwise
v2: ship with various teams' styleguides as cli options
v3: yaml config
v4: emacs mode
v5: formatting cli such that formatter input.v; linter input.v raises zero warnings

view this post on Zulip Quinn (Nov 04 2021 at 20:57):

what do you mean separately/modularly?

view this post on Zulip Paolo Giarrusso (Nov 04 2021 at 20:59):

sorry that was probably premature, I don’t mean anything too surprising

view this post on Zulip Paolo Giarrusso (Nov 04 2021 at 21:00):

just that having different warnings in different… modules/functions (or w/ever) might be easier to manage/configure/…

view this post on Zulip Paolo Giarrusso (Nov 04 2021 at 21:01):

but it’s not my place to say!

view this post on Zulip Quinn (Nov 04 2021 at 21:05):

yeah without having looked at serapi python port or pycoq yet, my guess is i'd want 1 def per rule and a config amounts to a list[callable], where each callable would be of signature whateverPyCoqClassRepresentsADotVFile -> Bool, though rules could be parametric and a functools.partial could be used to make different flavors of the same rule.

view this post on Zulip Quinn (Nov 04 2021 at 21:08):

some flavor of all(map(lambda f: f(dot_v_file_representation), config) that's aware of warnings and aggregates them. Cuz I guess the return type wouldn't be bool because we'd want to tag False with a str. Oh yeah, it's not strictly fp so I can just throw a print before returning the bool, lmao

view this post on Zulip Paolo Giarrusso (Nov 04 2021 at 21:11):

I was also thinking of option string

view this post on Zulip Quinn (Nov 04 2021 at 21:16):

ah, indeed. or list[str] where "" instead of None: falsey corresponds to successful rule, truthy/a message corresponds to a failed rule

view this post on Zulip Quinn (Nov 04 2021 at 21:20):

(realistically my timeline is 0. finish a major non-proofengineering deadline on the 15th, 1. finish a PoC to confirm that my team is actually using coq and not agda, 2. research upthread styleguides to write my team's styleguide, then maybe 3. write linter)

view this post on Zulip Tim Carstens (Dec 01 2021 at 22:20):

@Quinn I might have some availability to work with you on this. I'm currently putting together an effort to organize a Coq style guide. A linter would be a powerful tool. My December is somewhat busy but I'd be excited to talk regardless

view this post on Zulip Quinn (Dec 06 2021 at 03:07):

Hi! Thanks for reaching out. All I've done is assemble notes from this thread into a directory. My team isnt even 100% sure we're using coq yet still so it's up in the air whether I'll really be in the soace

view this post on Zulip Tim Carstens (Dec 06 2021 at 03:27):

Sounds good. I've been collecting resources in a repo: https://github.com/appliedfm/vstyle

The guide is currently light on content. I've been using the issue tracker to take notes about existing resources, tooling support, etc.

For tooling, I've played around a bit with serapi and think it's good enough for now. Some stuff is missing/incomplete (comments, in particular) but I think there's enough functionality to get things started. The s-expression representation contains file offset information; in theory a code formatter could use this to perform all kinds of surgery on files. Not my favorite long-term approach but it could certainly be done.

Another possibility would be to introduce a configurable pretty printer into Coq. But I have not investigated that at all.

view this post on Zulip Théo Zimmermann (Dec 06 2021 at 08:18):

FWIW, there's already a "beautifier" in Coq that originates from the V7 -> V8 syntax translation tool and is quite broken at the moment. This could be a basis for writing an auto-formatter in Coq itself, but I feel like the external approach based on SerAPI is more likely to be successful.

view this post on Zulip Karl Palmskog (Dec 06 2021 at 10:13):

I still believe a machine learning-based formatter aided by manual so-called reranking rules is the way to go (https://arxiv.org/abs/2006.16743). For example, here is a manually formatted file and what our experimental tool predicted from a raw Coq token stream:

view this post on Zulip Karl Palmskog (Dec 06 2021 at 10:15):

notably, we didn't handle bullets all that well in a naive predictor, since they are sentences by themselves. If one comes up with a nice small abstraction on top of Vernacular (e.g., special dummy tokens for bullets), this could be handled quite easily though. Same for Qed. and Proof..

view this post on Zulip Karl Palmskog (Dec 06 2021 at 10:19):

I'm not sure people are aware, but the Coq Platform is basically a huge machine learning dataset (1.7+ million lines of Coq code guaranteed to work for a specific Coq version).

view this post on Zulip Karl Palmskog (Dec 06 2021 at 10:26):

what a general "Coq style project" could provide is a manually curated (Python?) API/AST for Coq tokens/documents, based on the actual Coq AST/API as exposed by SerAPI and/or PyCoq. Once you have that, people could implement their own formatters for this API/AST, some rule-based, some learning-based. See also for Lean: https://link.springer.com/chapter/10.1007%2F978-3-030-53518-6_16

view this post on Zulip Karl Palmskog (Dec 06 2021 at 10:29):

for example, the SerAPI representation of the bullet - is something like: Sentence(Bullet("-")), while you probably don't want bullets to be considered standalone sentences for formatting (e.g., they could belong to the next sentence in the document)

view this post on Zulip Karl Palmskog (Dec 06 2021 at 10:30):

I also think Coq-Elpi already has abstractions for some syntax-level Coq constructs?

view this post on Zulip Karl Palmskog (Dec 06 2021 at 14:00):

here is an early attempt at a mechanical formatter that's been mentioned in other similar topics: http://www.eelis.net/coqindent/

view this post on Zulip Karl Palmskog (Dec 06 2021 at 14:02):

we link to 3-4 style guides for Coq (and Lean) here, some have been mentioned before: https://arxiv.org/pdf/2004.07761.pdf#page=2

view this post on Zulip Tim Carstens (Dec 06 2021 at 14:02):

Thank you both for your thoughtful comments :heart:️ I'm going to put some experiments together with serapi (and follow up on these links!)

view this post on Zulip Karl Palmskog (Dec 06 2021 at 14:04):

@Tim Carstens see also the sertok CLI tool that Emilio and I developed, which is installed when you install coq-serapi. This tool does not actually use SerAPI-the-API, only the Serlib library for Coq and the OCaml interface to Coq: https://github.com/ejgallego/coq-serapi/blob/v8.13/sertop/sertok.ml -- if you run it on some Coq code it shows a bit of how Coq's extensible lexer works, and how token locations are stored.

view this post on Zulip Tim Carstens (Dec 06 2021 at 14:07):

I was playing with it (sertok) last night - I'm completely convinced that it would work well in this use case

view this post on Zulip Karl Palmskog (Dec 06 2021 at 14:13):

OK, in this case I should flag up that it should be possible in principle now to hack SerAPI to have sertok provide JSON output. When we first did sertok, sexps were the only viable alternative, but I think Serlib has (nearly?) full JSON support now.

view this post on Zulip Karl Palmskog (Dec 06 2021 at 14:30):

in any case, the sertok output will be subtly different for each Coq version (since you're looking at Coq internals), which is why one might want to abstract

view this post on Zulip Tim Carstens (Dec 06 2021 at 14:33):

Fortunately these days we have github actions & a beautiful coq platform, both of which make extensive regression testing possible <3

view this post on Zulip Karl Palmskog (Dec 06 2021 at 14:48):

also possibly relevant, we have a dataset from before the Coq platform (Coq 8.10) which for a bunch of projects has .v files, token sexp files, and AST sexp files, e.g., https://github.com/EngineeringSoftware/math-comp-corpus/tree/master/raw-files/math-comp_analysis

view this post on Zulip Karl Palmskog (Dec 06 2021 at 14:50):

we tested them for the serialization-deserialization roundtrip, which is probably what one would need to set up for a new tool using the platform

view this post on Zulip Tim Carstens (Dec 17 2021 at 18:41):

Dropping a quick update:

I’ve started writing a formatter/linter. I’m writing it as a command line tool in OCaml to keep dependencies simple. (If the tool becomes useful, a later effort could integrate it with vscode/emacs/etc.)

I’m new to OCaml but things seem to be going smoothly. I referred to pycoq for an example of how to use serapi as a library.

Thank you again everyone for your input, I’m looking forward to dropping a link to the repo Real Soon Now 👍

view this post on Zulip Tim Carstens (Dec 20 2021 at 04:31):

repo link: https://github.com/appliedfm/vstyle-tools

Currently based on Coq 8.14. The logic is based on sertok. I've run into two gaps:

Example output shown in https://github.com/appliedfm/vstyle-tools/pull/1

Still, this is an encouraging start :blush:

Edit: I revisited the code and realized I could be pretty-printing the Vernacexpr.vernac_control_r CAst.t that I get from calling Stm.parse_sentence. This doesn't help with the code comments, but it does give a structured AST. This means I'm unblocked :+1:

view this post on Zulip Théo Zimmermann (Mar 16 2022 at 14:40):

Anyone wants to answer this question with the state of style guides in Coq? https://proofassistants.stackexchange.com/questions/984/naming-conventions-letter-case-underscores-c-for-coq

view this post on Zulip James Wood (Mar 16 2022 at 14:42):

I would also like to know!

view this post on Zulip Ana de Almeida Borges (Mar 23 2022 at 19:19):

@Cyril Cohen wrote a linter for mathcomp that never got merged: https://github.com/math-comp/math-comp/pull/163

view this post on Zulip Ana de Almeida Borges (Mar 23 2022 at 19:20):

Maybe it could be published somewhere as a standalone?

view this post on Zulip Bas Spitters (Jun 16 2022 at 12:45):

This thread seemed to have stalled. Is anyone still thinking about these issues?
A concrete use case could be to have something for e.g. Software Foundations.

view this post on Zulip Ana de Almeida Borges (Jun 17 2022 at 10:35):

One more example of an old draft of a style guide: https://github.com/coq/coq/wiki/CoqStyle

view this post on Zulip Ali Caglayan (Jun 17 2022 at 11:41):

Note that this style also is missing bullets which I would say the style of the Coq stdlib suggests.

view this post on Zulip Ali Caglayan (Jun 17 2022 at 11:42):

Mathcomp have their own style guide, so any document should point to a range of styles rather than impose a singular one.

view this post on Zulip Enrico Tassi (Jun 17 2022 at 14:46):

This old pr is not completely orthogonal to this discussion. With the dev hat on: we will be soon able to write tools which parse coq files (using the coq parser) without executing a single line (CC @Maxime Dénès ). Writing a proper linter (I mean, not based on sed/awk) will then be possible.


Last updated: Mar 29 2024 at 14:01 UTC