This is the topic to ask questions on the talk "Learning to Format Coq Code Using Language Models".
Talk will start in 5 minutes
:rolling_eyes:
Q: you mentioned that rule-based tools don't work very well in practice. But your example about Qed
not being in a new line is trivial to capture with a rule. What about mixing the two approaches?
The proof goal structure is important, we really need you to find a way to incorporate this into the treatement.
Remark: I find it amazing that you manage to indent SSR code without running Coq (indentation is based on the number of open goals, which you are apparently predicting correctly). Wow!
The by keywords make it posible to reconstruct the number of goals, but you need lookahead.
For the record: the reglang
library is not aspiring to adhere to the mathcomp
coding guidelines, in particular not as it comes to indentation and the use of bullets.
this was not intended to disparage reglang, just to show how the tool can be applied to "new" projects (outside the training corpus). Probably, proper formatting of reglang would require additional training on reglang itself
Yves Bertot said:
The by keywords make it posible to reconstruct the number of goals, but you need lookahead.
Right, the current model simply uses the keywords like by
to infer indentation without running Coq. We have some WIP in SerAPI on accessing the proof goal structure in SerAPI, but it's not 100% clear to me how one would express a "proof goal location for a token" in a good way.
for the record, the current main limitations:
Qed
and Defined
.here is the complete prediction for polydiv.v
which shows these issues quite clearly. However, I think it also shows that one can get quite far without much special treatment of Coq features.
Last updated: Jun 01 2023 at 12:01 UTC