Stream: Coq users

Topic: Removing notations on input / beautify


view this post on Zulip Vincent Siles (Jun 12 2020 at 12:34):

Hi ! Is there any plan to update / fix coqc -beautify at the moment ?
I'm still trying to find a way to process input commands (not outputs) to strip them from user notation. The goal is to be able to process coq user development without depending on the notation people might use, only on the original definitions.
At the moment, the only way I know seems to be -beautify which seems a bit broken (missing new lines, errors, ...)

Just wondering if there are other interested by this, and if anything is currently on-going ?

view this post on Zulip Karl Palmskog (Jun 12 2020 at 12:36):

we have experimental work in this direction that uses a completely different approach, which we will present at the Coq Workshop next month: https://cozy.ece.utexas.edu/~palmskog/coqws20draft.pdf

You can see an example here of how it currently performs after only observing raw token streams: https://gist.github.com/palmskog/1384a0bcceb6c55d3f1126422d6fbb7f

view this post on Zulip Théo Zimmermann (Jun 12 2020 at 12:37):

-beautify has been relatively unmaintained lately but if you have specific needs you can bug Hugo about it.

view this post on Zulip Karl Palmskog (Jun 12 2020 at 12:37):

currently, for the experimental work, we do actually process notations (preserve tokens), but there is nothing inherent in the approach that makes it limited to this.

view this post on Zulip Karl Palmskog (Jun 12 2020 at 12:41):

if the only goal is to get out input without notation (i.e., no beautification/formatting), that is probably best handled through something like SerAPI

view this post on Zulip Vincent Siles (Jun 12 2020 at 12:50):

I made an attempt at implementing that in SerAPI, but it required much more knowledge of Coq's internal than I have :) I'll have a look at your approach, thanks ! The main idea comes from the heterogeneous state of Coq development in the wild: everyone has different notations for similar concept, so training learning models in this context is quite tricky, if we need to take each notations into account. Getting rid of them would allow us to process multiple sources without their specific notations.

view this post on Zulip Vincent Siles (Jun 12 2020 at 12:53):

@Karl Palmskog are these workshop recorded somehow ?

view this post on Zulip Karl Palmskog (Jun 12 2020 at 12:54):

@Vincent Siles I certainly hope they will be recorded since it's all online, but hopefully @Théo Zimmermann can confirm

view this post on Zulip Théo Zimmermann (Jun 12 2020 at 12:59):

I certainly hope so too but I haven't had any clear information confirming this yet.

view this post on Zulip Théo Zimmermann (Jun 12 2020 at 12:59):

Note that the workshop will be free to attend and registration is already open on the IJCAR-FSCD 2020 website.

view this post on Zulip Karl Palmskog (Jun 12 2020 at 14:27):

@Vincent Siles so basically you want a sort of normal form at the Gallina code level? I don't think you want to process elaborated terms, right? For the record, we released our Coq dataset we used for machine learning recently, which has tokens+AST(with notations)+terms(for lemma statements only), all as sexps as generated by SerAPI: https://github.com/EngineeringSoftware/math-comp-corpus

view this post on Zulip Vincent Siles (Jun 12 2020 at 15:09):

Yes something like that: not sure what elaborated terms is but we want to process user level language (so the ML model learn thing we can reinject into Coq) without the complexity of users notations. Simple example would be plus which is an alias to Nat.add. It might be called addition somewhere else, and this brings noise.

view this post on Zulip Karl Palmskog (Jun 12 2020 at 15:22):

elaborated terms are what Emilio means when he talks about terms being in "kernel form" here: https://github.com/ejgallego/coq-serapi/blob/v8.11/FAQ.md#interpretation-and-terms -- what is processed by Coq's type-checker/kernel

view this post on Zulip Vincent Siles (Jun 15 2020 at 12:33):

Thank you for pointing this out. Then it looks like we are looking for a glob_constr, or an intermediate between constr_expr and glob_constr (like Notation interpretation, maybe implicits / desugar, not sure)


Last updated: Oct 13 2024 at 01:02 UTC