@Pierre-Marie Pédrot : I am porting my Ltac2 code to 8.13. CompCert 3.8 introduced a new notation using $. This masks the trivial antiquotation syntax of Ltac2. Closing the scope of the CompCert notation doesn't help. Here is an example:
Require Import Ltac2.Ltac2.
Require Import compcert.exportclight.Clightdefs.
Locate "$".
Close Scope string_scope.
Notation "[ x ]" := ltac2:(let x := Ltac2.Constr.pretype x in exact $x) (only parsing).
Is there anything I can do besides not Importing (just Requiring) the CompCert file or defining my own notation for Ltac2 antiquotations?
Btw.: Should we have an Ltac2 Zulip stream?
I will never get tired of repeating this: "syntax is not modular".
I believe we should check how CompCert defines this notation and see whether there is a way to factorize it somehow
Unrelatedly I am surprised we don't already have an Ltac2 stream.
The CompCert definition is here (https://github.com/AbsInt/CompCert/blob/v3.8/exportclight/Clightdefs.v#L180).
Did you try reimporting the Ltac2 module? with a bit of luck it'll override the CompCert notation.
I will never get tired of repeating this: "syntax is not modular".
I think one could make it more modular. There have been some discussions 2 years back on the CUDW. I still do believe that it would be possible to have a parser which ignores productions which are not in the current parsing scope.
The problem here is likely to be the lexer, not the parser.
Did you try reimporting the Ltac2 module? with a bit of luck it'll override the CompCert notation.
Yes I tried that - it doesn't work.
There is a global table of tokens that is messed with upon declaring new notations.
I think you're out of luck then. Maybe there is a way to make the Ltac2 parsing rule more robust but then it'd require a patch to Coq.
There is a global table of tokens that is messed with upon declaring new notations.
Yes, but I think for the lexer it would be even easier to tell it not to create tokens which are not in scope. It is a state machine, and one could dynamically enable/disable certain transitions in this state machine.
I agree that in the abstract there might be a way to do it, but given the way it's currently implemented...
We have dedicated hacks to work around space-insensitivity at the level of tokens for instance.
In particular, IIRC the Ltac2 quotation is written in a way that $ x doesn't parse; you have to be space-free.
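For concreteness, the space-sensitivity can be illustrated in a plain Ltac2 file without CompCert loaded (a minimal sketch; id_constr is a hypothetical name):

```coq
Require Import Ltac2.Ltac2.

(* The antiquotation must be written without a space after $: *)
Ltac2 id_constr (x : constr) := constr:($x).

(* With a space, the quotation no longer parses:
   Ltac2 id_constr' (x : constr) := constr:($ x). *)
```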
Yes, that's the problem. My statement was more that "syntax is not modular" is more related to the implementation than to what is possible/terribly difficult in general.
Syntax is still part of a global state, so importing two modules that declare conflicting rules will explode midflight, whatever you do to try to delimit the fallout area
Didn't we have a CEP to remove notations at some point?
I agree that having two incompatible notations in scope simultaneously easily leads to an explosion. But as I said, I don't see why it shouldn't be possible, on a Close Scope, to put the lexer/parser back into a state compatible with what it was before the notation was defined.
Because you still have the scope delimiters (...)%foo
so parsing needs to be performed even if you don't know what to do with the AST yet
Indeed, this would only work if scope delimiters were prefix, not postfix.
more than just prefix, you need a rich system for scoping parsing extensions
the fact that the quotation marker is postfix is just there to dissipate any hope you might have had
You mean a two level parser? A first level which just takes care of scopes and keeping the proper context for a second level parser?
rather, n-level, or able to dynamically extend the parser with arbitrary rules in a well-scoped way and with arbitrary depth
not that it's impossible technically, but once again this means completely changing how parsing is performed in Coq
Well the 1st level parser would have a stack - it is a parser ...
Coq syntax goes beyond context-free grammars I am afraid
not that it's impossible technically, but once again this means completely changing how parsing is performed in Coq
Yes as I keep saying - it is a question of implementation, not of possibility.
Right, right. But it's no small feat, and expect trouble with backwards compat.
Coq syntax goes beyond context-free grammars I am afraid
The question is if the syntax of scope annotations goes beyond that.
How do you parse what's inside?
i.e. how do you know some crazy user has not written a parsing rule conflicting with scope delimiters?
and like, you need to be able to perform a weak form of tokenization anyways
the only sane solution to this problem is HEREDOC strings
An easy solution would be to forbid parsing rules that have unbalanced round parentheses.
Not enough, because you can have parentheses in strings.
So you need to keep track of string opening as well, but this one is not well-balanced.
And since you can get double quotes in random notations...
Ah yes, I am used to lexer-level strings. Indeed, this would get more tricky.
Likely worth a Ph.D.
Anyway, I don't have that many places where I need Ltac2 and CompCert together. I won't die if I use the long-form antiquotation there.
Note that it incurs a runtime penalty.
$x does fewer checks than wrapping the term inside an ltac2:(...) quotation and calling exact or something.
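For concreteness, a sketch of the two forms, assuming the $ antiquotation is roughly a cheaper version of an explicit ltac2:(...) quotation with Control.refine (wrap_short and wrap_long are hypothetical names):

```coq
Require Import Ltac2.Ltac2.

(* Short form: the $ antiquotation (the token masked by the CompCert notation). *)
Ltac2 wrap_short (x : constr) := constr:(Some $x).

(* Long form: an explicit ltac2:(...) quotation. The inserted term goes
   through more checks than with $, hence the runtime penalty. *)
Ltac2 wrap_long (x : constr) :=
  constr:(Some ltac2:(Control.refine (fun () => x))).
```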
You mean if I use the long-form antiquotation?
yep
Do you have a suggestion?
No, but it's probably not worth it in practical use cases.
Just don't write your critical tactics with CompCert in scope.
That's also my expectation.
Well at least not with this file in scope.
Thanks for your help and the good discussion!
Pierre-Marie Pédrot said:
Unrelatedly I am surprised we don't already have an Ltac2 stream.
It's easy to create one if you want. You can do that yourself. But you will need me if you want to change the name or the description after the stream has been created.
@Michael Soegtrop
Indeed, this would only work if scope delimiters were prefix, not postfix.
If you think foo%(...) would help, aren't you also forgetting that scope foo is used even when closed _and_ without foo%, because of argument scopes?
anyway, before you redesign the system from scratch, you might want to look at the research on modular syntax extensions. As far as more principled solutions go, Eelco Visser's parsing tech is almost what you need (modulo fast parse-table update on extensions), and/or scannerless GLL might be ideal (though it's somewhere between less mature and a research topic).
What Lean is doing is less principled, but better than what Coq does. But I'm not sure how much experience they have with modularity...
(and if you can coordinate extensions, like it's possible in Lean's math-lib, parsing is actually a solved problem; the hard part is just modular parsing)
@Paolo Giarrusso, @Pierre-Marie Pédrot : I thought this over a bit more. In the end, I think it is rarely required to mix incompatible syntax in one term. The problem is that one drags in a module, which drags in 100 other modules, one of which defines a syntax that clashes with a syntax I need, even though I don't actually need the syntax defined by that module: usually nowhere in my document, and almost always at least not within a single "statement". So I think being able to switch off a scope entirely, including its scope delimiters and argument scopes, would be sufficient to fix most of the problems people have. Something like a "Hard Close Scope xyz". It would be fine with me if symbols that parse arguments using xyz then gave an error.
Thanks for the pointers to recent research on the topic!
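For reference, Coq already offers partial versions of this, though neither removes the lexer-level token that caused the clash above (a sketch using string_scope as in the original example):

```coq
(* Stop using the notations of the scope for interpretation: *)
Close Scope string_scope.

(* Additionally remove the %string delimiter, so (...)%string no longer
   re-enters the scope; the "Hard Close Scope" proposed above would go
   further and also drop argument scopes and lexer tokens. *)
Undelimit Scope string_scope.
```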
For that, you might want people to avoid lonely notations, and maybe to have more fine-grained scopes. And/or a way to disable individual notations... :-|
Last updated: Sep 28 2023 at 10:01 UTC