This is the topic for asking the panelists questions about domain-agnostic and domain-specific proof automation.
Panel starting in 30 mins!
Just started!
Here is the link to SMTCoq: https://smtcoq.github.io
To Sophie, you mention that superposition is better for non-ground terms. Is it worse for ground ones? Or is it just that SMT solvers are more engineered?
Or how do they relate?
As far as I can tell, SMT solvers are better when built-in theories are on board
For sure
The approach of SMT is also generally better when the quantification is very shallow.
@Chantal Keller Q: EasyCrypt uses Why3 to interface with SMT solvers. I know SMTCoq gives verified proofs. Still, do you see a useful connection with Why3?
Question for @Janno @Beta Ziliani: are there good examples of Mtac2 code we can look at? I'd love examples in which the rich types of tactics make Mtac2 shine.
@Beta Ziliani There are two competitions where you can compare the performance of superposition and SMT: CASC, which is more superposition-oriented ( http://www.tptp.org/CASC/27/WWWFiles/DivisionSummary1.html ), and the SMT competition ( https://smt-comp.github.io/2020/results.html ), where CVC4 (SMT) and Vampire (superposition) each have a go at the other's competition.
Both are among the best of their kind. Vampire dominates the superposition world (at least in FOL), and afaik CVC4 shares the glory with a few others like Z3, Yices2...
@Jason Gross mentioned tools he developed that are more efficient. Are those available?
Enrico Tassi said:
Question for Janno, Beta Ziliani: are there good examples of Mtac2 code we can look at? I'd love examples in which the rich types of tactics make Mtac2 shine.
We have a few examples in the repository itself, but we don't really have any big developments. We have the case study from the Mtac2 paper, but that's far from big. It did clearly indicate, though, that static types help both in debugging and with performance. I'll compile a list of instructive examples after the panel.
Q: related to Sophie's second answer: is understanding which symbols are important a matter of domain expertise, or rather of expertise in automated proving?
@Enrico Tassi I'll try to keep it short. There are two uses of types. For (almost) full certification of a tactic, we don't have small examples, since types usually help in big developments. In our Mtac2 paper we show its use in Iris.
Now, types also help avoid stupid errors, like when pattern-matching a term (forward tactics). Ltac happily accepts a definition like the following, since nothing is checked until the tactic actually runs:

```coq
Ltac example n :=
  match n with
  | I can put almost anything here => ...
  end.
```
For reflective proofs, this is used quite a lot, and it's a real pain in untyped languages.
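For contrast, a rough sketch of the typed counterpart in Mtac2 (untested, syntax as in the Mtac2 paper, so treat it as illustrative only): `mmatch` patterns are checked against the scrutinee's type, so an ill-typed pattern is rejected when the definition is accepted, not when the tactic runs.

```coq
From Mtac2 Require Import Mtac2.

(* Untested sketch: both patterns must have type nat, so a nonsense
   pattern like the one above would be rejected at definition time. *)
Definition example (n : nat) : M nat :=
  mmatch n with
  | 0 => M.ret 1
  | [? m] S m => M.ret m
  end.
```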
Especially useful would be examples of two or three common automation tasks in Ltac2, Mtac2, MetaCoq, and as an OCaml plugin.
Michael Soegtrop said:
Especially useful would be examples of two or three common automation tasks in Ltac2, Mtac2, MetaCoq, and as an OCaml plugin.
And Elpi?
Q: If I were to start developing a reflective decision procedure today, what's the current best approach to use for reification (i.e., converting a given goal to a deeply embedded term language, together with the map from the variables in that term to the abstracted parts of the goal)?
@Richard L Ford The reflective rewriting tool is at https://github.com/mit-plv/rewriter. It currently has a number of limitations (no dependent types, hard-coded container types), so while it's sufficient for fiat-crypto, it's unfortunately more of a research prototype than a tool ready for wide use. (The paper on it is currently being submitted to POPL.)
The faster way of doing reification is much more widely applicable and is available at http://adam.chlipala.net/papers/ReificationITP18/
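To make the question concrete, here is a minimal hand-rolled Ltac reifier (an illustrative sketch, not the method of either tool above); it sidesteps the variable-map part of the question by storing atoms directly instead of abstracting them into an environment:

```coq
(* A toy deep embedding of nat expressions built with +. *)
Inductive expr := Atom (n : nat) | Plus (a b : expr).

Fixpoint denote (e : expr) : nat :=
  match e with
  | Atom n => n
  | Plus a b => denote a + denote b
  end.

(* Recursively turn syntactic occurrences of + into the embedding;
   anything else becomes an Atom. *)
Ltac reify t :=
  lazymatch t with
  | ?a + ?b =>
      let ra := reify a in
      let rb := reify b in
      constr:(Plus ra rb)
  | _ => constr:(Atom t)
  end.

Goal 1 + (2 + 3) = 6.
  match goal with
  | |- ?l = ?r =>
      let e := reify l in
      change (denote e = r)  (* restate the goal over the embedding *)
  end.
Abort.
```

The environment-based variant, needed as soon as you want to compare or abstract over variables, is exactly where the bookkeeping gets painful, which is what the tools above automate.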
Yves Bertot said:
Q: related to Sophie's second answer: is understanding which symbols are important a matter of domain expertise, or rather of expertise in automated proving?
Domain expertise can certainly help in particular cases. On the automatic provers' side, the best symbol precedences we know of are based on how often the symbols appear (in inverse order of frequency).
@Sophie Tourret , thanks.
Some researchers are also trying to use machine learning to find more domain-specific precedences of symbols.
Bas Spitters said:
Chantal Keller Q: EasyCrypt uses Why3 to interface with SMT solvers. I know SMTCoq gives verified proofs. Still, do you see a useful connection with Why3?
Our small-step transformations of Coq goals into first-order logic are inspired by Why3's approach. We are currently building a framework to provide different ways to certify them.
@Chantal Keller So, could one expect a verified version of that part of why3?
I believe some similar unverified transformations are going on in F*, aren't they?
Yes, the plan is to provide certified or certifying transformations for Why3 as well (a branch of Why3 already provides some simple transformations of this kind).
F* is more like a hammer: the encoding is one big single-step transformation, hard to certify.
@Chantal Keller Thanks!
And Elpi?
Yes, please. Would it help to state a few example problems?
I could also contribute the Ltac2 solutions
Yes. I guess someone has to take the lead and act as the hub, i.e. collect problems and solutions
@Chantal Keller, there are some input/output/system APIs in Elpi (standard λProlog APIs, nothing Coq-specific), documented here: https://github.com/LPCIC/coq-elpi/blob/master/elpi-builtin.elpi#L1079 .
I'm happy to add more, but I'd like to know what you really need. The corresponding OCaml functions would be a great pointer (binding them to Elpi is easy).
@Enrico Tassi : OK, I will set something up.
Enrico Tassi said:
Chantal Keller, there are some input/output/system APIs in Elpi (standard λProlog APIs, nothing Coq-specific), documented here: https://github.com/LPCIC/coq-elpi/blob/master/elpi-builtin.elpi#L1079 .
I'm happy to add more, but I'd like to know what you really need. The corresponding OCaml functions would be a great pointer (binding them to Elpi is easy).
Cool, I will look at it! Something that I also need is parsing...
@Chantal Keller do you have control over the format? (can it be converted to "lisp" beforehand?)
I mean, parsing is the job of yacc & co., not of Elpi or Ltac or... you name it. It can surely be done, but it looks like it's not the right tool for that.
Sophie Tourret thanks for the pointer, I will look at the papers from PAAR, then.
I was referring to: Filip Bártek and Martin Suda, "Learning Precedences from Simple Symbol Features". You can find it here: http://paar2020.gforge.inria.fr/
thx!
@Enrico Tassi I rather plan to remove parsing as much as possible in the future, by doing I/O through APIs
Chantal Keller said:
Enrico Tassi I rather plan to remove parsing as much as possible in the future, by doing I/O through APIs
Looks like a sound idea ;-)
Janno said:
Enrico Tassi said:
Question for Janno, Beta Ziliani: are there good examples of Mtac2 code we can look at? I'd love examples in which the rich types of tactics make Mtac2 shine.
We have a few examples in the repository itself, but we don't really have any big developments. We have the case study from the Mtac2 paper, but that's far from big. It did clearly indicate, though, that static types help both in debugging and with performance. I'll compile a list of instructive examples after the panel.
It turns out to be rather hard to find Mtac2 code that is in any way easy to digest (owing mostly to the fact that a lot of it is written by me, and I am usually up to no good, i.e. trying to avoid any possible use of non-dependent types!). I'd love to share the code from my work at BedRock but sadly it is not open source at this point. It has comments and all that extra stuff you don't get in research-grade prototypes... I hope to be able to share it some day.
So at this point I am afraid the best thing to look at is still the Iris case study. The code is bitrotting quite quickly, but just looking at Mtac2 and Ltac code side by side might already give people an idea of what is possible. Consider, for example, this tactic called `wp_pure` implemented in Mtac2: https://gitlab.mpi-sws.org/janno/iris-coq/-/blob/mtac2-tt/theories/heap_lang/proofmode.v#L126
The Ltac code is right below it and you'll notice that they look very similar. However, the Mtac2 code has a long list of advantages over the Ltac code:
- it breaks at definition time if the lemma it applies (`tac_wp_pure`) is changed in a way that would break the tactic (or, really, if anything used inside the tactic changes in such a way);
- the shape of goal it operates on is documented in its types (the `IHeapGoal` record being destructed in the first line);
- it uses exceptions (the `TryNextDecomposition` exception) to drive backtracking very explicitly, something that is impossible to do in Ltac in a modular way.

There are some caveats to what I list above, such as the fact that the Mtac2 version uses a helper function to match on the right shape of goals, or the fact that the Mtac2 version of `reshape_expr` (also used in the tactic) is certifying and thus harder to implement than the Ltac version. And probably some other bits... I would count most of these as general overhead of typed versus untyped languages, and you can always opt to go back to less statically typed code. (Also, while the Mtac2 version executes faster than the Ltac version, there is currently a bug that makes Coq slower overall when importing the Mtac2 version. This should be fixed soon: https://github.com/coq/coq/pull/12449.)
Michael Soegtrop said:
Yes, please. Would it help to state a few example problems?
Among the examples that I would like to see are some variants of reification, starting from something really simple (e.g., propositional logic) and then maybe something with binders.
@Christian Doczkal : can you come up with an example of what you have in mind? I will set up a GitHub project to collect the examples and solutions.
@Michael Soegtrop @Christian Doczkal the example for reification I have in elpi is this one: https://github.com/LPCIC/coq-elpi/blob/master/theories/examples/example_reflexive_tactic.v
It is the classic monoid example. It has no proper binders in the reified language, but it does have variables and an environment (that is, a binder for the expressions that are not concrete values of the reified language).
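For concreteness, here is a minimal self-contained sketch of the Coq side of that pattern, instantiated to the (nat, +, 0) monoid (the names are illustrative, not the ones from the elpi file): reify into a deep embedding, normalize by flattening, and conclude with a soundness lemma.

```coq
Require Import List Arith Lia. Import ListNotations.

(* The reified language: variables are indices into an environment. *)
Inductive expr := Unit | Var (i : nat) | Op (a b : expr).

Fixpoint denote (env : list nat) (e : expr) : nat :=
  match e with
  | Unit => 0
  | Var i => nth i env 0
  | Op a b => denote env a + denote env b
  end.

(* Normal form: the list of variable indices, left to right. *)
Fixpoint flatten (e : expr) : list nat :=
  match e with
  | Unit => []
  | Var i => [i]
  | Op a b => flatten a ++ flatten b
  end.

Fixpoint sum_vars (env l : list nat) : nat :=
  match l with
  | [] => 0
  | i :: l' => nth i env 0 + sum_vars env l'
  end.

Lemma sum_vars_app env l1 l2 :
  sum_vars env (l1 ++ l2) = sum_vars env l1 + sum_vars env l2.
Proof. induction l1; simpl; lia. Qed.

Lemma denote_flatten env e : denote env e = sum_vars env (flatten e).
Proof. induction e; simpl; rewrite ?sum_vars_app; lia. Qed.

(* The lemma a reflexive tactic applies after reifying both sides. *)
Lemma flatten_sound env a b :
  flatten a = flatten b -> denote env a = denote env b.
Proof. intros H. rewrite !denote_flatten, H. reflexivity. Qed.
```

The metaprogram's only job is then to produce `env`, `a`, and `b` such that the goal is convertible to `denote env a = denote env b`, and to apply `flatten_sound`, discharging the `flatten a = flatten b` premise by computation.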
@Christian Doczkal do you think having proper binders in the reified language changes things substantially?
@Christian Doczkal : do you have a better example than @Enrico Tassi provided or can you adjust it to what you want? I can then translate it into Ltac2.
@Michael Soegtrop , monoid equalities are just as good a first example as propositional logic, and I don't have a better example to point to. A bachelor's student of mine once did a reflective solver for boolean tautologies, but the code is part of his thesis development.
@Christian Doczkal : fine, I will see if I can come up with a Ltac2 version of this example and ask others to do a Mtac2 and a MetaCoq version.
A more advanced example might be something like the `ra_fold` tactic of the `relation-algebra` library developed by Damien Pous: https://github.com/damien-pous/relation-algebra/blob/master/ra_fold.mlg
The tactic takes a structure (i.e., an instance of a Kleene algebra) and tries to find the operations of said algebra. For instance, when given the "heterogeneous relations over fintype" instance, it will convert `connect e =2 connect e'` to `e^* ≡ e'^*`.
The full tactic implementation is pretty long, due to the large number of operations, but maybe discovering monoid operations up to conversion would be a nice example too.
@Enrico Tassi, I had a look at your reflexive example. Since I am not used to the Prolog way of thinking, I found the `mem`/`close` combination quite insightful. The first clause of `mem` has a non-linear pattern:

```
mem {{ lp:X :: _ }} X {{ O }} :- !.
```

Does this check for syntactic equality, or for equality up to conversion? That is, what's the "result" or "behavior" of `mem (3 :: L) (0+3) N`?
@Christian Doczkal that's a technical eye!
The notion of equality of Elpi corresponds to Coq's syntactic one (it is up to βη, but for the λProlog function space, not the Coq one) on ground terms; on terms with holes it is essentially pattern-fragment unification, a sort of first-order-like unification.
On one hand Elpi is not Coq-specific, so its equality must make sense on its own; on the other hand, no matter what you code, you need 2 (or more) equality tests, if only for performance and predictability. If you want to compare Coq terms using Coq's unification you can call coq.unify-eq (or coq.unify-leq). In that case you would have to write `mem` differently, and the pattern would become linear. Something like `mem {{ lp:X :: _ }} Y {{ 0 }} :- coq.unify-eq X Y ok, !.` should work.
Wrt the way I build the "reification map", I did it that way for two reasons. First, it is the cheapest one in logic programming, and it's kind of cool ;-). Second, it is what you would also do with type classes (which are logic programming in disguise). It is clearly not very efficient; one could use an (Elpi, or even OCaml) balanced tree during reification and convert it back to a Coq list at the very end, but I believe that would be too much for an example like this. More generally, my take is that if you really, really need speed in meta-level code, you write it in OCaml (you may prototype it in your favorite extension language... but that is all).
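To illustrate the "type classes are logic programming in disguise" remark, here is a hypothetical toy reifier where instance search plays the role of the clauses (it stores atoms directly rather than building an environment; the fallback relies on instance priorities and unification details, so treat it as a sketch):

```coq
Inductive expr := Atom (n : nat) | Plus (a b : expr).

(* Each instance acts like a clause: "to reify a + b, reify a and b";
   the low-priority catch-all turns anything else into an Atom. *)
Class Reify (n : nat) := reified : expr.

#[local] Instance reify_plus (a b : nat) `{Reify a} `{Reify b}
  : Reify (a + b) | 0 :=
  Plus (reified (n := a)) (reified (n := b)).

#[local] Instance reify_atom (n : nat) : Reify n | 100 := Atom n.

(* Instance search "runs the logic program". *)
Compute (reified (n := 1 + (2 + 3))).
(* expected: Plus (Atom 1) (Plus (Atom 2) (Atom 3)) *)
```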
@Enrico Tassi , thank you very much for the detailed explanation. I'm not worried about the speed of reification. The decision procedures I would be interested in are all worst-case exponential, so if reification were to take noticeable time, the decision procedure would time out anyways.
Here are several Mtac2 versions of the elpi code: https://gist.github.com/Janno/b0556cca413568096a27192dfcd3ac82 (now updated to remove some outdated comments). Version 2 (`Mtac_V2` in the code) corresponds most closely to the elpi code, I think. The last version, `Mtac_V3`, uses a certifying reification metaprogram and typed tactics to avoid the unification cost of calling the `change` tactic. (It's also just awesome to know that the reification tactic is actually correct, and not just coincidentally correct on whatever limited examples one has tried.)
Christian Doczkal said:

The full tactic implementation is pretty long, due to the large number of operations, but maybe discovering monoid operations up to conversion would be a nice example too.

With the way I implemented things in Mtac2 above, one can actually fine-tune the amount of conversion that is performed when looking for the monoid operations by using different `mmatch` branch types: `=u>` and `=c>` perform full unification with reduction (via UniCoq and Coq's own Evarconv, respectively), `=n>` performs no reduction at all, and the default `=>` performs reduction only in the discriminee, not in the pattern (and it will not instantiate evars in the discriminee).
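An untested fragment to make the difference concrete (treat the exact syntax as an assumption based on the description above):

```coq
From Mtac2 Require Import Mtac2.

Definition two := 2.

(* Sketch: with =n> the first branch matches only the literal syntax 2,
   so `two` should fall through to the catch-all; with =u> the same
   pattern would also match `two`, since unification may reduce it. *)
Definition is_two_syntactic (n : nat) : M bool :=
  mmatch n with
  | 2 =n> M.ret true
  | [? m] m => M.ret false
  end.
```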
This seems unnecessary to me: https://gist.github.com/Janno/b0556cca413568096a27192dfcd3ac82#file-elpi-reification-port-v-L212 . In the examples I have, it is always the case that expressions outside the syntax are variables, but they could be anything.
Ah, I wasn't quite sure about that part, so I implemented something conservative. The code will not break when removing the check, but using the list for non-var terms opens up a few questions regarding the lookup function (do we search modulo conversion, for example?).
I removed the check. Without examples it's hard to say how exactly to best adapt the lookup function(s), so I kept them as they are right now, which probably entails some differences between V1 and V2&V3, since they use different `mmatch` patterns.
This is related to the simplification rules. In a monoid you don't have rules like `forall x, x - x = 0`. If you did, it would be more important to reify equal terms as equal, and for that you would search the list up to conversion/unification rather than syntactically, as I do in my code. I guess it is interesting anyway to point that out in the example and show how hard it would be to search up to conversion...
Enrico Tassi said:

More generally, my take is that if you really, really need speed in meta-level code, you write it in OCaml
@Enrico Tassi I wanted to come back to this, but it took me a while to figure out how to reply. I think if what you say is true, then we've got a huge problem. Writing Coq plugins that actually work is extremely hard. I don't think I've tried a single (out-of-tree) plugin that didn't break pretty much immediately when I tried it. They all have various restrictions on what kind of terms they operate on, what environments they can be called from (`Section`s tend to be problematic, for example), which Coq settings they can tolerate/make use of (universe polymorphism, for example), etc.
And as a developer of a Coq plugin I can say that I would never, ever recommend that anybody write a Coq plugin unless it's absolutely necessary, or the scope of the plugin is definable, defined, and very small. I think my response here may be slightly less extreme in the future, since there is a lot of work going on that tries to make it easier to develop plugins, package them, deploy them on various platforms, etc. But I don't think it will ever reach a point where I'd consider it anything but the very last resort.
There is some truth to what you say, of course, which is that manipulating ASTs in OCaml will always be at least a bit faster than doing it through some kind of abstraction. I don't remember ever seeing an example of this really being the bottleneck, though. In my experience, typechecking is usually a much bigger factor, and so is actual computation (in reflective proofs, at least). But that's just my own experience... Maybe we should start collecting examples of different kinds of bottlenecks so that we can have a more concrete discussion about this.
From outside, I agree with @Janno; using DSLs and high-level languages is one of the lessons from programming language research.
An expressive _and efficient_ high-level (metaprogramming) language can give _leverage_ to users, since it allows them to work faster. But if performance isn't sufficient for their use, the benefit is more limited.
Of course, the performance doesn't need to be there from the start, but you need a chance to get it later — which sometimes means designing for performance from the start.
AFAIU, Lean 4's design seems better here — they invested in making Lean efficient enough to implement most of Lean in it, to enable customizing as much as possible without needing native extensions. That's a bet, but it might enable the non-kernel code to evolve (and be customized) much faster.
Coq is moving in that direction while preserving compatibility, and that's great, but let's understand the goal and the tradeoffs.
@Janno I guess my message was not very clear. I do like high level domain specific languages for customizing Coq and I recommend using this class of languages to extend Coq. But I also don't want to hide the fact that "high level" rarely comes for free (at least for Elpi).
To be concrete, if you take the MetaCoq or Ltac2 datatype for terms, it is very similar to the ML one, and indeed you could (as Lean does) compile metaprograms to the host language, achieving almost identical performance (we are not there yet today, AFAIK). At least in the context of Coq, which is implemented in ML, I don't consider this approach "high level", since you could write almost the same code in ML (there are other cool things about MetaCoq, like the possibility to reason about the metaprogram, but you may not need that).
Elpi and Mtac2 are, to my eyes, more "high level" than that. For example, they represent terms in such a way that variables are not integers, and that feature comes at a cost. Similarly, they provide language primitives like unification, which is necessarily more expensive than the syntactic matching you get in ML. And don't get me wrong, I do believe it is totally worth it.
When I wrote my message I just wanted to be honest with the users (of Elpi) and I couldn't claim that Elpi is as fast as ML, or that it will be. Hence my suggestion to use ML if one absolutely needs speed, at least as things stand today.
About the cost of type checking dominating everything else: it may be true and unavoidable in some cases (e.g. reflexive tactics), but in other ones it's just the lack of control that makes it expensive. For example, it's very hard in Ltac1 (even with the "new" uconstr data type) to control when (read: how many times) your terms are type checked. In ML it is simple, and a common mistake is actually to forget to type check your terms ;-) For example, last time I checked, the odd order theorem spends 1/3 of its time type checking finished proof terms (Qed time) and 2/3 building the terms via tactics (written in ML). Given these numbers, I'm pretty sure that the ssr tactics don't type check terms too many times.
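A tiny illustration of that control issue in Ltac1 terms (a sketch: `constr:` elaborates and typechecks once at binding time, while a `uconstr:` stays untyped and is typechecked again at every use site):

```coq
Goal True.
  (* constr: the term I is elaborated and typechecked here, once. *)
  let t := constr:(I) in exact t.
Qed.

Goal True /\ True.
  (* uconstr: t is bound once but typechecked twice, once per goal
     created by split. *)
  let t := uconstr:(I) in split; refine t.
Qed.
```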