Stream: Coq workshop 2020

Topic: [M] Panel on automation


view this post on Zulip Théo Zimmermann (Jun 24 2020 at 11:15):

This is the topic to ask questions to the panelists on domain-agnostic and domain-specific proof automation.

view this post on Zulip Emilio Jesús Gallego Arias (Jul 06 2020 at 13:27):

Panel starting in 30 mins!

view this post on Zulip Emilio Jesús Gallego Arias (Jul 06 2020 at 14:01):

Just started!

view this post on Zulip Chantal Keller (Jul 06 2020 at 14:11):

Here is the link to SMTCoq: https://smtcoq.github.io

view this post on Zulip Beta Ziliani (Jul 06 2020 at 14:11):

To Sophie, you mention that superposition is better for non-ground terms. Is it worse for ground ones? Or is it just that SMT solvers are more engineered?
Or how do they relate?

view this post on Zulip Chantal Keller (Jul 06 2020 at 14:13):

As far as I can tell, SMT solvers are better when built-in theories are on board

view this post on Zulip Sophie Tourret (Jul 06 2020 at 14:13):

For sure

view this post on Zulip Sophie Tourret (Jul 06 2020 at 14:13):

The approach of SMT is also generally better when the quantification is very shallow.

view this post on Zulip Bas Spitters (Jul 06 2020 at 14:18):

@Chantal Keller Q: Easycrypt uses why3 to interface with smt solvers. I know smtcoq gives verified proofs. Still do you see a useful connection with why3 ?

view this post on Zulip Enrico Tassi (Jul 06 2020 at 14:20):

Question for @Janno @Beta Ziliani : are there good examples of MTac2 code we can look at? I'd love examples in which the rich types of tactics makes MTac2 shine.

view this post on Zulip Sophie Tourret (Jul 06 2020 at 14:23):

@Beta Ziliani There are two competition where you can compare the performances of superposition and smt: casc, that is more superposition oriented ( http://www.tptp.org/CASC/27/WWWFiles/DivisionSummary1.html ) and the SMT competition ( https://smt-comp.github.io/2020/results.html ), where CVC4 (SMT) and Vampire (superposition) have a go at the other competition.

view this post on Zulip Sophie Tourret (Jul 06 2020 at 14:25):

Both are among the bests of their techniques. Vampire dominates the superposition world (at least in FOL) and afaik CVC4 shares the glory with a few others like z3, yices2...

view this post on Zulip Richard L Ford (Jul 06 2020 at 14:29):

@Jason Gross mentioned tools he developed that are more efficient. Are those available?

view this post on Zulip Janno (Jul 06 2020 at 14:29):

Enrico Tassi said:

Question for Janno Beta Ziliani : are there good examples of MTac2 code we can look at? I'd love examples in which the rich types of tactics makes MTac2 shine.

We have a few examples in the repository itself but we haven't really any big developments. We have the case study from the Mtac2 paper but that's far from big. It did clearly indicate though that the static types help both in debugging and also with performance. I'll compile a list of instructive examples after the panel.

view this post on Zulip Yves Bertot (Jul 06 2020 at 14:29):

Q: related to Sophie's second answer: understanding what are the important symbols, is this related to domain expertise, or is it rather an expertise on automated proof?

view this post on Zulip Beta Ziliani (Jul 06 2020 at 14:30):

@Enrico Tassi I'll try to keep it short. There are two uses of types. For almost full-certification of a tactic, we don't have small examples, since types usually help in big developments. In our Mtac2 paper we show its use in Iris.

Now, types also help avoiding stupid errors, like when pattern-matching a term (forward tactics):

Ltac example n :=
   match n with
   | I can put almost anything here => ...
   end.

For reflective proofs, this is uses quite a lot and it's a real pain in untyped languages.

view this post on Zulip Michael Soegtrop (Jul 06 2020 at 14:30):

Especially useful would be an example of a two or three common automation tasks in Ltac2, Mtac2, MetaCoq and as OCaml plugin.

view this post on Zulip Cyril Cohen (Jul 06 2020 at 14:31):

Michael Soegtrop said:

Especially useful would be an example of a two or three common automation tasks in Ltac2, Mtac2, MetaCoq and as OCaml plugin.

And Elpi?

view this post on Zulip Christian Doczkal (Jul 06 2020 at 14:31):

Q: If I where to start developing a reflective decision procedure today, what' the current best approach to use for reification (i.e., converting a given goal to a deeply embedded term language and the map from the variables in said term to the abstracted parts of the goal).

view this post on Zulip Jason Gross (Jul 06 2020 at 14:32):

@Richard L Ford The reflective rewriting tool is at https://github.com/mit-plv/rewriter. It unfortunately currently has a number of limitations (no dependent types, hard-coded container types), so while it's sufficient for fiat-crypto, it's unfortunately more of a research prototype than a tool ready for wide use. (The paper on it is currently being submitted to POPL.)
The faster way of doing reification is much more widely applicable and is available at http://adam.chlipala.net/papers/ReificationITP18/

view this post on Zulip Sophie Tourret (Jul 06 2020 at 14:32):

Yves Bertot said:

Q: related to Sophie's second answer: understanding what are the important symbols, is this related to domain expertise, or is it rather an expertise on automated proof?

Domain expertise can certainly help in particular cases. On the automatic provers' side, the best precedence of symbols that we know of is related to how often the symbols appear (in reverse order).

view this post on Zulip Yves Bertot (Jul 06 2020 at 14:33):

@Sophie Tourret , thanks.

view this post on Zulip Sophie Tourret (Jul 06 2020 at 14:33):

Some researchers are also trying to use machine learning to find more domain-specific precedences of symbols.

view this post on Zulip Chantal Keller (Jul 06 2020 at 14:34):

Bas Spitters said:

Chantal Keller Q: Easycrypt uses why3 to interface with smt solvers. I know smtcoq gives verified proofs. Still do you see a useful connection with why3 ?

Small-step transformations to transform Coq goals into first-order logic is inspired from Why3's approach. We are currently building a framework to provide different ways to certify them.

view this post on Zulip Bas Spitters (Jul 06 2020 at 14:37):

@Chantal Keller So, could one expect a verified version of that part of why3?
I believe some similar unverified transformations are going on in F*, isn't it?

view this post on Zulip Chantal Keller (Jul 06 2020 at 14:38):

Yes, the plan is to provide certified or certifying transformations for Why3 as well (a branch of Why3 already provides simple such transformations)

view this post on Zulip Chantal Keller (Jul 06 2020 at 14:39):

F* is more like a hammer, the encoding is a big one step transformation, hard to certifiy

view this post on Zulip Bas Spitters (Jul 06 2020 at 14:39):

@Chantal Keller Thanks!

view this post on Zulip Michael Soegtrop (Jul 06 2020 at 14:40):

And Elpi?

Yes, please. Would it help to state a few example problems?

view this post on Zulip Michael Soegtrop (Jul 06 2020 at 14:41):

I could also contribute the Ltac2 solutions

view this post on Zulip Enrico Tassi (Jul 06 2020 at 14:42):

Yes. I guess someone has to take the lead and act as the hub, i.e. collect problems and solutions

view this post on Zulip Enrico Tassi (Jul 06 2020 at 14:47):

@Chantal Keller , there are some input/output/system API in Elpi (standard \lambdaProlog API, nothing Coq specific) documented here https://github.com/LPCIC/coq-elpi/blob/master/elpi-builtin.elpi#L1079 .
I'm happy adding more, but I'd like to know what you really need. The corresponding OCaml functions would be a great pointer (binding them to Elpi is easy).

view this post on Zulip Michael Soegtrop (Jul 06 2020 at 14:47):

@Enrico Tassi : OK, I will set something up.

view this post on Zulip Chantal Keller (Jul 06 2020 at 14:52):

Enrico Tassi said:

Chantal Keller , there are some input/output/system API in Elpi (standard \lambdaProlog API, nothing Coq specific) documented here https://github.com/LPCIC/coq-elpi/blob/master/elpi-builtin.elpi#L1079 .
I'm happy adding more, but I'd like to know what you really need. The corresponding OCaml functions would be a great pointer (binding them to Elpi is easy).

Cool, I will look at it! Something that I also need is parsing...

view this post on Zulip Enrico Tassi (Jul 06 2020 at 14:54):

@Chantal Keller do you have control on the format? (can it be converted to "lisp" beforehand?)

view this post on Zulip Enrico Tassi (Jul 06 2020 at 14:55):

I mean, parsing is the job of yacc & co, not of Elpi or Ltac or ... you name it. It can be surely done, but looks like it's not the right tool for that.

view this post on Zulip Yves Bertot (Jul 06 2020 at 14:57):

Sophie Tourret thanks for the pointer, I will look at the papers from PAAR, then.

view this post on Zulip Sophie Tourret (Jul 06 2020 at 14:59):

I was refering to: Filip Bártek and Martin Suda, "Learning Precedences from Simple Symbol Features". You can find it here: http://paar2020.gforge.inria.fr/

view this post on Zulip Yves Bertot (Jul 06 2020 at 15:01):

thx!

view this post on Zulip Chantal Keller (Jul 06 2020 at 15:02):

@Enrico Tassi I rather plan to remove parsing as much as possible in the future, by doing I/O through APIs

view this post on Zulip Enrico Tassi (Jul 06 2020 at 15:02):

Chantal Keller said:

Enrico Tassi I rather plan to remove parsing as much as possible in the future, by doing I/O through APIs

Looks like a sound idea ;-)

view this post on Zulip Janno (Jul 06 2020 at 16:13):

Janno said:

Enrico Tassi said:

Question for Janno Beta Ziliani : are there good examples of MTac2 code we can look at? I'd love examples in which the rich types of tactics makes MTac2 shine.

We have a few examples in the repository itself but we haven't really any big developments. We have the case study from the Mtac2 paper but that's far from big. It did clearly indicate though that the static types help both in debugging and also with performance. I'll compile a list of instructive examples after the panel.

It turns out to be rather hard to find Mtac2 that is in any way easy to digest (owing mostly to the fact that a lot of it is written by me and I am usually up to no good, i.e. trying to avoid any possible use of non-dependent types!). I'd love to share the code from my work at BedRock but sadly that is not open source at this point. It has comments and all that extra stuff you don't get in research-grade prototypes... I hope to be able to share it some day.
So at this point I am afraid the best thing to look at is still the iris case study. The code is bitrotting quite quickly but just looking at Mtac2 and Ltac code side by side might already give people an idea of what is possible. Consider this tactic called wp_pure implemented in Mtac2, for example:
https://gitlab.mpi-sws.org/janno/iris-coq/-/blob/mtac2-tt/theories/heap_lang/proofmode.v#L126
The Ltac code is right below it and you'll notice that they look very similar. However, the Mtac2 code has long list of advantages over the Ltac code:

  1. It's statically typed and will break at definition time if the signature of the core lemma (somewhat confusingly called tac_wp_pure) is changed in a way that would break the tactic (or, really, if anything used inside the tactic changes in such a way)
  2. This guarantee covers the question of whether the lemma solves the right goal (implicitly described by the pieces inside the IHeapGoal record being destructed in the first line)
  3. .. and it extends to the direct arguments given to the lemma
  4. .. and it also extends to the follow-up tactics applied to the subgoals generated by the application of that lemma.
  5. The tactic avoids unecessary backtracking by using different exceptions (in this case the TryNextDecomposition exception) to drive backtracking very explicitly, something that is impossible to do in Ltac in a modular way.
  6. It is actually faster than the ltac1 version since it avoids several costly calls to unification by exploiting static types.
  7. It does all of that in a 100% type safe way so it won't mess up your proof state, despite offering performance that the ltac version could only reach (without using a completely different approach) by using unsafe tactics.

There are some caveats to what I list above, such as the fact that the Mtac2 version uses a helper function to match on the right shape of goals or the fact that the Mtac2 version of reshape_expr (also used in the tactic) is certifying and thus harder to implement than the Ltac version. And probably some other bits.. I would count most of these as general overhead of typed versus untyped languages and you can always opt to go back to less-statically typed code. (Also, while the Mtac2 version executes faster than the Ltac version, there is currently a bug that makes Coq slower overall when importing the Mtac2 version. This should be fixed soon: https://github.com/coq/coq/pull/12449.)

view this post on Zulip Christian Doczkal (Jul 06 2020 at 16:26):

Michael Soegtrop said:

Yes, please. Would it help to state a few example problems?

Among the examples that I would like to see would be some variants of reification, starting from something really simple (e.g., propositional logic) and then maybe something with binders.

view this post on Zulip Michael Soegtrop (Jul 06 2020 at 17:08):

@Christian Doczkal : can you come up with an example of what you have in mind? I will setup a github project to collect the examples and solutions.

view this post on Zulip Enrico Tassi (Jul 06 2020 at 17:47):

@Michael Soegtrop @Christian Doczkal the example for reification I have in elpi is this one: https://github.com/LPCIC/coq-elpi/blob/master/theories/examples/example_reflexive_tactic.v
It is the classic monoid example. It has no proper binder is the reified language, but has variables and an environment (that is a binder for the expressions that are not concrete values of the reified language).

@Christian Doczkal do you think having proper binders in the reified language changes things substantially?

view this post on Zulip Michael Soegtrop (Jul 07 2020 at 06:25):

@Christian Doczkal : do you have a better example than @Enrico Tassi provided or can you adjust it to what you want? I can then translate it into Ltac2.

view this post on Zulip Christian Doczkal (Jul 07 2020 at 07:02):

@Michael Soegtrop , monoid equalities are just as good a first example as propositional logic, and I don't have a better example to point to. A bachelor's student of mine once did a reflective solver for boolean tautologies, but the code is part of his thesis development.

view this post on Zulip Michael Soegtrop (Jul 07 2020 at 07:04):

@Christian Doczkal : fine, I will see if I can come up with a Ltac2 version of this example and ask others to do a Mtac2 and a MetaCoq version.

view this post on Zulip Christian Doczkal (Jul 07 2020 at 07:23):

A more advanced example might be something like the ra_fold tactic of the relation-algebra library developed by Damien Pous: https://github.com/damien-pous/relation-algebra/blob/master/ra_fold.mlg
The tactic takes a structure (i.e.., an instance of a Kleene Algebra) and tries to find the operations of said algebra. For instance, when given the "heterogeneous relations over fintype" instance, it will convert connect e =2 connect e' to e^* ≡ e'^*

The full tactic implementation is pretty long, due to the large number of operations, but maybe discovering monoid operations up to conversion would be a nice example too.

view this post on Zulip Christian Doczkal (Jul 07 2020 at 07:33):

@Enrico Tassi , I had a look at your reflexive example. Since I am not used to the Prolog way of thinking, I found the mem/close combination quite insightful. The first clause of mem has a non-linear pattern:

mem {{ lp:X :: _     }} X {{ O      }} :- !.

Does this check for syntactic equality, or does this check for equality up to conversion? That is, what's the "result" or "behavior" of mem (3 :: L) (0+3) N

view this post on Zulip Enrico Tassi (Jul 07 2020 at 08:15):

@Christian Doczkal that's a technical eye!
The notion of equality of Elpi corresponds to Coq's syntactic one (it is up to \beta\eta but for the \lambdaProlog function space, not the Coq one) on ground terms (on terms with holes it is essentially pattern-fragment unification, a sort of first order like unification).

On one hand Elpi is not Coq specific so it's equality must make sense on its own, on the other hand no matter what you code you need 2 (or more) equality tests, if only for performance and predictability. If you want to compare Coq terms using Coq's unification you can call coq.unify-eq (or coq.unify-leq). In that case you would have to write mem differently, and the pattern would become linear. Something like mem {{ lp:X :: _ }} Y {{ 0 }} :- coq.unify-eq X Y ok, !. should work.

Wrt the way I build the "reification map", I did it that way for two reasons. First it is the cheapest one in logic programming, and it's kind of cool ;-). Second, it is what you would also do with type classes (which is logic programming in disguise). It is clearly not very efficient, one could use an (Elpi, or even OCaml) balanced tree during reification and convert it back to a Coq list at the very end, but I believe that it would be too much for an example like that. More in general my take is that if you really really need speed in meta level code, you write it in OCaml (you may prototype it in your favorite extension language... but that is all).

view this post on Zulip Christian Doczkal (Jul 07 2020 at 09:24):

@Enrico Tassi , thank you very much for the detailed explanation. I'm not worried about the speed of reification. The decision procedures I would be interested in are all worst-case exponential, so if reification were to take noticeable time, the decision procedure would time out anyways.

view this post on Zulip Janno (Jul 07 2020 at 13:41):

Here are several Mtac2 versions of the elpi code: https://gist.github.com/Janno/b0556cca413568096a27192dfcd3ac82 (now updated to remove some outdated comments). Version 2 (Mtac_V2 in the code) corresponds most closely to the elpi code, I think. The last version, Mtac_V3 uses a certifying reification metaprogram and typed tactics to avoid the unification cost of calling the change tactic. (It's also just awesome to know that the reification tactic is actually correct and not just coincidentally correct on whatever limited examples one has tried.)

view this post on Zulip Janno (Jul 07 2020 at 15:22):

The full tactic implementation is pretty long, due to the large number of operations, but maybe discovering monoid operations up to conversion would be a nice example too.

The way I implemented things in Mtac2 above one can actually fine-tune the amount of conversion that is performed when looking for the monoid operations by using different mmatch branche types. =u> and =c> perform full unification (UniCoq and Coq's own Evarconv, respectively) with reduction,=n> performs no reduction at all, and the default => performs reduction only in the discriminee, not in the pattern (and it will not instantiate evars in the discriminee)

view this post on Zulip Enrico Tassi (Jul 07 2020 at 15:23):

This seems unnecessary to me https://gist.github.com/Janno/b0556cca413568096a27192dfcd3ac82#file-elpi-reification-port-v-L212 . In the examples I have it is always the case that expressions outside the syntax are variables, but could be anything.

view this post on Zulip Janno (Jul 07 2020 at 15:25):

Ah, I wasn't quite sure about that part so I implemented something conservative. The code will not break when removing the check but using the list for non-var terms opens up a few question regarding the lookup function (do we search modulo conversion, for example).

view this post on Zulip Janno (Jul 07 2020 at 15:33):

I removed the check. Without examples it's hard to say how exactly to best adapt the lookup function(s) so I kept them as they are right now—which probably entails some difference V1 and V2&V3 since they use different mmatch patterns.

view this post on Zulip Enrico Tassi (Jul 07 2020 at 18:58):

This is related to the simplification rules. In a monoid you don't have rules like forall x, x - x = 0. In that case it would be more important to reify equals as equals, and for that you would search in the list up to conversion/unification rather than syntactically as I do in my code. I guess it is anyway interesting to point that out in the example and show how hard it would be to search up to...

view this post on Zulip Janno (Jul 10 2020 at 14:55):

More in general my take is that if you really really need speed in meta level code, you write it in OCaml

@Enrico Tassi I wanted to come back to this but it took me a while to figure out how to reply. I think if what you say is true then we've got a huge problem. Writing Coq plugins that actually work is extremely hard. I don't think I've tried a single (out-of-tree) plugin that didn't break pretty much immediately when I tried it. They all have various restrictions on what kind of terms they operate on, what environments they can be called from (Section tend to be problematic, for example), on which Coq settings they can tolerate/make use of (universe polymorphism, for example), etc.
And as a developer of a Coq plugin I can say that I would never, ever recommend anybody to write a Coq plugin unless it's absolutely necessary or the scope of the plugin is definable, defined, and very small. I think my response here may be slightly less extreme in the future since there is a lot of work going on that tries to make it easier to develop plugins, package plugins, deploy them on various platforms, etc. But I don't think it will ever reach a point where I'd consider it anything but the very last resort.
There is some truth to what you say, of course, which is that manipulating ASTs in OCaml will always be at least a bit faster than doing it through some kind of abstraction. I don't remember ever seeing an example of this really being the bottleneck, though. In my experience, typechecking is usually a much bigger factor, and so is actual computation (in reflective proofs, at least). But that's just my own experience.. Maybe we should start collecting examples of different kinds of bottlenecks so that we can have a more concrete discussion about this.

view this post on Zulip Paolo Giarrusso (Jul 10 2020 at 18:08):

From outside, I agree with @Janno; using DSLs and high-level languages is one of the lessons from programming language research.
An expressive _and efficient_ high-level (metaprogramming) language can give _leverage_ to users, since it allows them to work faster. But if performance isn't sufficient for their use, the benefit is more limited.
Of course, the performance doesn't need to be there from the start, but you need a chance to get it later — which sometimes means designing for performance from the start.

AFAIU, Lean 4's design seems better here — they invested in making Lean efficient enough to implement most of Lean in it, to enable customizing as much as possible without needing native extensions. That's a bet, but it might enable the non-kernel code to evolve (and be customized) much faster.
Coq is moving in that direction while preserving compatibility, and that's great, but let's understand the goal and the tradeoffs.

view this post on Zulip Enrico Tassi (Jul 11 2020 at 13:56):

@Janno I guess my message was not very clear. I do like high level domain specific languages for customizing Coq and I recommend using this class of languages to extend Coq. But I also don't want to hide the fact that "high level" rarely comes for free (at least for Elpi).

To be concrete, if you take the MetaCoq or Ltac2 datatype for terms it is very similar to the ML one, and indeed you could (as Lean does) compile meta programs to the host language achieving almost identical performances (we are not there yet today, AFAIK). At least in the context of Coq, that is implemented in ML, I don't consider this approach "high level", since you could write almost the same code in ML (there are other cool things about MetaCoq, like the possibility to reason about the meta program, but you may not need it).
Elpi and Mtac2 are, to my eyes, more "high level" than that. For example they represent terms in such a way that variables are not integers, and that feature comes at a cost. Similarly they provide language primitives like unification that is necessarily more expensive than the syntactic matching you get in ML. And don't get me wrong, I do believe it is totally worth it.

When I wrote my message I just wanted to be honest with the users (of Elpi) and I couldn't claim that Elpi is as fast as ML, or that it will be. Hence my suggestion to use ML if one absolutely needs speed, at least as things stand today.

view this post on Zulip Enrico Tassi (Jul 11 2020 at 14:03):

About the cost of type checking being dominant on anything else, It may be true and unavoidable in some cases (eg reflexive tactics), but in other ones it's just the lack of control that you have that make it expensive. For example it's very hard in Ltac1 (even with the "new" uconstr data type) to control when (read how many times) your terms are type checked. In ML it is simple, and a common mistake it actually to forget type checking your terms ;-) For example, last time I checked, the odd order theorem spends 1/3 time in type checking finished proof terms (Qed time) and 2/3 building the terms via tactics (written in ML). Given these numbers, I'm pretty sure that the ssr tactics don't type check terms too many times.


Last updated: Feb 06 2023 at 07:03 UTC