Stream: Coq Platform devs & users

Topic: CoqHammer


view this post on Zulip Karl Palmskog (Jun 04 2020 at 09:20):

what's the general view on including CoqHammer in the Platform for 8.12 (or even 8.13)? Right now the author is focusing heavily on his 8.11 branch, so quite a lot of forward-porting would be required for 8.12. One issue might also be that CoqHammer itself is not very useful without also having the ATPs installed. On the other hand, CoqHammer is in my view undisputed in general automation for Coq (see IJCAR paper from this year)

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 09:22):

I would leave Coqhammer to a 8.12 platform delivery, which I plan to do reasonably soon after the releases (beta, .0)

view this post on Zulip Karl Palmskog (Jun 04 2020 at 09:22):

Michael Soegtrop said:

I would leave Coqhammer to a 8.12 platform delivery, which I plan to do reasonably soon after the releases (beta, .0)

I see, but do you intend to pick/recommend ATPs then also?

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 09:23):

This leaves the author time to adjust to 8.12 and to make a proper release for 8.12 - the main idea behind separating core Coq and platform releases. Right now we should concentrate on fixing what has always been delivers with Coq.

view this post on Zulip Théo Zimmermann (Jun 04 2020 at 09:23):

I see, but do you intend to pick/recommend ATPs then also?

This would have to be discussed with the author.

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 09:24):

What does ATPs mean?

view this post on Zulip Karl Palmskog (Jun 04 2020 at 09:24):

OK, so "what has always been delivered" is essentially the Windows release? And then anything later extra is the Platform?

view this post on Zulip Karl Palmskog (Jun 04 2020 at 09:25):

Michael Soegtrop said:

What does ATPs mean?

Automated Theorem Provers. In CoqHammer's case, it's a special version of Z3, E prover, Vampire, CVC4

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 09:26):

@Karl Palmskog : yes, I would include at least some of the ATPs in the delivery.

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 09:26):

Z3 I know how to build - the others I would have to look into.

view this post on Zulip Karl Palmskog (Jun 04 2020 at 09:27):

@Michael Soegtrop but it's not the standard Z3, it's the TPTP backend which has a special build standalone build script

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 09:28):

Michael Soegtrop but it's not the standard Z3, it's the TPTP backend which has a special build standalone build script

Do you have experience with building the prerequisites of CoqHammer?

view this post on Zulip Karl Palmskog (Jun 04 2020 at 09:28):

TPTP is the language used by nearly all first-order logic automated solvers, and it's quite different from SMT2 used by standard Z3

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 09:29):

I see - anyway which ATPs would you consider to be essential for CoqHammer?

view this post on Zulip Karl Palmskog (Jun 04 2020 at 09:29):

Do you have experience with building the prerequisites of CoqHammer?

Well, I've done it a couple of times, but only for Linux

view this post on Zulip Karl Palmskog (Jun 04 2020 at 09:31):

Michael Soegtrop said:

which ATPs would you consider to be essential for CoqHammer?

I would say at least Z3 and Vampire. But basically what happens is all solvers are run concurrently, so it's hard to say which ones are most effective.

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 09:36):

@Karl Palmskog : let us look into this together. Linux and OSX builds are typically not that different and the differences to cygwin I have lots of experience with. Of cause it depends on how cross build save the build scripts are - usually it is no problem if you specify the correct compiler upfront.

view this post on Zulip Karl Palmskog (Jun 04 2020 at 09:37):

@Michael Soegtrop sure. I would probably start with Z3 then, since they seem to have really good cross-platform support there in general.

view this post on Zulip Karl Palmskog (Jun 04 2020 at 09:41):

ah, from Czajka's IJCAR paper, it seems the new built-in automation (without any ATP) in CoqHammer (the so-called sauto/scrush tactics) also perform quite well on general formulas

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 09:42):

@Karl Palmskog : how are the numeric capabilities, say integer (a & 15)<16 and the like?

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 09:44):

And another question: do the tools CoqHammer use produce some sort of witness or does it assume via an axiom that the SMT solver is correct? I didn't follow the development for a while ...

view this post on Zulip Théo Zimmermann (Jun 04 2020 at 09:44):

They do produce full proofs.

view this post on Zulip Théo Zimmermann (Jun 04 2020 at 09:44):

The proofs are found by the SMT first, and reconstructed in Coq.

view this post on Zulip Théo Zimmermann (Jun 04 2020 at 09:44):

This is the main innovation behind SledgeHammer (for Isabelle).

view this post on Zulip Karl Palmskog (Jun 04 2020 at 09:44):

Michael Soegtrop said:

Karl Palmskog : how are the numeric capabilities, say integer (a & 15)<16 and the like?

I expect sauto/scrush to be bad at integer reasoning. For CoqHammer itself, it depends a lot on availability of the right lemmas. The system uses machine learning to find the right lemmas to use as axioms in ATPs. If you happen to get the right lemmas selected, it will work out. But to my knowledge there is no evaluation specifically on integer lemmas.

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 09:46):

@Théo Zimmermann @Karl Palmskog I see, thanks. Can I upfront give it lemmas I know would work?

view this post on Zulip Karl Palmskog (Jun 04 2020 at 09:46):

Théo Zimmermann said:

The proofs are found by the SMT first, and reconstructed in Coq.

Nitpick: SMT solving like in Z3 and FOL solving using superposition like in Vampire are very different solutions to similar problems.

view this post on Zulip Théo Zimmermann (Jun 04 2020 at 09:47):

You mean I should say ATP to be more generic?

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 09:47):

I wonder how the interface between Z3's good numerical capabilities and say the lemmas of CompCert's integer arithmetic looks and if this exists at all.

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 09:48):

But I guess it is definitely due time for me to explore this.

view this post on Zulip Karl Palmskog (Jun 04 2020 at 09:49):

Théo Zimmermann said:

You mean I should say ATP to be more generic?

Yes, I'm saying ATP <> SMT, more like SMT \in ATP

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 09:50):

I currently do some things in plain Z3 actually (using the Python front end from MS) - it would be very convenient for me to merge this with Coq.

view this post on Zulip Karl Palmskog (Jun 04 2020 at 09:50):

Michael Soegtrop said:

Can I upfront give it lemmas I know would work?

To my knowledge, there is right now no easy way to provide this via the CoqHammer plugin, but it calls an external predictor program which can be modified (has a known input/output format)

But the Coq-residing sauto/scrush tactics are carefully constructed so one can give them lemmas as hints (they are the ones which reconstruct ATP proofs inside Coq, given the knowledge of what lemmas the ATP used for its sucessful proof)

Finally, you can query the machine learning system from inside Coq on what lemmas will be provided to the ATPs as axioms.

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 09:52):

To my knowledge, there is right now no easy way to provide this via the CoqHammer plugin, but it calls an external predictor program which can be modified (has a known input/output format)
````
I see. I guess I would have to do this if I would like e.g. to reason about CompCert's inetgers (which are based on Z but the word size limit leads to a complete new set of lemmas).

view this post on Zulip Karl Palmskog (Jun 04 2020 at 09:54):

By the way, p. 13 of the paper has some CoqHammer benchmarks on CompCert, if that is of interest

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 09:55):

@Karl Palmskog OK, I will look into this and start with Z3. Can you point me to the docs for building the specific variant? Or if you already have your own "setup CoqHammer with all what it needs" script, it would be a good start.

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 09:58):

By the way, p. 13 of the paper has some CoqHammer benchmarks on CompCert, if that is of interest

Thanks, that is indeed useful - it gives at least some idea what to expect.

view this post on Zulip Karl Palmskog (Jun 04 2020 at 10:08):

Michael Soegtrop said:

Can you point me to the docs for building the specific variant? Or if you already have your own "setup CoqHammer with all what it needs" script, it would be a good start.

I just pick a revision of the Z3 GitHub repo corresponding to a release, e.g., this one for this release and do in the root:

python scripts/mk_make.py

then if that's successful:

cd build
make examples

This produces an executable named z3_tptp which I renamed to z3_tptp_bin. I then created a small script z3_tptp:

#!/bin/bash
LD_LIBRARY_PATH=/opt/z3/lib z3_tptp_bin "$@"

where /opt/z3 is where I put the contents of the Linux zip of the corresponding release, e.g., https://github.com/Z3Prover/z3/releases/download/z3-4.8.8/z3-4.8.8-x64-ubuntu-16.04.zip

The very sparse documentation on the TPTP frontend is in examples/tptp/README, i.e., here, in the Z3 repo.

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 10:14):

@Karl Palmskog : thanks this should get me started. Shall scripts as executables usually don't work on Windows, though, but it shouldn't kill me to write a C program corresponding to your shell script.

view this post on Zulip Karl Palmskog (Jun 04 2020 at 10:16):

the error one gets if z3_tptp is run without knowing the location of z3 is as follows:

z3_tptp_bin: error while loading shared libraries: libz3.so: cannot open shared object file: No such file or directory

So maybe it's as simple as putting libz3 in the proper directory on the respective platform. I didn't want to install z3 globally.

view this post on Zulip Karl Palmskog (Jun 04 2020 at 10:20):

I think the E prover was the simplest to build/install, it is known to work on Linux and macOS and Windows/Cygwin: https://wwwlehre.dhbw-stuttgart.de/~sschulz/E/Download.html

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 10:22):

So maybe it's as simple as putting libz3 in the proper directory on the respective platform. I didn't want to install z3 globally.

On Windows this would definitely be the approach to go for.

view this post on Zulip Karl Palmskog (Jun 04 2020 at 10:23):

For Vampire, I just downloaded their Linux binary (https://vprover.github.io/download.html), I have no idea about building from source there: https://github.com/vprover/vampire

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 10:25):

Let me start with Z3 and then we see what else we can do.

view this post on Zulip Karl Palmskog (Jun 04 2020 at 10:26):

Michael Soegtrop said:

On Windows this would definitely be the approach to go for.

It looks like "make examples" even produces the required libz3 file (libz3.so on Linux), so I guess there is no need to bother with the pre-built archives.

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 10:28):

It looks like "make examples" even produces the required libz3 file (libz3.so on Linux), so I guess there is no need to bother with the pre-built archives.

The last half sentence I didn't understand.

view this post on Zulip Karl Palmskog (Jun 04 2020 at 10:38):

I just meant that what I did personally was unpack a standalone pre-compiled z3 archive like https://github.com/Z3Prover/z3/releases/download/z3-4.8.8/z3-4.8.8-x64-ubuntu-16.04.zip on my system. This archive contains libz3.so. However, doing make examples already produces libz3.so, so no need to bother with the precompiled released binaries, as long as one puts the self-compiled libz3.so in the right place.

view this post on Zulip Michael Soegtrop (Jun 04 2020 at 10:39):

I see. We anyway don't accept pre built binaries for Coq releases except MinGW libraries comming from cygwin.

view this post on Zulip Karl Palmskog (Sep 24 2021 at 15:12):

@Michael Soegtrop recall the discussion about Z3 in this topic. Unfortunately, we need a very special executable from Z3 for CoqHammer that does not get built by default

view this post on Zulip Michael Soegtrop (Sep 24 2021 at 16:29):

@Karl Palmskog : ah ok - I am getting old. Interestingly the hammer tactic (also the one for which I assume it uses external provers - there are two folders in the tutorial folder and both work). So how do I check if I did things correctly?

How about E prover? I just created an opam package for E.

view this post on Zulip Michael Soegtrop (Sep 24 2021 at 16:30):

Does E prover also require special build settings?

view this post on Zulip Michael Soegtrop (Sep 24 2021 at 16:37):

Trying to catch up on this ... so is it so that I need to build this (https://github.com/Z3Prover/z3/tree/master/examples/tptp) on top of the z3 library? That means I could use the standard z3 ocaml package (which installs the shared library) and then just build these few files on top?

view this post on Zulip Karl Palmskog (Sep 24 2021 at 18:03):

E requires no special settings in compilation, to my knowledge

view this post on Zulip Karl Palmskog (Sep 24 2021 at 18:04):

Michael Soegtrop said:

Trying to catch up on this ... so is it so that I need to build this (https://github.com/Z3Prover/z3/tree/master/examples/tptp) on top of the z3 library? That means I could use the standard z3 ocaml package (which installs the shared library) and then just build these few files on top?

This is my understanding. That is, one needs the basic z3 shared library available, and then one needs to compile a tptp executable on top

view this post on Zulip Michael Soegtrop (Sep 26 2021 at 13:06):

@Karl Palmskog , @Théo Zimmermann , @Enrico Tassi : Coq Hammer currently won't work on Windows and it doesn't look easy to fix. There are some easy to fix issues (path quoting in make, redirection to /dev/null in the plugin), but the main issue is that it makes use of Unix style forks in many places and there is no way this will work on Windows. As far as I can tell from a quick look these forks are not required, but it will still be work to fix it.

So I am against including it in the platform. Another issue is that the ATPs need some work to install on all platforms, although for eprover it was trivial to create a platform agnostic opam file, I guess z3 tptp will also be easy, but I didn't try as yet. I think we should help the author by handling the packaging of the provers (this way already making it easier to install via opam on macOS and Linux) and the Windows testing, but not include it until this is finished.

What is your opionion on this?

view this post on Zulip Karl Palmskog (Sep 26 2021 at 13:08):

@Michael Soegtrop as I said in the issue, the Hammer tactics (coq-hammer-tactics) are just Coq+OCaml. Including that package in the platform would still be a huge benefit, since Hammer users can rely on that their users (consumers of their Coq projects) have reconstruction tactics available in the platform - the hammer tactic calls are not supposed to be persisted in version controlled Coq sources anyway.

view this post on Zulip Karl Palmskog (Sep 26 2021 at 13:08):

basically, the firstorder tactic in Coq is nearly unmaintained, and sauto (included in coq-hammer-tactics) provides an alternative

view this post on Zulip Karl Palmskog (Sep 26 2021 at 13:10):

some of the Windows gotchas that were pointed out in this issue can probably be solved by simple means, such as installing via Dune to get the correct paths on Windows. But fine by me if this will be a more long-term goal to fix Windows for the coq-hammer package, i.e., we postpone including that package in the platform.

view this post on Zulip Karl Palmskog (Sep 26 2021 at 13:31):

I really want to see the whole hammer in the platform though, so I may go so far as throwing (donating) some resources at the problems for Windows

view this post on Zulip Michael Soegtrop (Sep 26 2021 at 13:37):

I have no objections against including coq-hammer-tactics.

Indeed it would be nice to have this in Coq Platform (and sorry for the delays - we just had other things to fix first). I think for 8.14 we should have enough time if we work together on fixing the Windows issues. For the 2021.09 8.13 extended pick, it is too late though.

We also need to work on Opam packages for the provers. As I said, I have eprover and I think I can do z3 TPTP easily.

view this post on Zulip Bas Spitters (Sep 27 2021 at 18:10):

Hijacking this thread. This paper contains a list of examples that they claim should be provable by coqhammer, but aren't currently.

view this post on Zulip Karl Palmskog (Sep 27 2021 at 18:23):

one related issue is here, and the bottom line seems to be that the sauto tactic is designed for backwards proof, while the goal they were interested in required forwards proof: https://github.com/lukaszcz/coqhammer/issues/77

so not sure this could be solved in a piecemeal way.

view this post on Zulip Karl Palmskog (Sep 27 2021 at 18:25):

the chance of getting 20M USD for Coq proof automation improvements seems slim.

view this post on Zulip Michael Soegtrop (Sep 27 2021 at 20:27):

@Bas Spitters : what prover backends did you use?

view this post on Zulip Karl Palmskog (Sep 27 2021 at 20:34):

@Michael Soegtrop the paper is by members of Chlipala's group at MIT. Seemingly, they tried to use both CoqHammer and SMTCoq, among other automation approaches. The CoqHammer issue just above is by one of the co-authors of the paper, and there he reported trouble with the reconstruction tactics.

view this post on Zulip Karl Palmskog (Sep 27 2021 at 20:35):

We experienced performance bottlenecks and inexplicable tactic failures connected both to logical theories with associated standard Coq tactics (e.g., linear arithmetic) and those without (e.g., bitvectors, finite maps, sequences). Tools like SMT-solver bridges to Coq [CoqHammer, SMTCoq] crashed or ran too slowly on too many of our goals (even months after reporting issues to their authors) to be viable.

view this post on Zulip Michael Soegtrop (Sep 27 2021 at 20:43):

Well isn't it always like this with SMT solvers? I know a few people doing HW verification with SMT solvers and they say the most difficult thing with using SMT solvers is knowing upfront how to design your tests so the the SMT solvers can do it. I am not sure this is a specific issue of CoqHammer. But sorry I didn't read the papers as yet, will do so.

view this post on Zulip Karl Palmskog (Sep 27 2021 at 20:46):

sure, but the reconstruction tactics are just a piece of a larger puzzle. Probably every level of tools like CoqHammer and SMTCoq need a lot of additional engineering effort to work for large scale stuff like they investigate in the paper. SMTCoq is to my knowledge not part of Coq's CI yet, so I worry about that it may not even be able to join the platform, here is where the discussion ended last time about packaging and CI and so on: https://github.com/coq-community/manifesto/issues/91

view this post on Zulip Karl Palmskog (Sep 27 2021 at 20:50):

if you want to believe Jasmin Blanchette, the right way to apply SMT-like automation may be to actually extend approaches like superposition solving (Vampire) to higher-level logics: https://matryoshka-project.github.io

view this post on Zulip Bas Spitters (Sep 28 2021 at 07:30):

It's not my paper. It's from the bedrock team. I happened to be
reading it, and they have a number of concrete suggestions.

view this post on Zulip Karl Palmskog (Sep 28 2021 at 07:40):

well, I read the general conclusion as: "we need ever more impressive demos to impress funders, and impressing funders is the only way to get more impressive demos, since we need to put serious engineering resources into a large number of bottlenecks"

view this post on Zulip Enrico Tassi (Sep 28 2021 at 08:24):

Tools like SMT-solver bridges to Coq [11, 15] crashed or
ran too slowly on too many of our goals (even months
after reporting issues to their authors) to be viable.

Politeness first...


Last updated: Jan 30 2023 at 11:03 UTC