Stream: Machine learning and automation

Topic: ideal API for driving backtracking proof-script search


view this post on Zulip Abhishek Anand (May 10 2024 at 02:44):

I have started working on a new project trying to write an LLM-augmented solver for iris-separation-logic proofs about C++ programs (converted to Coq by cpp2v). The solver would produce a Coq proof script, including helper lemmas and maybe even tactics (ltac1/2). This solver will do a backtracking proof search, similar to COPRA, but it would be different from COPRA in that instead of all proof steps coming from the LLM, most proof steps in the proof script will be produced algorithmically and LLMs would only be consulted as a last resort, e.g. loop invariants (similar in spirit to the lemur work presented at math-ai 2023 and also similar to alpha-geometry).

AFAIK, in COPRA, the main backtracking proof search engine is written in python and it communicates to coq using serapi, following proofbot. However, it seems serapi is now deprecated in favor of coq-lsp. I took a brief look at coq-lsp, which Coq-copilot also uses. It seems that a major difference between serapi and coq-lsp for the purpose of backtracking is that targets of backtracking (points to revert back to) are identified as sentence ids in serapi and by positions in coq-lsp. (My guess in the coq-lsp docs, positions mean character offsets in the .v file.) The coq-lsp approach seems a bit more problematic to me for the purpose of implementing algorithmic backtracking: what if a user (in vscoq) edits a comment before the backtracking target position, such that the length of the comment increases? would the driver need to increment the positions in the tree recording backtracking position targets? My guess is that in serapi, the sentence ids would not change in this case. It would be nice if coq-lsp provided some sentence-level ability to revert/backtrack.

One advantage of coq-lsp is that it is integrated into vscoq so that the user can interact with the prover engine, e.g. if the proof search fails (may take an hour though), the user can tweak things and/or add assertions manually. Has anybody implemented some kind of a backtracking proof-script search in coq-lsp? CC @Emilio Jesús Gallego Arias

view this post on Zulip Emilio Jesús Gallego Arias (May 10 2024 at 07:16):

Hi @Abhishek Anand , indeed there has been quite a bit of work on that area based on coq-lsp's backend. I do agree that the experience (when it works XD) is pretty positive as things tend to integrate smoothly with vscode; for the purposes of this message let me remind a couple of names:

Indeed, Flèche uses document positions (file/column or offset) as the _ground_ truth the user and it have to agree in order to communicate. This is motivated by several needs of Flèche (like literate documents) and the fact that most often, documents sent to Flèche won't even parse (so hard to refer to the Ast of something you couldn't parse)

Moreover, generating these kind of "sentence ids" is actually not so easy.

My guess is that in serapi, the sentence ids would not change in this case. It would be nice if coq-lsp provided some sentence-level ability to revert/backtrack.

That's unfortunately not true, in SerAPI any edit will invalidate all the parts of the document after it.

Actually coq-lsp will correctly not re-run things in your example, so in this case, your tool would be able to observe that:

When documents can be edited by the user async, things are pretty complex, but coq-lsp provides a very nice advantage (IMHO) for users like you, and is that you can integrate your application with the incremental part of Flèche.

That is to say, if after a document edit the state you see is the same, coq-lsp won't call your tool (or in OCaml 5 hopefully it will let it continue to run)

But going back to your point about proof search, I fully agree: a document-based semantics is not the right framework for that, so we have been working on a new system targeted to RL and Coq. The system reuses quite a few parts of Flèche, but it is crutially based on proof state (so in this case the ids are not related to sentences, but to proof states)

view this post on Zulip Emilio Jesús Gallego Arias (May 10 2024 at 07:17):

The new system is called petanque and I hope to have a first preview release this week.

view this post on Zulip Emilio Jesús Gallego Arias (May 10 2024 at 07:17):

Basically in petanque you have run_tac : State.t -> string -> State.t Run_result.t

view this post on Zulip Emilio Jesús Gallego Arias (May 10 2024 at 07:18):

Actually there are deep insights on how proof and documents interact, but that's for a different project.

view this post on Zulip Abhishek Anand (May 10 2024 at 14:29):

Thanks for the very informative responses.
Is it possible go back and forth between positions and goal/state ids in coq-lsp, e.g. functions to convert between the two? If not, would it be easy to add that support?

view this post on Zulip Emilio Jesús Gallego Arias (May 10 2024 at 14:35):

I think converting between them doesn't make sense, but I'm open to consider possible schemes.

For example, imagine the tactic idtac.:

You can however use this kind of workflow:

Note that a good property of Flèche is that for the last step, Flèche will realize (if we enable it) that the selected tactic was already executed.

CoqPilot folks have been waiting for this funtionality to be deployed, it took a while due to silly unrelated problems, but I hope to get it integrated soon.

view this post on Zulip Abhishek Anand (May 10 2024 at 14:52):

Emilio Jesús Gallego Arias said:

You can however use this kind of workflow:

In my case, the proof script search is expected to take a long time (minutes to hours). I think it would be good to give the user some visibility into what is going on during that time. I would like to actively show the user what the proof search is doing at a time: e.g. show the "current" proof script in the file and the goals in the goal window. (there would probably be some button/keybinding to clone the current goal window so that the user can inspect it even when the proof search has moved on to another goal.) So inserting at the end of proof search would not be ideal for me.

For example, imagine the tactic idtac.:

Can we have an API where coq-lsp returns a list of pairs of type GoalID* position for each unique goal in the document?

view this post on Zulip Emilio Jesús Gallego Arias (May 10 2024 at 14:57):

In my case, the proof script search is expected to take a long time (minutes to hours). I think it would be good to give the user some visibility into what is going on during that time.

Indeed, that would be great, but I'm not sure I see the model work unless you pretend the document is read-only? What do you do if the user changes some of the text that you started your job from?

Can we have an API where coq-lsp returns a list of pairs of type GoalID* position for each unique goal in the document?

We could, tho there are several definitions for goal, and indeed it would be expensive. In fact, we almost have it if you do:

You can use coq-lsp plugins to do similar stuff, for example, with some colleagues we wrote a baseline tool that reliably tries to re-do all proofs on a document with user-provided methods. That took just a few lines of OCaml + Flèche plugin API

view this post on Zulip Emilio Jesús Gallego Arias (May 10 2024 at 14:57):

But indeed note that stuff like GoalID is not very relevant to LSP

view this post on Zulip Emilio Jesús Gallego Arias (May 10 2024 at 14:57):

there are many goalIDs without a position in the document

view this post on Zulip Abhishek Anand (May 10 2024 at 15:13):

Emilio Jesús Gallego Arias said:

Indeed, that would be great, but I'm not sure I see the model work unless you pretend the document is read-only? What do you do if the user changes some of the text that you started your job from?

For now, I can ask the user to not change the file during proof search. But ideally, I should relax that in the same way vscoq/coqide already do: e.g. editing a Qed-marked proof above should be fine, as should be editing a comment.

view this post on Zulip Abhishek Anand (May 10 2024 at 15:17):

Emilio Jesús Gallego Arias said:

You can use coq-lsp plugins to do similar stuff, for example, with some colleagues we wrote a baseline tool that reliably tries to re-do all proofs on a document with user-provided methods. That took just a few lines of OCaml + Flèche plugin API

Thanks. Would you be able to share the code of baseline. Looking at it may helpful for me.

view this post on Zulip Abhishek Anand (May 10 2024 at 15:22):

Emilio Jesús Gallego Arias said:

there are many goalIDs without a position in the document

Can you give an example? Is it the case that if I have define Ltac apply2 := apply foo; apply bar and my proof script has apply2, there is a goal id for the goal between apply foo and apply bar? if so then yes I see that that goal does not correspond to a position.

It may still be helpful to have a coq-lsp API function to return a list of type GoalID*position, returning all the unique goals that actually correspond to a position in the file.

view this post on Zulip Emilio Jesús Gallego Arias (May 10 2024 at 20:27):

Abhishek Anand said:

Thanks. Would you be able to share the code of baseline. Looking at it may helpful for me.

Yes, we'd like to share it pretty soon I think. Note that this plugin operates using the compiler, so the .v file is fixed. Indeed, when a file is updated is similar to a git pull, so indeed it is up to you what to do. For now, coq-lsp / Flèche don't keep old versions of the document around (memoization works at a lower-level)

You can check this plugin as an example of how a Flèche plugin looks like; this plugin will trace all the goals at the sentence level.

I agree that such an API would be useful, the API is there already in OCaml, so it is a matter of providing a protocol binding, also it would be easy to be able to write this in Python directly (using PyML) but needs someone doing the example and the python init part.


Last updated: Oct 13 2024 at 01:02 UTC