Interesting paper on automatic (Lean) formalization of mathematics based on natural language processing.
I imagine this would be somewhat harder for Coq, as its libraries (math-comp, corn, ...) have not been optimized for human readability.
For those unfamiliar with this new trend, let me quickly summarize some of the recent work. Last May, Google Research, along with researchers at Cambridge working on AI for Isabelle, showed that one can use a large language model to translate theorem statements from informal to formal mathematics.
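As a concrete illustration of the kind of translation involved (my own toy example, not one from the paper; the theorem name and exact mathlib spelling are illustrative):

```lean
-- Informal: "For every natural number n, the sum of the first n odd numbers is n²."
-- One possible Lean 3 / mathlib formalization of the statement:
theorem sum_first_odds (n : ℕ) :
  (finset.range n).sum (λ i, 2 * i + 1) = n ^ 2 :=
sorry  -- only the statement is being autoformalized, not the proof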
Unlike a lot of machine learning work, this doesn't require training your own models. All one needs is (1) API access to OpenAI's Codex model (specifically the version code-davinci-002), which is what also powers GitHub Copilot, and (2) about four or so examples of the sort of translation you are trying to do. One then feeds those examples to Codex along with the theorem statement you are trying to translate. The authors also showed that while the translation is perfectly correct only about 25% of the time, it is good enough to serve as a source of extra training data for neural theorem-proving models trained with reinforcement learning.
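A minimal sketch of what this few-shot prompting looks like in code. The instruction wording and example pairs here are made up for illustration (the real ones are in the papers' prompts), and the Codex completions endpoint has since been deprecated, so the commented-out API call is historical:

```python
# Few-shot prompt construction for informal-to-formal translation.
# The instruction and example pairs below are illustrative, not from the paper.

EXAMPLES = [
    ("For any natural number n, n + 0 = n.",
     "theorem add_zero (n : ℕ) : n + 0 = n"),
    ("The sum of two even integers is even.",
     "theorem even_add (a b : ℤ) (ha : even a) (hb : even b) : even (a + b)"),
]

def build_prompt(statement: str) -> str:
    """Assemble the few-shot prompt: instruction, worked examples, then the query."""
    parts = ["Translate the natural language statement into a Lean theorem statement.\n"]
    for informal, formal in EXAMPLES:
        parts.append(f"Natural language: {informal}\nLean: {formal}\n")
    parts.append(f"Natural language: {statement}\nLean:")
    return "\n".join(parts)

prompt = build_prompt("Every prime number greater than 2 is odd.")
print(prompt)

# Historically one would then send this to the (now-deprecated) Codex
# completions endpoint, roughly:
#   openai.Completion.create(model="code-davinci-002", prompt=prompt,
#                            temperature=0, max_tokens=150, stop="\n\n")
```

The model completes the text after the final "Lean:", mimicking the pattern set by the examples.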
While that work was in Isabelle, it is quite clear that it could extend to any other ITP with at least a little bit of code on GitHub. A few Lean folks at CMU quickly picked up the idea, replicated it for Lean (which is easy), and turned it into a chatbot: not only can you ask it to translate a theorem statement, you can also have a dialog with it about how to fix the translation. They packaged this as a VS Code plugin called LeanChat that anyone can use and play with.
Autoformalization with LLMs in Lean... for everyone! The chat interface for autoformalizing theorem statements in Lean built by myself and @ewayers is now publicly available as a vs-code extension. https://marketplace.visualstudio.com/items?itemName=hoskinson-ml.lean-chat-vscode — Zhangir Azerbayev (@zhangir_azerbay)
Autoformalization with LLMs in Lean! @zhangir_azerbay and Edward Ayers built a chat interface to formalize natural language mathematics in Lean: http://github.com/zhangir-azerbayev/lean-chat Very impressive work! — Yuhuai (Tony) Wu (@Yuhu_ai_)
While the plugin is specific to Lean, if you have API access to Codex, it would be easy to switch to Coq (or even a particular Coq library like math-comp). The model is very sensitive to the examples and instructions you give it, together called a "prompt", and those examples and instructions can be found in prompting.lean.
After this work, the next natural question was "what about proofs?" I know of at least three very recent papers which address this. All of them are interesting and exciting work on using Codex (again, out of the box) to translate proofs into Isabelle and Lean. I find the Draft, Sketch, and Prove paper particularly exciting. It gives a method for not only autoformalization, but full theorem proving. The idea is as follows: have the language model draft an informal proof, translate that draft into a formal proof sketch with the hard steps left as gaps, and then close the gaps with off-the-shelf automated provers (Sledgehammer, in the Isabelle setting).
While I think their success rate of 43% on the miniF2F benchmark might be helped by the fact that these problems are often found on the internet (in natural language form), I still find this an interesting and compelling line of research.
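The Draft, Sketch, and Prove loop described above can be sketched schematically. The function bodies below are stubs standing in for LLM calls and an automated prover, so this is only the shape of the pipeline under my reading of the paper, not a working system:

```python
def draft(informal_statement: str) -> str:
    """Stub: ask an LLM for an informal, natural-language proof draft."""
    return f"Informal proof draft for: {informal_statement}"

def sketch(informal_proof: str) -> list[str]:
    """Stub: ask an LLM to map the draft onto a formal proof sketch,
    i.e. a list of intermediate goals with the hard steps left open."""
    return ["intermediate goal 1", "intermediate goal 2"]

def close_gap(goal: str) -> bool:
    """Stub: call an off-the-shelf automated prover (e.g. Sledgehammer
    in the Isabelle setting) on one open gap."""
    return True

def draft_sketch_prove(informal_statement: str) -> bool:
    """The full theorem is proved iff every gap in the sketch is closed."""
    proof_draft = draft(informal_statement)
    open_goals = sketch(proof_draft)
    return all(close_gap(goal) for goal in open_goals)

print(draft_sketch_prove("The sum of two odd numbers is even."))  # True with these stubs
```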
Returning to Coq, I don't see any particular reason this can't be done there. I do think the diversity of Coq libraries would present a challenge, but at the same time I think it could be overcome with good prompting. Also, as I said, since this doesn't require actually training models, all one needs is a Codex API key, some knowledge of "prompt engineering", and a desire to play with this.
(While there is no perfect spot on the internet to discuss this type of research, a lot of discussion goes on in the "Machine Learning for Theorem Proving" stream of the Lean Zulip.)
I think most practitioners would agree that the formalization of structures and theorem statements is what is really interesting and useful. But since the model will not be able to see the difference between notations and other definitions in different Coq libraries, it most likely can't target MathComp specifically (or other libraries like Stdpp).
(Which means the generated code is not likely to be accepted by Coq out of the box.)
On the other hand, last I checked, Coq has by far the most code on GitHub of any proof assistant (millions and millions of lines).
I do agree this type of autoformalization is a bit different from the formalization mostly done in practice. I think it is still mostly a toy so far. But it is getting constantly better; half of that is better language models, and the other half is creative use of the current ones.
But since the model will not be able to see the difference between notations and other definitions in different Coq libraries, it most likely can't target MathComp specifically (or other libraries like Stdpp).
You could be right, but at the same time, I wouldn't discount the power of prompt engineering. If it is an established library like MathComp, you could add "translate to Coq MathComp" to the prompt and put in MathComp-specific examples. But yes, it seems one of the main problems with these models is that they don't know the proper translation of a math term into the target language. The more esoteric the library, the more likely this is.
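To make that concrete, here is a sketch of such a MathComp-targeted prompt. The instruction wording and the example pair are invented for illustration (though addn0 is a real MathComp lemma name):

```python
# Illustrative MathComp-targeted prompt; the instruction and example are made up.
MATHCOMP_EXAMPLES = [
    ("For any natural number n, n + 0 = n.",
     "Lemma addn0 (n : nat) : n + 0 = n."),
]

def build_mathcomp_prompt(statement: str) -> str:
    """Like a generic Coq prompt, but the instruction and examples
    steer the model toward MathComp vocabulary and notation."""
    parts = ["Translate the statement to Coq using the MathComp library.\n"]
    for informal, formal in MATHCOMP_EXAMPLES:
        parts.append(f"Natural language: {informal}\nCoq (MathComp): {formal}\n")
    parts.append(f"Natural language: {statement}\nCoq (MathComp):")
    return "\n".join(parts)

print(build_mathcomp_prompt("Multiplication of natural numbers is commutative."))
```

The hope is that library-specific examples nudge the model toward MathComp's lemma-naming conventions rather than vanilla Coq or another library's idioms.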
Last updated: Dec 01 2023 at 06:01 UTC