@Lasse as someone working in AI, do you see any obvious reason for why a machine learning dataset based on MathComp and related projects would be "deeply flawed" and of "less value" to an AI researcher than other comparable ITP ML datasets? I am asking because this was claimed recently here, specificallly [with my emphasis]:

However, AI works best with a database — and those databases are not yet there. Those that we have are

deeply flawedfor a number of reasons. For example, one mathematical database we have is a formalised contains a list of many many theorems about the smallest finite solvable group of odd order [...] The Gonthier et al work on the Feit–Thompson theorem is extremely important, because it is unquestionable evidence that the systems are ready to handle a 400 page proof, at least when when it is a proof which only involves low-level objects like finite groups. But to an AI researcher it seems to me thatthis database has rather less value.

From reading the entire paragraph, my understanding is just that Kevin says that this doesn't represent "modern mathematics". Note that his understanding of what is in math-comp might be limited to what leads to the Feit-Thompson theorem.

Also, he is specifically talking about computers proving theorems, whereas AI / ML can be useful in the field of formalized mathematics for many more purposes, like your work on formatting / naming shows.

And I guess what he means really is computers proving *new* theorems, which is a whole different level.

I'm not sure how this reading can be reconciled with the in my opinion very strong wording "deeply flawed" and "less value"

anyway, I'm going to comment on the blog post, but I want to hear the AI researcher view, since I don't do regular AI

would also be interested if MathComp developers agree with the contention that the odd-order project (transitively) only involves "low-level objects like finite groups". Are results about more sophisticated objects really intrinsically harder to do in the MathComp style of proof?

The post seems to ignore that the whole point of the OOthm was to develop a good library (Mathematical Components) which talks about real objects, not non-existing ones (as 1/2 of the first of the two books that compose the proof of the OOthm does).

@Enrico Tassi so basically, one can comfortably say that the whole MC library itself and a portion (small?) of the odd-order Coq project talk about existing objects?

The proof is made of two books BG* and PF* (if you look at the sources). IIRC the nonsense starts midway in BG (section 10 maybe). I did not work on PF, I don't know if it works under a false assumption.

What is blatantly incorrect is the claim that 20 researchers worked for six years to produce lemmas about things that do not exist.

The main effort in terms of design and engineering (and the largest part of the text output) is about MC, which does not assume a non existing group. The output was split into MC + odd-order exactly because the MC part was of general interest and the true value generated by the project.

Let me find a picture...

http://videos.rennes.inria.fr/Conference-ITP/slidesMathematicalComponents.pdf , slide 6. The last two books on the right are the odd order ones. The rest is not 100% formalized, but well... it gives you an idea

for the record, here are the relevant stats for MathComp 1.9.0 in our corpus:

- math-comp: 89 files, 84713 LOC, 8802 lemmas
- odd-order: 34 files, 36125 LOC, 367 lemmas

1/2 of BG is still about groups that exist, but it is very very technical, so we did not put it inside mathcomp/solvable...

Yes, I did recall numbers like these ( some measures are also in the slides I just linked, page 11 )

Anyway, since when the author of the post is knowledgeable in AI ?

Maybe training AI on nonsensical derivations is a good way to make the AI very complementary to human intuition. I was myself completely lost when the context became absurd, I would have loved some AI there ;-)

unfortunately, there are in my estimation very few people who are experts in all of: pure math, interactive theorem proving, AI/machine learning. Cezary Kaliszyk and Josef Urban come to mind, but I don't know precisely their pure math bona fides.

Enrico Tassi said:

Maybe training AI on nonsensical derivations is a good way to make the AI very complementary to human intuition. I was myself completely lost when the context became absurd, I would have loved some AI there ;-)

right, as Théo pointed out above, other forms of AI support than push-button proof automation (and the more mundane stuff on naming/formatting I am doing) are definitely possible and will hopefully be explored.

ah, Kevin did respond to my comment now (thanks, if you are reading this): https://xenaproject.wordpress.com/2021/01/21/formalising-mathematics-an-introduction/#comment-1923

He did invite me to respond. @Enrico Tassi can I use your comments above in the response to set the record more straight on the development of MathComp?

Sure, but I did write a short reply myself.

ah, it didn't show up yet, then I'll wait a bit

Are you saying that replies are moderated?

yes, my comment took maybe 2 hours to appear

Hilarious

for the record here, I have no further complaints after Kevin's clarifications in the comment section (and he changed the text a bit as well, although I have made clear that was not needed on my account).

Kevin responded to Enrico as well now: https://xenaproject.wordpress.com/2021/01/21/formalising-mathematics-an-introduction/#comment-1926

This thread seems to be largely resolved, but let me give some comments anyway. First of all, you are right that almost no-one is an expert in all of ML, theorem proving, type theory, classical mathematics, etc. I consider myself an expert in very few of those things, so I'm not sure how valuable my comments are. I'm mainly just a monkey flailing my arms around, trying to make something happen and sometimes succeeding :-)

As for the suitability of math-comp: I see no fundamental problem, as long as you keep the learning and evaluation within math-comp. Outside of that, learning from math-comp might not generalize very well, because it is written in ssr and because math-comp has a very particular way of expressing some things (like vectors) that other developments often do not use. But as long as you expect the AI to prove things within the math-comp universe, I think it should be fine.

As for the comments in the blog-post, even if all the objects in math-comp would be 'non-existent' (which you guys already debunked) I don't see why this is a problem. Proving the non-existence of things is often just as interesting as proving its existence. And in general, it is unreasonable to expect AI to generalize to entirely different domains (with current technology). For example, I would not expect an AI trained on your dataset to perform well on Compcert. But this is true of any dataset that does not contain all possible domains of mathematics. I guess I don't see the point of complaining about 'nonexisting' objects in that context (by that, I mean AI for theorem proving is still in its infancy and this is a rather minor thing to worry about; there are much bigger problems to worry about). But perhaps I'm misunderstanding the point he is trying to make.

Finally, let me note something about the dataset that I do think is a major flaw: As far as I can tell, there is no mechanized summary of the AST nodes that SerAPI generates. This is a problem because consumers will not know which node corresponds to which datastructure. I know that the CoqGym people had trouble with that (they wrote their own ad-hoc solution for this), and I can imagine many more projects would have similar problems. You could solve this by adding such a summary to your dataset:

```
(VernacExpr (VernacStartTheorem VernacSetOption ...))
(TacExpr (TacId TacML TacThen ...))
(Constr (Prod Var App ...))
...
```

Last updated: May 31 2023 at 10:01 UTC