Hi folks, I asked ChatGPT to produce a group and monoid in Coq, that's what it got back to me XD:

```
Require Import mathcomp.ssreflect.ssreflect.
Set Implicit Arguments.
Unset Strict Implicit.
Unset Printing Implicit Defensive.
Structure Monoid : Type :=
{
carrier : Type;
op : carrier -> carrier -> carrier;
id : carrier;
associativity : associative op;
identity : left_id id op;
}.
Structure Group : Type :=
{
base : Monoid;
inv : carrier base -> carrier base;
inverse : forall x : carrier base, op (op base) (inv x) x = id base /\ op (op base) x (inv x) = id base;
}.
Coercion MonoidMixin (g : Group) : Monoid := g.(base).
Notation "1" := (id _).
Notation "x * y" := (op _ x y).
```

Not bad at all!

did you use "SSReflect" "MathComp" prompt engineering? Maybe even Hierarchy Builder...

Yes I asked it to do using math-comp style

First time I got actually what is a good suggestion for syntax XD

```
Structure Monoid :=
{
carrier : Set;
op : carrier -> carrier -> carrier;
id : carrier;
associativity : forall x y z : carrier, op (op x y) z = op x (op y z);
identity : forall x : carrier, op id x = x /\ op x id = x
}.
Structure Group := Monoid & {
inv : carrier -> carrier;
inverse : forall x : carrier, op (inv x) x = id /\ op x (inv x) = id
}.
```

Typescript-style XD

but looks like I would love to write this in Coq

seems weird that carrier is `Set`

?

Now writing some instances:

```
Definition IntAddMonoid : Monoid :=
{|
carrier := Z;
op := fun x y => x + y;
id := 0;
associativity := Zplus_assoc;
identity := Zplus_0_l;
|}.
Definition IntAddGroup : Group :=
{|
base := IntAddMonoid;
inv := fun x => -x;
inverse := fun x => (Zplus_opp_diag_l x, Zplus_opp_diag_r x);
|}.
```

Karl Palmskog said:

seems weird that carrier is

`Set`

?

Yes, after all ChatGPT has no clue about what it is doing XD

But I'm adding ChatGPT support to next coq-lsp release, we will see

Happy to add support for other people's systems by the way

isn't it better to hook into Copilot? nearly same model?

No idea @Karl Palmskog , happy to hook into whatever people think is fine

Most interesting for me is the button I just added "explain this proof"

It is not something meant to be serious, but to let people have fun and get ideas

I liked the "suggestion by tabbing" they did in the Copilot extension for VS Code. I really don't like services with signup, but this one was via GitHub, so one barrier less.

Yeah, def i don't like services with signup either

It seems to me ChatGPT is very good at following orders, but totally clueless about proofs

right, its training data has zero proof states

as for "following orders", I'm not sure the orders are exactly *your orders* as seen for example here

Interesting, thanks. So far whenever I have suggested it to take a correction on a proof, it has done more or less Ok, but indeed, it is just my very limited testing for a few weeks, that's why I want to add it to a wider interface

Asking it to explain the infinitude of primes proof in math-comp:

The statement Lemma prime_above m states that for every integer m, there exists a prime number p such that m < p.

The proof starts with the line have /pdivP[p pr_p p_dv_m1]: 1 < m! + 1. This uses the havetactical, which is used to introduce a new subgoal. The subgoal is to prove1 < m! + 1. The /pdivP[p pr_p p_dv_m1] part says that if the subgoal can be proven, then there exists a prime number p that divides m! + 1, and the pis stored in the variablep, and the proposition that pis prime is stored in the variablepr_p, and the proposition that pdividesm! + 1 is stored in the variable p_dv_m1.

The next line by rewrite addn1 ltnS fact_gt0 uses the by tactical to prove the subgoal 1 < m! + 1. The proof works by first adding 1tom! and then applying the ltnSlemma, which says thatn.+1 > nfor any integern. The lemma fact_gt0` states that the factorial of any positive integer is positive.

The next line exists p => //; says that there exists a prime number p such that m < p. The => // part says that the proof of this proposition is trivial, so it can be omitted.

The next line rewrite ltnNge; rewrites m < p to ~ (p <= m).

The next line apply: contraL p_dv_m1 => p_le_m applies the contraL lemma to the current goal, with the assumption p_dv_m1, to get a new goal that is the negation of the original goal. The => p_le_m part introduces a new hypothesis p_le_m, which states that p <= m.

The last line by rewrite dvdn_addr ?dvdn_fact ?prime_gt0 // gtnNdvd ?prime_gt1 finishes the proof. The dvdn_addr lemma states that if a divides b and c, then a divides b + c. The dvdn_fact lemma states that p divides m! if and only if p divides m. The prime_gt0 lemma states that any prime number is positive. The gtnNdvd lemma states that if p > n and p divides n, then p = n. The prime_gt1 lemma states that any prime number is greater than 1. The proof works by using these lemmas to conclude that if p <= m, then p cannot be a prime number.

Finally, the Qed command ends the proof.

The original proof for reference:

```
From Coq Require Import ssreflect ssrfun ssrbool.
From mathcomp Require Import eqtype ssrnat div prime.
(* A nice proof of the infinitude of primes, by Georges Gonthier *)
Lemma prime_above m : {p | m < p & prime p}.
Proof.
have /pdivP[p pr_p p_dv_m1]: 1 < m`! + 1
by rewrite addn1 ltnS fact_gt0.
exists p => //; rewrite ltnNge; apply: contraL p_dv_m1 => p_le_m.
by rewrite dvdn_addr ?dvdn_fact ?prime_gt0 // gtnNdvd ?prime_gt1.
Qed.
```

urgh, that's like a reading of the SSReflect lines with `Check`

lookups

I guess for people that don't know ssr syntax could be helpful?

Oh but indeed I've just realized in coq-lsp I have tons of dynamic info at hand, so I can generate a fairly interesting prompt

Could be a paper there XD

ah, but this is the problem, how do you do transparent/reproducible evals of nontrivial GPTChat applications unless you have an OpenAI co-author?

I guess you get an OpenAI co-author?

you can of course do experiments like this, but this is different from low-level tool evals.

For me ChatGPT helps because I can use it to test an interaction model of writing Coq proofs that I'd like to support well, independently of the engine. And of course, it can be useful to users.

based on the state of this repo, not sure they have many theorem proving people left...

Karl Palmskog said:

based on the state of this repo, not sure they have many theorem proving people left...

Indeed I'm not sure either; some key people in that area left OpenAI

this one also seems abandoned: https://github.com/openai/lean-gym (but probably there are living forks)

Tho the possibility to do studies exists, as `coq-lsp`

can almsot chat to ChatGPT and can almost run in `github.dev`

/ `vscode.dev`

so certainly you only need to recruit users, and have them clone a repos.

if all you aim for are user studies success, sure. But this limits scope quite a bit.

I'm not sure what OpenAI is doing, however I'm not sure either that the states of these repos means a lot. If they are working on some new stuff you will see it when it is released.

The explanation is pretty good but I guess ChatGPT got wrong contraL, ltnS, and I think the end's proving that p divides m! so it can't divide m! + 1?

And I wonder if the explanation overlaps with that on https://jscoq.github.io/wa/, which Google seems to rank pretty high whenever I google these lemmas... OTOH, I don't actually see much in common?

these state-of-the-art SSR proofs are one thing, but I'd like to see GPTChat make any sense whatsoever out of Coq 8.4 era semi-automated proofs... those are pretty accurate to describe as write-only

but I guess this is the main rift between Isabelle users and Coq users: "proofs must be understandable" (math proofs) vs. "proof scripts are doing tedious stuff and there's often not much value in looking at them" (PL proofs)

Karl Palmskog said:

these state-of-the-art SSR proofs are one thing, but I'd like to see GPTChat make any sense whatsoever out of Coq 8.4 era semi-automated proofs... those are pretty accurate to describe as write-only

Indeed here something like `coq-lsp`

can query intermediate goal state and put that in the prompt ;)

Last updated: Jun 11 2023 at 00:30 UTC