Stream: Machine learning and automation

Topic: Chat GPT


view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:29):

Hi folks, I asked ChatGPT to produce a group and monoid in Coq, that's what it got back to me XD:

Require Import mathcomp.ssreflect.ssreflect.

Set Implicit Arguments.
Unset Strict Implicit.
Unset Printing Implicit Defensive.

Structure Monoid : Type :=
  {
    carrier : Type;
    op : carrier -> carrier -> carrier;
    id : carrier;
    associativity : associative op;
    identity : left_id id op;
  }.

Structure Group : Type :=
  {
    base : Monoid;
    inv : carrier base -> carrier base;
    inverse : forall x : carrier base, op (op base) (inv x) x = id base /\ op (op base) x (inv x) = id base;
  }.

Coercion MonoidMixin (g : Group) : Monoid := g.(base).

Notation "1" := (id _).
Notation "x * y" := (op _ x y).

Not bad at all!

view this post on Zulip Karl Palmskog (Feb 05 2023 at 13:30):

did you use "SSReflect" "MathComp" prompt engineering? Maybe even Hierarchy Builder...

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:30):

Yes I asked it to do using math-comp style

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:30):

First time I got actually what is a good suggestion for syntax XD

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:30):

Structure Monoid :=
  {
    carrier : Set;
    op : carrier -> carrier -> carrier;
    id : carrier;
    associativity : forall x y z : carrier, op (op x y) z = op x (op y z);
    identity : forall x : carrier, op id x = x /\ op x id = x
  }.

Structure Group := Monoid & {
    inv : carrier -> carrier;
    inverse : forall x : carrier, op (inv x) x = id /\ op x (inv x) = id
  }.

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:30):

Typescript-style XD

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:31):

but looks like I would love to write this in Coq

view this post on Zulip Karl Palmskog (Feb 05 2023 at 13:31):

seems weird that carrier is Set?

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:31):

Now writing some instances:

Definition IntAddMonoid : Monoid :=
  {|
    carrier := Z;
    op := fun x y => x + y;
    id := 0;
    associativity := Zplus_assoc;
    identity := Zplus_0_l;
  |}.

Definition IntAddGroup : Group :=
  {|
    base := IntAddMonoid;
    inv := fun x => -x;
    inverse := fun x => (Zplus_opp_diag_l x, Zplus_opp_diag_r x);
  |}.

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:32):

Karl Palmskog said:

seems weird that carrier is Set?

Yes, after all ChatGPT has no clue about what it is doing XD

But I'm adding ChatGPT support to next coq-lsp release, we will see

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:32):

Happy to add support for other people's systems by the way

view this post on Zulip Karl Palmskog (Feb 05 2023 at 13:32):

isn't it better to hook into Copilot? nearly same model?

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:33):

No idea @Karl Palmskog , happy to hook into whatever people think is fine

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:33):

Most interesting for me is the button I just added "explain this proof"

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:33):

It is not something meant to be serious, but to let people have fun and get ideas

view this post on Zulip Karl Palmskog (Feb 05 2023 at 13:33):

I liked the "suggestion by tabbing" they did in the Copilot extension for VS Code. I really don't like services with signup, but this one was via GitHub, so one barrier less.

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:34):

Yeah, def i don't like services with signup either

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:34):

It seems to me ChatGPT is very good at following orders, but totally clueless about proofs

view this post on Zulip Karl Palmskog (Feb 05 2023 at 13:35):

right, its training data has zero proof states

view this post on Zulip Karl Palmskog (Feb 05 2023 at 13:36):

as for "following orders", I'm not sure the orders are exactly your orders as seen for example here

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:37):

Interesting, thanks. So far whenever I have suggested it to take a correction on a proof, it has done more or less Ok, but indeed, it is just my very limited testing for a few weeks, that's why I want to add it to a wider interface

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:39):

Asking it to explain the infinitude of primes proof in math-comp:

The statement Lemma prime_above m states that for every integer m, there exists a prime number p such that m < p.

The proof starts with the line have /pdivP[p pr_p p_dv_m1]: 1 < m! + 1. This uses the havetactical, which is used to introduce a new subgoal. The subgoal is to prove1 < m! + 1. The /pdivP[p pr_p p_dv_m1] part says that if the subgoal can be proven, then there exists a prime number p that divides m! + 1, and the pis stored in the variablep, and the proposition that pis prime is stored in the variablepr_p, and the proposition that pdividesm! + 1 is stored in the variable p_dv_m1.

The next line by rewrite addn1 ltnS fact_gt0 uses the by tactical to prove the subgoal 1 < m! + 1. The proof works by first adding 1tom! and then applying the ltnSlemma, which says thatn.+1 > nfor any integern. The lemma fact_gt0` states that the factorial of any positive integer is positive.

The next line exists p => //; says that there exists a prime number p such that m < p. The => // part says that the proof of this proposition is trivial, so it can be omitted.

The next line rewrite ltnNge; rewrites m < p to ~ (p <= m).

The next line apply: contraL p_dv_m1 => p_le_m applies the contraL lemma to the current goal, with the assumption p_dv_m1, to get a new goal that is the negation of the original goal. The => p_le_m part introduces a new hypothesis p_le_m, which states that p <= m.

The last line by rewrite dvdn_addr ?dvdn_fact ?prime_gt0 // gtnNdvd ?prime_gt1 finishes the proof. The dvdn_addr lemma states that if a divides b and c, then a divides b + c. The dvdn_fact lemma states that p divides m! if and only if p divides m. The prime_gt0 lemma states that any prime number is positive. The gtnNdvd lemma states that if p > n and p divides n, then p = n. The prime_gt1 lemma states that any prime number is greater than 1. The proof works by using these lemmas to conclude that if p <= m, then p cannot be a prime number.

Finally, the Qed command ends the proof.

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:40):

The original proof for reference:

From Coq Require Import ssreflect ssrfun ssrbool.
From mathcomp Require Import eqtype ssrnat div prime.

(* A nice proof of the infinitude of primes, by Georges Gonthier *)
Lemma prime_above m : {p | m < p & prime p}.
Proof.
have /pdivP[p pr_p p_dv_m1]: 1 < m`! + 1
  by rewrite addn1 ltnS fact_gt0.
exists p => //; rewrite ltnNge; apply: contraL p_dv_m1 => p_le_m.
by rewrite dvdn_addr ?dvdn_fact ?prime_gt0 // gtnNdvd ?prime_gt1.
Qed.

view this post on Zulip Karl Palmskog (Feb 05 2023 at 13:40):

urgh, that's like a reading of the SSReflect lines with Check lookups

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:46):

I guess for people that don't know ssr syntax could be helpful?

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:47):

Oh but indeed I've just realized in coq-lsp I have tons of dynamic info at hand, so I can generate a fairly interesting prompt

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:47):

Could be a paper there XD

view this post on Zulip Karl Palmskog (Feb 05 2023 at 13:51):

ah, but this is the problem, how do you do transparent/reproducible evals of nontrivial GPTChat applications unless you have an OpenAI co-author?

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:54):

I guess you get an OpenAI co-author?

view this post on Zulip Karl Palmskog (Feb 05 2023 at 13:54):

you can of course do experiments like this, but this is different from low-level tool evals.

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:55):

For me ChatGPT helps because I can use it to test an interaction model of writing Coq proofs that I'd like to support well, independently of the engine. And of course, it can be useful to users.

view this post on Zulip Karl Palmskog (Feb 05 2023 at 13:56):

based on the state of this repo, not sure they have many theorem proving people left...

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:57):

Karl Palmskog said:

based on the state of this repo, not sure they have many theorem proving people left...

Indeed I'm not sure either; some key people in that area left OpenAI

view this post on Zulip Karl Palmskog (Feb 05 2023 at 13:58):

this one also seems abandoned: https://github.com/openai/lean-gym (but probably there are living forks)

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 13:59):

Tho the possibility to do studies exists, as coq-lsp can almsot chat to ChatGPT and can almost run in github.dev / vscode.dev so certainly you only need to recruit users, and have them clone a repos.

view this post on Zulip Karl Palmskog (Feb 05 2023 at 13:59):

if all you aim for are user studies success, sure. But this limits scope quite a bit.

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 14:00):

I'm not sure what OpenAI is doing, however I'm not sure either that the states of these repos means a lot. If they are working on some new stuff you will see it when it is released.

view this post on Zulip Paolo Giarrusso (Feb 05 2023 at 15:16):

The explanation is pretty good but I guess ChatGPT got wrong contraL, ltnS, and I think the end's proving that p divides m! so it can't divide m! + 1?

view this post on Zulip Paolo Giarrusso (Feb 05 2023 at 15:17):

And I wonder if the explanation overlaps with that on https://jscoq.github.io/wa/, which Google seems to rank pretty high whenever I google these lemmas... OTOH, I don't actually see much in common?

view this post on Zulip Karl Palmskog (Feb 05 2023 at 15:37):

these state-of-the-art SSR proofs are one thing, but I'd like to see GPTChat make any sense whatsoever out of Coq 8.4 era semi-automated proofs... those are pretty accurate to describe as write-only

view this post on Zulip Karl Palmskog (Feb 05 2023 at 15:51):

but I guess this is the main rift between Isabelle users and Coq users: "proofs must be understandable" (math proofs) vs. "proof scripts are doing tedious stuff and there's often not much value in looking at them" (PL proofs)

view this post on Zulip Emilio Jesús Gallego Arias (Feb 05 2023 at 21:09):

Karl Palmskog said:

these state-of-the-art SSR proofs are one thing, but I'd like to see GPTChat make any sense whatsoever out of Coq 8.4 era semi-automated proofs... those are pretty accurate to describe as write-only

Indeed here something like coq-lsp can query intermediate goal state and put that in the prompt ;)


Last updated: Jun 11 2023 at 00:30 UTC