Stream: Coq users

Topic: Tail-call optimization in reduction machines


view this post on Zulip Kazuhiko Sakaguchi (Apr 11 2021 at 23:01):

I expected that reimplementing a Gallina function in a tail-recursive form reduces memory consumption and execution time of vm_compute, but this seems not true according to my observation. Also, I found the following sentence in the ICFP '02 paper on vm_compute:

Consequently, we present a minimal subset of the ZAM, omitting several aspects of the real ZAM that are irrelevant for our purposes, such as the optimization of tail-calls, the use of a register to cache the top of the stack, ...

I don't get why it is "irrelevant". Can someone explain it? Also, is the situation different in native_compute? cc: @Maxime Dénès

view this post on Zulip Enrico Tassi (Apr 12 2021 at 05:17):

Cc @Guillaume Melquiond

view this post on Zulip Guillaume Melquiond (Apr 12 2021 at 05:44):

I think you are reading a bit too much in that sentence. It is "irrelevant" as far as this section of the paper is concerned, not in general. In fact, Coq contains all those features. For example, if you run the following, you will see that the recursive call is optimized (appterm instead of apply):

Set Dump Bytecode.
Fixpoint foo (l: list nat) acc :=
  match l with
  | cons h q => foo q (cons h acc)
  | nil => acc
  end.

That said, it is possible that the compiler in Coq is not as powerful as the one in OCaml and some optimization opportunities are missed. You should fill a bug report in that case.

view this post on Zulip Kazuhiko Sakaguchi (Apr 12 2021 at 06:07):

@Guillaume Melquiond Thanks! I will check my definitions with Set Dump Bytecode. Is there any other option or command that is useful to understand and debug vm_compute?

view this post on Zulip Guillaume Melquiond (Apr 12 2021 at 06:20):

There is also Set Dump Lambda, but the result is so close from the original Coq that it is kind of useless.

view this post on Zulip Kazuhiko Sakaguchi (Apr 15 2021 at 08:36):

After spending a few days for investigation and benchmarking, I could not identify the reason for my unexpected observation (benchmark results) mentioned above. So let me share my actual problem here.
I have four different implementations of mergesort. Two of them (CBN.sort and CBNOpt.sort) use the non-tail-recursive merge function, which should be suited to the call-by-need evaluation strategy. The remaining two (CBV.sort and CBVOpt.sort) use the tail-recursive merge function (as in List.stable_sort of the standard library of OCaml), which should be suited to the call-by-value evaluation strategy with tail-call optimization. Ones with Opt in their name try to reuse sorted chunks in the input (as in Data.List.sort of GHC) and thus should be fast if the input contains many chunks that are already sorted.
My benchmark results (NB: I'm not entirely sure if my code for measuring memory consumption is correct) show that:

  1. CBN variants are significantly faster than CBV variants when computing the first few elements of the output of sort using the lazy tactic (Figure 1),
  2. CBV variants are two times faster than CBN variants when extracted to OCaml (Figure 6), but unfortunately,
  3. there is no apparent difference when we compare them using vm_compute and native_compute (Figure 4 and 5).

If someone is willing to help me, I really appreciate it and suggest looking at the tail-recursive merge function.

view this post on Zulip Guillaume Melquiond (Apr 19 2021 at 14:08):

Unfortunately, in revmerge, you are creating a new closure whenever you process an element of x, which is costly. It would be much better if revmerge and revmerge' were mutually recursive rather than nested, but I do not know if that would still pass Coq's termination checker.

view this post on Zulip Kazuhiko Sakaguchi (Apr 21 2021 at 06:45):

Thanks! Let me try to fix it, but then, another issue is that Paramcoq often does not work for mutually recursive functions (at least in my experience) :sweat_smile:

view this post on Zulip Kazuhiko Sakaguchi (Apr 21 2021 at 07:18):

It seems to me that we cannot do this better unless we improve the termination checker (to recognize lexicographic orders) or vm_compute, but it was really good to know the source of my issue. Thanks again.

view this post on Zulip Matthieu Sozeau (Apr 21 2021 at 07:55):

Did you try a well-founded definition using a lexicographic order to compare?

view this post on Zulip Kazuhiko Sakaguchi (Apr 21 2021 at 08:00):

@Matthieu Sozeau No. I don't think it gives us Gallina terms that are efficient in vm_compute. Does it?

view this post on Zulip Matthieu Sozeau (Apr 21 2021 at 08:03):

I think it can give you reasonable if not top performance. You can fuel the accessibility proof so that the cost of a recursive call is not too high, but indeed it will still create closures I believe.

view this post on Zulip Kazuhiko Sakaguchi (Apr 21 2021 at 08:10):

OK. I will try that. Thanks.

view this post on Zulip Matthieu Sozeau (Apr 21 2021 at 08:19):

Do report here, I'd be curious to know of it fares!

view this post on Zulip Guillaume Melquiond (Apr 21 2021 at 12:26):

I agree that fueling the accessibility predicate would be just as costly as creating the nested fixpoints, if not more. (And not fueling it would be disastrous, performance-wise.)

view this post on Zulip Kazuhiko Sakaguchi (Apr 23 2021 at 06:39):

Progress report: I reimplemented the revmerge function so that it uses accessibility proofs instead of nested fixpoints. However, it seems to require a match expression returning a function to typecheck, which may cause a performance issue in vm_compute. Did I miss something? @Matthieu Sozeau

From mathcomp Require Import all_ssreflect.

Set Implicit Arguments.
Unset Strict Implicit.
Unset Printing Implicit Defensive.

Definition mergeord {T : Type} (p1 p2 : seq T * seq T) : Prop :=
  match p2 with
    | (x :: xs, y :: ys) => p1 = (xs, y :: ys) \/ p1 = (x :: xs, ys)
    | _ => False
  end.

Fixpoint wf_merge (T : Type) (xs ys : seq T) : Acc mergeord (xs, ys) :=
  if xs is x :: xs' then
    (fix wf_merge' (ys : seq T) : Acc mergeord (x :: xs', ys) :=
       if ys is y :: ys' then
         Acc_intro
           (x :: xs', y :: ys')
           (fun _ H =>
              match H with
                | or_introl H0 => eq_rect_r _ (wf_merge _ _) H0
                | or_intror H0 => eq_rect_r _ (wf_merge' _) H0
              end)
       else
         Acc_intro (x :: xs', [::]) (fun _ => False_ind _)) ys
  else
    Acc_intro ([::], ys) (fun _ => False_ind _).

Fixpoint revmerge
  (T : Type) (leT : rel T) (xs ys acc : seq T) (wf : Acc mergeord (xs, ys)) :=
  let: Acc_intro H := wf in
  match xs, ys return
        (forall xs', mergeord (xs', ys) (xs, ys) -> Acc mergeord (xs', _)) ->
        (forall ys', mergeord (xs, ys') (xs, ys) -> Acc mergeord (_, ys')) ->
        _ with
    | [::], ys => fun _ _ => catrev ys acc
    | xs, [::] => fun _ _ => catrev xs acc
    | x :: xs', y :: ys' => fun H1 H2 =>
      if leT x y then
        @revmerge T leT xs' ys (x :: acc) (H1 _ (or_introl erefl))
      else
        @revmerge T leT xs ys' (y :: acc) (H2 _ (or_intror erefl))
  end (fun xs' => H (xs', ys)) (fun ys' => H (xs, ys')).

view this post on Zulip Matthieu Sozeau (Apr 23 2021 at 06:57):

You could avoid creating closures there by just passing an equality to the match and rewriting, that might help in vm_compute

view this post on Zulip Kazuhiko Sakaguchi (Apr 23 2021 at 07:16):

Thanks. Should I also unfold eq_rect in the definition?

view this post on Zulip Kenji Maillard (Apr 23 2021 at 07:37):

A naive question: would passing directly wf to the match help instead of destructing it before the match ? (the resulting code is shorter but the inner match still returns a function type Acc mergeord _ -> _...)

view this post on Zulip Kazuhiko Sakaguchi (Apr 23 2021 at 07:59):

@Kenji Maillard IIUC, the recursive calls of revmerge should take x :: xs' and y :: ys' instead of xs and ys then, which is another source of a potential performance issue.

view this post on Zulip Matthieu Sozeau (Apr 23 2021 at 08:10):

I think eq_rect doesn't need to be unfolded. Matching on eq_refl is cheap hopefully!

view this post on Zulip Matthieu Sozeau (Apr 23 2021 at 08:12):

Actually, you might see a different performance difference it you use the Fix_F combinator that does inversion on the Acc proof "in the branch" instead of outside the branch as you do here.

view this post on Zulip Kazuhiko Sakaguchi (Apr 23 2021 at 08:42):

Currently, one that uses accessibility proofs is slightly slower and consumes 6 times more memory compared to the nested fixpoint version. I will put my code and benchmark results later. Also, unfolding eq_rect significantly improved memory usage.

view this post on Zulip Matthieu Sozeau (Apr 23 2021 at 09:46):

Aha, that's an interesting fact! Is it weak-cbv evaluation?

view this post on Zulip Pierre-Marie Pédrot (Apr 23 2021 at 10:20):

In general combinators as generated by Coq behave badly in call-by-value because they evaluate their arguments eagerly for non-recursive arguments.

view this post on Zulip Pierre-Marie Pédrot (Apr 23 2021 at 10:20):

Add atop of that the fact that they create closures when a pattern-matching would not and that's one more reason to expand them away.

view this post on Zulip Matthieu Sozeau (Apr 23 2021 at 10:29):

But for equality it probably does not make much difference if you evaluate the p : P x eq_refl and the eq_refl : x = x before unfolding, no?

view this post on Zulip Matthieu Sozeau (Apr 23 2021 at 10:30):

The problem is maybe coming from reducing the return predicate (but this should stop early if it's weak evaluation)

view this post on Zulip Kazuhiko Sakaguchi (Apr 26 2021 at 03:40):

New implementations (ones we were talking about are Module CBVAcc := CBV_(RevmergeAcc)): https://github.com/pi8027/stablesort/blob/489f630b2b20f1ad7e7307e6f2142635e7ccb32c/theories/stablesort.v#L132-L801
New benchmark results (see Fig. 7 for vm_compute): https://github.com/pi8027/stablesort/blob/8c8fb23e49c2a4c0b4fce6dc4a44dc441206c0f5/output/main.pdf

view this post on Zulip Kazuhiko Sakaguchi (Apr 26 2021 at 10:54):

Matthieu Sozeau said:

Aha, that's an interesting fact! Is it weak-cbv evaluation?

vm_compute performs strong reduction, but it's done by repeating weak reduction and readback. So I guess yes. I think that the point of fueling an accessibility proof is delaying the reduction of terms under Acc_intro. But currently, it rather seems to add some extra cost (it's not surprising though). Redefining wf_mergeord may improve the situation, but I have no concrete idea on this point.

view this post on Zulip Kazuhiko Sakaguchi (Apr 26 2021 at 10:57):

BTW, CBVAcc.sort3 extracted to OCaml seems to have a better performance than List.stable_sort (but excluding the cost of GC). This is really impressive.

view this post on Zulip Matthieu Sozeau (Apr 27 2021 at 07:18):

You might want to try the Acc fueler (Acc_intro_generator) at the end of https://coq.inria.fr/library/Coq.Init.Wf.html on top of wf_merge_ord to see if it helps.

view this post on Zulip Matthieu Sozeau (Apr 27 2021 at 07:25):

Can you post the extracted code too? I'm just curious :)

view this post on Zulip Matthieu Sozeau (Apr 27 2021 at 07:26):

One thing I don't understand in the bench is that extracted ocaml code seems to always be 10x faster than native compute, isn't it strange?

view this post on Zulip Kazuhiko Sakaguchi (Apr 27 2021 at 08:35):

Matthieu Sozeau said:

Can you post the extracted code too? I'm just curious :)

Yes, here it is. https://github.com/pi8027/stablesort/blob/bd0ac4f06a79318cf46d7c12a934971915310467/benchmark/ocaml/mergesort_coq.ml

view this post on Zulip Kazuhiko Sakaguchi (Apr 27 2021 at 08:44):

Matthieu Sozeau said:

One thing I don't understand in the bench is that extracted ocaml code seems to always be 10x faster than native compute, isn't it strange?

I don't think so. Firstly, the measured execution time of native_compute includes the cost of invoking ocamlopt. Since the input is now compiled before the measurement, it should be reasonably small, but we probably still have to compute the same thing many times in native_compute to reduce its impact on the benchmark result. Secondly, IIUC, native_compute has to pay some extra cost compared to extracted code so that it can readback the result, but the cost of readback itself should also be reasonably small since it computes sorted N.leb (sort N.leb xs). https://github.com/coq/coq/issues/8019

view this post on Zulip Matthieu Sozeau (Apr 27 2021 at 09:52):

It'd be fun to try certicoq on this example

view this post on Zulip Pierre Roux (Apr 27 2021 at 21:15):

native_compute also needs to handle the case of open terms: https://coq.zulipchat.com/#narrow/stream/237655-Miscellaneous/topic/native.20compute/near/231767343

view this post on Zulip Matthieu Sozeau (Apr 28 2021 at 10:16):

I would have expected that in this case the overhead of handling accumulators for the open term case was low but I guess that's wrong and we're still doing some reduction under contexts in this case, maybe for the Acc_intro proofs.

view this post on Zulip Kazuhiko Sakaguchi (Apr 28 2021 at 16:23):

Acc_intro_generator does not seem to improve anything. I'm not going to try CertiCoq by myself for now but would like to see its result too.

view this post on Zulip Kazuhiko Sakaguchi (May 04 2021 at 07:56):

I found the fact that the termination checker accepts:

Fixpoint merge (T : Type) (leT : rel T) (xs ys : seq T) {struct xs} :=
  (fix merge' (ys : seq T) {struct ys} :=
      match xs, ys with
        | [::], _ => ys
        | _, [::] => xs
        | x :: xs', y :: ys' =>
          if leT x y then x :: merge leT xs' ys else y :: merge' ys'
      end) ys.

but does not accept:

Fixpoint merge (T : Type) (leT : rel T) (xs ys : seq T) {struct xs} :=
  (fix merge' (T : Type) (leT : rel T) (xs ys : seq T) {struct ys} :=
      match xs, ys with
        | [::], _ => ys
        | _, [::] => xs
        | x :: xs', y :: ys' =>
          if leT x y then x :: merge leT xs' ys else y :: merge' T leT xs ys'
      end) T leT xs ys.

In the latter case, I imagine that the termination checker could be improved to recognize that:

  1. the argument xs of the inner recursion merge' does not change and is weakly decreasing, and therefore
  2. the argument xs of the outer recursion merge is strictly decreasing.

Is this another possibility to avoid creating closures? Meanwhile, Unset Guard Checking also seems to be an option to verify my ideas, although it is unsafe in general.

view this post on Zulip Michael Soegtrop (May 04 2021 at 09:23):

How about a single fixpoint using the sum of the lengths of both lists as measure?

view this post on Zulip Kazuhiko Sakaguchi (May 04 2021 at 09:47):

@Michael Soegtrop That requires traversing input lists twice and also another pattern matching on the measure. So I think it rather adds some extra costs for computation.

view this post on Zulip Michael Soegtrop (May 04 2021 at 12:52):

Ah yes, I keep forgetting that the measure proof is actually used during the computation - Guillaume explained it to me once, but the space for such details of logic is limited in a physicists brain. I demonstrated it in the below example, where I used an axiom for all proof obligations of the well founded induction. I hope I can remember it now.

With "traversing the input list twice", do you mean the additional traversal in the length terms of the measure proof?

I wonder what kind of extensions to the logic one would need to allow well founded recursion which does not need the measure proofs during recursion. It is not terribly elegant that it is not sufficient to have a proof that a fixpoint terminates but that one needs to evaluate this proof during recursion. I guess one cannot prove this with additional axioms, since axioms are usually not helpful for computation. For exporting proven correct algorithms this would be important.

From mathcomp Require Import all_ssreflect.
Require Import Program.

Axiom admit : forall {T}, T.
Obligation Tactic := exact admit.

Program Fixpoint merge' {T : Type} (leT : rel T) (xs ys : seq T) {measure (length xs + length ys)} :=
  match xs, ys with
  | [::], _ => ys
  | _, [::] => xs
  | x :: xs', y :: ys' =>
    if leT x y then x :: merge' leT xs' ys else y :: merge' leT xs ys'
  end.

Eval vm_compute in merge' Nat.leb [:: 1; 4] [:: 2; 3]. (* -> a long term containing "admit" *)

view this post on Zulip Enrico Tassi (May 04 2021 at 13:17):

Does extraction erase that argument?

view this post on Zulip Enrico Tassi (May 04 2021 at 13:17):

(I expect so)

view this post on Zulip Kazuhiko Sakaguchi (May 04 2021 at 13:25):

With "traversing the input list twice", do you mean the additional traversal in the length terms of the measure proof?

I thought it has to compute length xs + length ys but this is probably not true if we are talking about the Program Fixpoint command. But instead, it has to traverse xs and ys to produce the accessibility proof. (Right?)

view this post on Zulip Kazuhiko Sakaguchi (May 04 2021 at 13:28):

Then I have a similar variant of the merge function here, where everything is written by hand instead of using Program Fixpoint: https://github.com/pi8027/stablesort/blob/8c84308dbc2ecb1981a43972c7fa31f1fd29f030/theories/stablesort.v#L173-L233.
As @Enrico Tassi says, extraction erases the accessibility proofs: https://github.com/pi8027/stablesort/blob/b65c800c39911695c7b6ee3f48013e397c4133c5/benchmark/ocaml/mergesort_coq.ml#L277-L295.

view this post on Zulip Michael Soegtrop (May 04 2021 at 14:09):

Ah ok, extraction removes this. Still it would be nice if vm_compute would also remove it - I am a fan of computing stuff in Coq. If we have constructive logic, why not use it?

view this post on Zulip Enrico Tassi (May 04 2021 at 14:12):

I you prove the measure, and Defined it, then it should compute... Axioms have no computational content, even in a constructive logic ;-)

view this post on Zulip Matthieu Sozeau (May 04 2021 at 14:27):

Using the "fueling" of accessibility proofs (see the end of Wf.v), you can avoid any more computation of the measures/accessibility proofs

view this post on Zulip Matthieu Sozeau (May 04 2021 at 14:28):

What it should amount to is getting every Acc_intro in constant time, but it involves closure creation, so that would be the culprit in this case too

view this post on Zulip Matthieu Sozeau (May 04 2021 at 14:29):

vm_compute cannot remove it as you might be lying that you have an accessibility proof for the total relation for example, and in that case it could just not terminate

view this post on Zulip Michael Soegtrop (May 04 2021 at 14:39):

@Enrico Tassi : what I meant is that it shouldn't need to look at the proof at all for just evaluating the fixed point. So it shouldn't matter if the proof is an axiom or whatever.

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 14:48):

This breaks a lot of stuff, e.g. if your accessibility proof uses an axiom you lose decidability

view this post on Zulip Enrico Tassi (May 04 2021 at 14:48):

Hum, there is something wrong in your intuition. Even if you just write {measure ...} the actual term contains a match on the proof term. That match is erased by extraction, since it is on a "proof" (a term like all the others ones, really, but labeled as Prop and not Type). This erasure has a justification (eg I think they prove it makes sense in MetaCoq) but it is not internalized in the logic. Sure, vm_compute or native_compute could do some erasure if the user asks so. But their results can also be used in proofs, as evidence. And the fact that your term admit is not erased is important. I don't know if it would make the system inconsistent, but would surely kill some models.

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 14:48):

This is the reason why SProp doesn't allow elimination of Acc into Type, btw

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 14:49):

@Enrico Tassi allowing erasure is going to break SR

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 14:49):

also you're going to get weird behaviours if you start adding choice axioms

view this post on Zulip Kazuhiko Sakaguchi (May 04 2021 at 14:51):

What about improving the termination checker?

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 14:53):

you're going to get a very powerful type theory

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 14:53):

probably not supported by any model

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 14:54):

after @Kenji Maillard 's discovery of guarded functions not valid in the exceptional model we already conjectuer that Coq's guard condition cannot be eliminated to combinators

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 14:54):

(for a good notion of "being able to eliminate")

view this post on Zulip Enrico Tassi (May 04 2021 at 14:56):

pointer plz

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 14:58):

a bar discussion in Nantes followed by whiteboarding

view this post on Zulip Enrico Tassi (May 04 2021 at 14:58):

ok, then tell me which round of drinks was it

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 14:58):

the problem lies in the fact that Coq considers empty matches to be subterms of anything

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 14:59):

but in the exceptional model, empty matches are not empty

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 14:59):

so if you were able to eliminate this guarded function to eliminators you'd get a non-terminating term

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 14:59):

obviously this only happens in an inconsistent setting but Coq semantic goes under binders so you're in trouble

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:00):

I can formalize the example easily, let me write that

view this post on Zulip Enrico Tassi (May 04 2021 at 15:01):

is the "exceptional" model important?

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:01):

well, this shows that the guard condition is already stronger than eliminators

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:01):

so it depends if you see this as a bug or a feature

view this post on Zulip Enrico Tassi (May 04 2021 at 15:02):

looks more like a bug to me ;-)

view this post on Zulip Enrico Tassi (May 04 2021 at 15:02):

thanks for explaining

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:03):

the problematic term is Fixpoint dummy (n : nat) (e : False) {struct n} : unit := dummy (match e with end) e.

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:05):

in the exceptional model it becomes something like

Inductive Falseᴱ := FalseE.
Inductive natᴱ := Oᴱ | Sᴱ : natᴱ -> natᴱ | natE : natᴱ.
Inductive unitᴱ := ttᴱ | unitE : unitᴱ.

Fixpoint dummyᴱ (n : natᴱ) (e : Falseᴱ) {struct n} : unitᴱ :=
  dummyᴱ (match e with FalseE => natE end) e.

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:06):

obviously this is not well-founded

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:07):

since the exceptional model interprets these eliminators, it means that this cannot be obtained through eliminators, otherwise the exceptional model would be non-terminating, but it inherits this property from the target

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:08):

(this is not a formal argument, and also the exceptional model does not interpret Prop singleton elimination, so YMMV)

view this post on Zulip Kazuhiko Sakaguchi (May 04 2021 at 15:10):

I see your example is problematic but still do not get how it is related to my example.

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:11):

commutative cuts are problematic in general, because they bring in some context

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:12):

it's the typical stuff not validated by combinators

view this post on Zulip Gaëtan Gilbert (May 04 2021 at 15:12):

the required equations on that fixpoint are dummy 0 e == dummy (match e with end) e and dummy (S _) e == same thing
this is satisfied by Definition dummy n e := False_rect unit e. no?

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:15):

this example is a bit degenerated because it returns unit, but you can probably meddle with it so that you get a fixpoint that doesn't exist in CIC

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:16):

(well, the fixpoint exists extensionally because you're inconsistent but it computes in a way that CIC doesn't allow)

view this post on Zulip Gaëtan Gilbert (May 04 2021 at 15:17):

"probably"

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:22):

what saves you here is that you don't observe the n argument, but if you start using it you'll get some nasty commutative cuts

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:23):

the guard condition internalizes these commutative cuts, but they don't hold in all models, that's just my point

view this post on Zulip Matthieu Sozeau (May 04 2021 at 15:23):

One thing that could improve the current situation is this: if you start from a closed term (no lying about accessibility proofs), then it is fine to use CertiCoq and compute a weak-head-normal form of your term, after erasure. It will be guaranteed to terminate and give the same results as doing a conversion check with vm_compute (assuming your term is in an inductive type and, hereditarily, the arguments of the constructors are inductive (no functions)).

view this post on Zulip Matthieu Sozeau (May 04 2021 at 15:24):

I expect that for reflection proofs this could be quite competitive with the existing vm/native conversions.

view this post on Zulip Matthieu Sozeau (May 04 2021 at 15:24):

In the subset where you don't need "strong" reduction

view this post on Zulip Michael Soegtrop (May 04 2021 at 15:24):

@Enrico Tassi : I understand that this is so in Coq, but I am not enough of a logician to understand why it has to be like this. I understand that recursive functions need termination proofs, but I don't understand why one can't have not too complicated setup e.g. using lazy evaluation of proof terms where the proof terms are never evaluated when they don't occur in the final result for a specific case. Of cause I don't want to use axioms - this was just an example to see what happens and how the evaluated term looks like without embedded complicated proof terms. As I said, I am a bit naive in such things, as I never studied logic.

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:24):

a.k.a. geometric logic

view this post on Zulip Matthieu Sozeau (May 04 2021 at 15:26):

@Michael Soegtrop even with lazy evaluation you do need to look at the accessibility proofs to gradually peel Acc_intro constructors at each recursive call. The source of this is that (fix f a := b) c should reduce only if c 's head is a constructor (or evaluates to one). This check is no longer needed in the erased evaluation.

view this post on Zulip Enrico Tassi (May 04 2021 at 15:29):

I don't think it has to be that way, but it is how type theory is wired up :-/

view this post on Zulip Michael Soegtrop (May 04 2021 at 15:30):

yes, I see that well founded recursion is defined like this - I just wonder if there are other ways to define it, so that the termination proof can be more easily abstracted away.

view this post on Zulip Enrico Tassi (May 04 2021 at 15:32):

No that I know of. Naive question: what would break if we "internalize" Markov principle in this very specific case (I mean, you define something by Acc and you also get the erased term)?

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:32):

this is something I wondered about

view this post on Zulip Kazuhiko Sakaguchi (May 04 2021 at 15:32):

Pierre-Marie Pédrot said:

commutative cuts are problematic in general, because they bring in some context

Still do not fully understand it, but it looks related to the fact that the latter example here is obtained by lambda lifting, and also computing with the former example creates closures. Thanks anyway.

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:33):

(ftr it shouldn't be called Markov's principle, it's rather indefinite description or something)

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:34):

you probably lose decidability of type-checking if you don't keep the term around, but if you keep it you have to be sure that you get meaningful reduction rules

view this post on Zulip Enrico Tassi (May 04 2021 at 15:34):

Pierre-Marie Pédrot said:

this is something I wondered about

I think it looks very natural (and easy to implement) :grinning_face_with_smiling_eyes:

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:34):

keeping the accessbility term around or not?

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:35):

(you have to check it's axiom-free still)

view this post on Zulip Enrico Tassi (May 04 2021 at 15:35):

you only need to know which variables are in scope, no?

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:35):

the erased term will not behave the same as the term with the annotation

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:35):

so when you expand the fixpoint it's not clear that you'll get something that still type-checks

view this post on Zulip Michael Soegtrop (May 04 2021 at 15:35):

The reason I am after this is that I think something like a Coq based Computer Algebra System is not that far away. Coq is efficient enough for this and not verified math tools are sufficiently painful. In such a use case it would be a problem if defining efficient computing recursive functions is difficult.

I don't think extraction would help me - I need to much of Coq's stuff to do my work. Extracting partial functionality won't help me.

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:36):

@Michael Soegtrop there are two very different problems though

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:36):

either you just want to evaluate a function in general, or you want to evaluate it and plug the result in CIC

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:37):

in the first case you can go through erasure or extraction, but in the second you need to be much more clever

view this post on Zulip Pierre-Marie Pédrot (May 04 2021 at 15:37):

(hence the VM overhead)

view this post on Zulip Michael Soegtrop (May 04 2021 at 15:38):

Usually I have the second case: proofs about stuff which is partly computational and partly abstract.

view this post on Zulip Michael Soegtrop (May 04 2021 at 15:41):

I could extract certain parts of my proofs and compute them outside of Coq, but why would I do this if computing inside Coq takes 30s?

view this post on Zulip Michael Soegtrop (May 04 2021 at 15:46):

Think e.g. of constructive reals: it is hard enough to make this efficient without having termination proofs in the path of computation.

view this post on Zulip Matthieu Sozeau (May 04 2021 at 16:02):

I think my proposal, which does not require to modify the theory would help in that use case @Michael Soegtrop

view this post on Zulip Matthieu Sozeau (May 04 2021 at 16:03):

Think of CertiCoq as a variant of vm_compute which works in just the "closed term" subset but gives you computation on entirely erased terms (not only proofs but also types get erased, which might drastically reduce the size of the terms)

view this post on Zulip Matthieu Sozeau (May 04 2021 at 16:03):

I would still be integrated in conversion like native and vm_compute are.

view this post on Zulip Matthieu Sozeau (May 04 2021 at 16:04):

So no need to "get out of the system"

view this post on Zulip Michael Soegtrop (May 04 2021 at 16:47):

@Matthieu Sozeau : I see - I thought CertiCoq was something for verified C export - is this part of it or did I get it wrong? Do you have some intro to CertiCoq I could read?

view this post on Zulip Matthieu Sozeau (May 04 2021 at 17:17):

I would suggest looking at chapter 2 of Zoe's PhD thesis http://zoep.github.io/thesis_final.pdf for example

view this post on Zulip Matthieu Sozeau (May 04 2021 at 17:18):

It is currently geared towards compilation to C and from there using CompCert but it can be applied in other settings. I wish I had the time to prototype a wrapper for the conversion test , but it's not looking like it will be ready tomorrow :)

view this post on Zulip Michael Soegtrop (May 05 2021 at 07:39):

@Matthieu Sozeau : thanks for the pointer! So as far as I understand this, this would work e.g. for a simplex algorithm in Q (see the discussion I had with @Xavier Allamigeon here : Zulip, especially gist on why one might want to do this). For a simplex algorithm in constructive reals I would (or rather might) end up with something which computes to arbitrary precision, but the Cauchy proofs would be stripped away, so that I can't use the results as real numbers any more. This might or might not be an issue.

Doing such things in Q is usually sufficient - doing it in constructive reals would just be an additional convenience and it depends on the achievable performance if one would do that. If one does this in Q, one needs a certain expertise in the decision how to round the involved reals to Q. Doing this in constructive reals would be fool proof - provided one has an implementation of a simplex algorithm which can handle the case that the ordering of two numbers is not decidable with reasonable effort. For that efficient well founded recursion would be helpful.


Last updated: Jan 29 2023 at 06:02 UTC